When Strings get Word-y

Earlier this week we explored the lines and unlines functions, which allow us to split up strings based on where the newline characters are and then glue them back together if we want. But a lot of times the newline isn't the character you're worried about; it's actually just a normal space! How can we take a string that is a single line and then separate it into words?

Again, we have two very simple functions to turn to, words and unwords:

words :: String -> [String]

unwords :: [String] -> String

These do exactly what you think they do. We can take a string with several words in it and break it up into a list that excludes all whitespace.

>> words "Hello there my friend"
["Hello", "there", "my", "friend"]

And then we can recombine that list into a single string using unwords:

>> unwords ["Hello", "there", "my", "friend"]
"Hello there my friend"

This pattern is actually quite helpful when solving problems of the kind you'll see on HackerRank. Your input will often have a particular format like, "the first line has 'n' and 'k', which are the number of cases and the case size, and then there are 'n' lines with 'k' elements in them." So this might look like:

5 3
1 2 3
4 5 6
7 8 9
10 11 12
13 14 15

Then words is very useful for splitting each line. For example:

readFirstLine :: IO (Int, Int)
readFirstLine = do
  firstNumbers <- words <$> getLine
  case firstNumbers of
    -- words gives us Strings, so each token has to be converted with read
    (n : k : _) -> return (read n, read k)
    _ -> error "Invalid input!"
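The same idea extends to the whole input. Here is a minimal sketch, assuming the format described above (a header line with 'n' and 'k', then 'n' rows of 'k' integers). The name parseCases is made up for this example, and it uses readMaybe from Text.Read so that bad input gives Nothing instead of a crash.

```haskell
import Text.Read (readMaybe)

-- Hypothetical helper: parse a header line "n k" followed by n rows of k integers.
-- Returns Nothing if any token is not a number or the shape does not match the header.
parseCases :: String -> Maybe [[Int]]
parseCases input =
  case lines input of
    [] -> Nothing
    (header : rest) -> do
      -- A header without exactly two numbers fails the pattern match and yields Nothing.
      [n, k] <- mapM readMaybe (words header)
      rows <- mapM (mapM readMaybe . words) rest
      if length rows == n && all ((== k) . length) rows
        then Just rows
        else Nothing
```

>> parseCases "2 3\n1 2 3\n4 5 6\n"
Just [[1,2,3],[4,5,6]]
>> parseCases "2 3\n1 2 3\n4 five 6\n"
Nothing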

As with lines and unlines, these functions aren't inverses, properly speaking. With unwords, the separator will always be a single space character. However, words treats any run of consecutive spaces as a single separator, so the original spacing is lost once you split the string and join it back together.

>> words "Hello    there    my    friend"
["Hello", "there", "my", "friend"]
>> unwords (words "Hello    there    my     friend")
"Hello there my friend"

In fact, words will separate based on any whitespace characters, including tabs and even newlines!

>> words "Hello \t  there \n my \n\t friend"
["Hello", "there", "my", "friend"]

So when each line holds a single word, it could actually do the same job as lines, but then reversing the process would require unlines rather than unwords.
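Because newlines count as whitespace, words can also pull every token out of a multi-line input in one go. Here is a minimal sketch, assuming the input holds only integers. The name allInts is made up for this example.

```haskell
import Text.Read (readMaybe)

-- Hypothetical helper: read every whitespace-separated integer in the input,
-- ignoring line boundaries. Returns Nothing if any token is not a number.
allInts :: String -> Maybe [Int]
allInts = mapM readMaybe . words
```

>> allInts "5 3\n1 2 3\n4 5 6"
Just [5,3,1,2,3,4,5,6]

The trade-off is that the line structure is gone, so this fits best when you only need the numbers themselves.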

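Finally, it's interesting to note that unlines and unwords are both really just variations of intercalate, which we learned about last month. For unwords the match is exact. For unlines it is only approximate, because the real unlines also puts a newline after the last line.

unlines :: [String] -> String
unlines = concatMap (++ "\n")  -- like intercalate "\n", plus a newline after the last line

unwords :: [String] -> String
unwords = intercalate " "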
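To see how close the intercalate versions are to the Prelude functions, here is a small sketch. The names spaceJoin, newlineJoin, matchesUnwords and matchesUnlines are made up for this example so that nothing clashes with the Prelude.

```haskell
import Data.List (intercalate)

-- Hypothetical names for this example.
spaceJoin :: [String] -> String
spaceJoin = intercalate " "

newlineJoin :: [String] -> String
newlineJoin = intercalate "\n"

-- True when spaceJoin agrees with the Prelude's unwords on the given list.
matchesUnwords :: [String] -> Bool
matchesUnwords ws = spaceJoin ws == unwords ws

-- For a non-empty list, the Prelude's unlines equals newlineJoin plus one
-- trailing newline. For the empty list, unlines gives the empty string.
matchesUnlines :: [String] -> Bool
matchesUnlines [] = unlines [] == ""
matchesUnlines ls = newlineJoin ls ++ "\n" == unlines ls
```

>> newlineJoin ["Hello", "there"]
"Hello\nthere"
>> unlines ["Hello", "there"]
"Hello\nthere\n"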

For more tips on string manipulation in Haskell, make sure to come back next week! We'll start exploring the different String types that exist in Haskell! If you're interested in problem solving in Haskell, you should try our free Recursion Workbook! It explains all the ins and outs of recursion and includes 10 different practice problems!
