When Strings get Word-y
Earlier this week we explored the lines and unlines functions, which allow us to split up strings based on where the newline characters are and then glue them back together if we want. But a lot of times the newline isn't the character you're worried about; it's actually just a normal space! How can we take a string that is a single line and then separate it into its words?
Again, we have very simple functions to turn to: words and unwords:
words :: String -> [String]
unwords :: [String] -> String
These do exactly what you think they do. We can take a string with several words in it and break it up into a list that excludes all whitespace.
>> words "Hello there my friend"
["Hello","there","my","friend"]
And then we can recombine that list into a single string using unwords:
>> unwords ["Hello", "there", "my", "friend"]
"Hello there my friend"
This pattern is quite helpful when solving problems of the kind you'll see on HackerRank. Your input will often have a particular format like, "the first line has 'n' and 'k', which are the number of cases and the case size, and then there are 'n' lines with 'k' elements in them." So this might look like:
5 3
1 2 3
4 5 6
7 8 9
10 11 12
13 14 15
Then words is very useful for splitting each line. For example:
readFirstLine :: IO (Int, Int)
readFirstLine = do
  -- Split the line on whitespace, then parse each piece as an Int
  firstNumbers <- map read . words <$> getLine
  case firstNumbers of
    (n : k : _) -> return (n, k)
    _ -> error "Invalid input!"
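To show how the same idea extends to the rest of the input, here is a minimal sketch that parses the header line and the rows that follow it. The helper names parseInts and parseCases are hypothetical (they are not from any library), and the sketch assumes the whole input is already available as one String. It uses readMaybe from Text.Read so that malformed numbers produce Nothing instead of a runtime error.

```haskell
import Text.Read (readMaybe)

-- Hypothetical helper: parse one whitespace-separated line of integers.
-- Returns Nothing if any piece is not a valid Int.
parseInts :: String -> Maybe [Int]
parseInts = traverse readMaybe . words

-- Hypothetical helper: parse the "n k" header and the n rows after it.
parseCases :: String -> Maybe ((Int, Int), [[Int]])
parseCases input = case lines input of
  [] -> Nothing
  (header : rest) -> do
    hdr <- parseInts header
    case hdr of
      [n, k] -> do
        rows <- traverse parseInts (take n rest)
        -- Reject input with too few rows or rows of the wrong size.
        if length rows == n && all ((== k) . length) rows
          then Just ((n, k), rows)
          else Nothing
      _ -> Nothing

main :: IO ()
main = print (parseCases "2 3\n1 2 3\n4 5 6\n")
-- prints: Just ((2,3),[[1,2,3],[4,5,6]])
```

Here lines separates the rows and words separates the elements within each row, so the two pairs of functions work together naturally.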
As with lines and unlines, these functions aren't inverses, properly speaking. With unwords, the separator will always be a single space character. However, if there are multiple spaces in the original string, words will still produce the same list, so the extra spaces are lost in the round trip.
>> words "Hello    there  my   friend"
["Hello","there","my","friend"]
>> unwords (words "Hello    there  my   friend")
"Hello there my friend"
In fact, words will separate based on any whitespace characters, including tabs and even newlines!
>> words "Hello \t there \n my \n\t friend"
["Hello","there","my","friend"]
So when each line holds a single word, it could actually do the same job as lines, but then reversing the process would require unlines rather than unwords.
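One practical consequence of this behavior is that composing the two functions collapses every run of whitespace into a single space and trims both ends. The sketch below illustrates this; the name normalizeSpaces is a hypothetical helper introduced only for this example.

```haskell
-- Hypothetical helper: collapse runs of whitespace into single spaces
-- and drop leading and trailing whitespace.
normalizeSpaces :: String -> String
normalizeSpaces = unwords . words

main :: IO ()
main = do
  print (words "  Hello \t there \n")            -- ["Hello","there"]
  print (normalizeSpaces "  Hello \t there \n")  -- "Hello there"
  print (words "   ")                            -- []
```

Note that a string containing only whitespace yields an empty list, so there are never empty strings in the result of words.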
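Finally, it's interesting to note that unlines and unwords are both closely related to intercalate, which we learned about last month. unwords behaves exactly like intercalating a single space. unlines is slightly different: it appends a newline after every string, including the last one, so it is not quite the same as intercalating "\n".
unwords :: [String] -> String
unwords = intercalate " "
unlines :: [String] -> String
unlines = concatMap (++ "\n")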
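A quick way to check how these functions relate to intercalate is to compare their results directly. The following sketch uses only Prelude functions plus intercalate from Data.List, and the sample list is made up for illustration.

```haskell
import Data.List (intercalate)

main :: IO ()
main = do
  let xs = ["a", "b"]
  print (unwords xs == intercalate " " xs)      -- True
  print (unlines xs)                            -- "a\nb\n"
  print (intercalate "\n" xs)                   -- "a\nb"
  print (unlines xs == concatMap (++ "\n") xs)  -- True
```

The trailing newline from unlines is the one place where the two results differ.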
For more tips on string manipulation in Haskell, make sure to come back next week! We'll start exploring the different String types that exist in Haskell! If you're interested in problem solving in Haskell, you should try our free Recursion Workbook! It explains all the ins and outs of recursion and includes 10 different practice problems!