Finding What You Seek
In our last couple of articles, we've gone through the basics of how to use Handles. This useful abstraction not only lets us access files in a stateful way, but also lets us treat terminal streams (standard in, standard out) the same way we treat files. This week we'll learn a few tricks that are a little more specific to handles on files. File handles are seekable, meaning we can move around where we are "pointing" in the file, similar to moving the position of a video recording.
To understand how this works, we should first make a note, if it isn't clear already, that a Handle is a stateful object. The handle points to the file, but it also tracks information about where it is in the file. For example, let's define a file:
First Line
Second Line
Third Line
Fourth Line
...
We can have two different functions that will print from a handle. One will print a single line, the other will print two lines.
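import System.IO

-- Read one line from the handle and echo it to the terminal.
printOneLine :: Handle -> IO ()
printOneLine h = hGetLine h >>= putStrLn

-- Read and echo the next two lines from the handle.
printTwoLines :: Handle -> IO ()
printTwoLines h = do
  hGetLine h >>= putStrLn
  hGetLine h >>= putStrLn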
If we call these back to back on our file, it will print the first three lines of the file, rather than re-printing the first line.
main :: IO ()
main = do
  h <- openFile "testfile.txt" ReadMode
  printOneLine h
  printTwoLines h
  hClose h
...
>> stack exec io-program
First Line
Second Line
Third Line
This is because the state of h carries over after the first call. So the handle remembers that it is now pointing at the second line.
Now you might wonder, if this computation is stateful, why doesn't it use the State monad? It turns out the IO monad already is its own "state" monad. However, the "state" in this case is the state of the whole operating system! Or we can even think of IO as tracking the state of "the whole outside world". This is why IO is considered so impure: the "state of the whole outside world" can change with every single call!
We can illustrate most plainly how the state has changed by printing the position of the handle. This is accessible through the function hGetPosn, which gives us an item of type HandlePosn. We can also use hTell to give us this value as an integer.
hGetPosn :: Handle -> IO HandlePosn
hTell :: Handle -> IO Integer
Let's see the position at different points in our program.
main :: IO ()
main = do
  h <- openFile "testfile.txt" ReadMode
  hGetPosn h >>= print
  hTell h >>= print
  printOneLine h
  hGetPosn h >>= print
  hTell h >>= print
  printTwoLines h
  hGetPosn h >>= print
  hTell h >>= print
  hClose h
...
>> stack exec io-program
{handle: testfile.txt} at position 0
0
First Line
{handle: testfile.txt} at position 11
11
Second Line
Third Line
{handle: testfile.txt} at position 34
34
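Since hTell gives us a plain Integer, we can subtract two readings to see how far a read moved the handle. Here is a minimal sketch of that idea. The name lineWithWidth is hypothetical (it is not part of System.IO), and the sketch assumes a seekable handle on a plain ASCII file with Unix-style newlines, so that positions line up with character counts.

```haskell
import System.IO

-- Hypothetical helper: read one line and report how far the handle advanced.
-- Assumes the handle is seekable (hTell throws otherwise).
lineWithWidth :: Handle -> IO (String, Integer)
lineWithWidth h = do
  before <- hTell h      -- position before the read
  line <- hGetLine h     -- consumes the line and its trailing newline
  after <- hTell h       -- position after the read
  return (line, after - before)
```

On our test file, the first call should report a width of 11: ten characters for "First Line" plus the newline, which matches the position we printed above.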
We can manipulate this position in a number of different ways, but they all depend on the file being seekable. By and large, read and write file handles are seekable, while the terminal handles are not. As we'll see, "append" handles are also not seekable.
hIsSeekable :: Handle -> IO Bool

main :: IO ()
main = do
  hIsSeekable stdin >>= print
  hIsSeekable stdout >>= print
  h <- openFile "testfile.txt" ReadMode
  hIsSeekable h >>= print
  hClose h
...
>> stack exec io-program
False
False
True
Note: you'll get an error if you even query a closed handle for whether or not it is seekable.
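If that error is inconvenient, one option is to catch it and treat the handle as "not seekable". The following is only a sketch of that approach. The name seekableOrFalse is hypothetical, and swallowing every IOException here is a simplifying assumption rather than a recommendation for all programs.

```haskell
import Control.Exception (IOException, try)
import System.IO

-- Hypothetical helper: like hIsSeekable, but returns False instead of
-- throwing when the handle is closed (or the query fails for another
-- IO-related reason).
seekableOrFalse :: Handle -> IO Bool
seekableOrFalse h = do
  result <- try (hIsSeekable h) :: IO (Either IOException Bool)
  return (either (const False) id result)
```

With this wrapper, a handle opened in ReadMode should give True while open and False after hClose, and a handle opened in AppendMode should give False even while open.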
So how do we change the position? The first way is through hSetPosn.
hSetPosn :: HandlePosn -> IO ()
This lets us go back to a previous position we saw. So in this example, we'll read one line and save that position. Then we'll read two more lines, go back to the previous position, and read one line again. Because the HandlePosn object relates both to the numeric position AND the specific handle, we don't need to specify the Handle again in the function call.
main :: IO ()
main = do
  h <- openFile "testfile.txt" ReadMode
  printOneLine h
  p <- hGetPosn h
  printTwoLines h
  hSetPosn p
  printOneLine h
  hClose h
...
>> stack exec io-program
First Line
Second Line
Third Line
Second Line
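The same save-and-restore pattern can be wrapped up as a "peek" operation that looks at the next line without consuming it. This is just a sketch: peekLine is a hypothetical name, and it assumes the handle is seekable and not at the end of the file (hGetLine throws at EOF).

```haskell
import System.IO

-- Hypothetical helper: return the next line but leave the handle where it was.
peekLine :: Handle -> IO String
peekLine h = do
  p <- hGetPosn h     -- remember where we are
  line <- hGetLine h  -- this read advances the handle...
  hSetPosn p          -- ...so we put it back afterwards
  return line
```

Calling peekLine twice in a row should give the same line both times, and a following hGetLine should return that line once more.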
We can do various tricks with hSeek, which takes the handle and an integer position. It also takes a SeekMode. This tells us if the integer refers to an "absolute" position in the file, a position "relative" to the current position, or even a position relative to the end.
data SeekMode
  = AbsoluteSeek
  | RelativeSeek
  | SeekFromEnd

hSeek :: Handle -> SeekMode -> Integer -> IO ()
In this example we'll read the first line, advance the seek position a few characters (which will cut off what we see of the second line), and then go back to the start.
main :: IO ()
main = do
  h <- openFile "testfile.txt" ReadMode
  printOneLine h
  hSeek h RelativeSeek 4
  printTwoLines h
  hSeek h AbsoluteSeek 0
  printOneLine h
  hClose h
...
>> stack exec io-program
First Line
nd Line
Third Line
First Line
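The example above doesn't exercise SeekFromEnd when reading, so here is a small sketch of it. With SeekFromEnd, a negative offset moves backwards from the end of the file. The helper name readFromEnd is hypothetical. It uses hFileSize to clamp the offset, since seeking to before the start of the file is an error, and it assumes an ASCII file so that we never land in the middle of a multi-byte character.

```haskell
import System.IO

-- Hypothetical helper: jump to (at most) n bytes before the end of the file
-- and read one line from there. Throws at EOF if nothing is left to read,
-- for example when n is 0 or the file is empty.
readFromEnd :: Handle -> Integer -> IO String
readFromEnd h n = do
  size <- hFileSize h
  -- Clamp so we never try to seek before the start of the file.
  hSeek h SeekFromEnd (negate (min n size))
  hGetLine h
```

If the file ends with "Fourth Line" and a newline, then readFromEnd h 12 should return "Fourth Line", since that line plus its newline takes up exactly 12 bytes.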
We can also seek when writing to a file. As always with WriteMode, there's a gotcha. In this example, we'll write our first line, go back to the start, write another line, and then write a final line at the end.
main :: IO ()
main = do
  h <- openFile "testfile.txt" WriteMode
  hPutStrLn h "The First Line"
  hSeek h AbsoluteSeek 0
  hPutStrLn h "Second Line"
  hSeek h SeekFromEnd 0
  hPutStrLn h "Third Line"
  hClose h
The result file is a little confusing though!
Second Line
ne
Third Line
We overwrote most of the first line we wrote, instead of inserting at the beginning! All that's left of "The First Line" is the "ne" and the newline character!
We might hope to fix this by using AppendMode, but it doesn't work! This mode makes the assumption that you are only writing new information to the end of a file. Therefore, append handles are not seekable.
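So if seeking can't insert at the beginning, what can? One simple approach (by no means the only one) is to read the existing contents, then rewrite the file with the new line in front. Here's a minimal sketch. The name prependLine is hypothetical, and the approach assumes the file is small enough to hold in memory.

```haskell
import System.IO

-- Hypothetical helper: put a new line at the front of a file by rewriting it.
prependLine :: FilePath -> String -> IO ()
prependLine path newLine = do
  old <- readFile path
  -- readFile is lazy, so force the whole string first. Reaching the end of
  -- the input closes the underlying handle, which lets us reopen the file
  -- for writing.
  length old `seq` writeFile path (newLine ++ "\n" ++ old)
```

Starting from a file that contains only "The First Line", calling prependLine "testfile.txt" "Second Line" should leave "Second Line" on the first line with "The First Line" intact below it.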
If you're just writing application level code, you probably don't need to worry too often about these subtleties. But if you have any desire to write a low-level library, you'll need to know about all these specific mechanics! Stay tuned for more IO-related content in the coming weeks. If you want to stay up to date, make sure to subscribe to our monthly newsletter! You'll get access to our subscriber resources, which includes a lot of great beginner materials!