Haskell return lazy string from file IO


Here I'm back again with a (for me) really strange behaviour of my newest masterpiece...

This code should read a file, but it doesn't:

import System.IO

readCsvContents :: String -> IO ( String )
readCsvContents fileName = do
     withFile fileName ReadMode (\handle -> do
          contents <- hGetContents handle
          return contents)

main = do
    contents <- readCsvContents "src\\EURUSD60.csv"
    putStrLn ("Read " ++ show (length contents) ++ " Bytes input data.")

The result is

Read 0 Bytes input data.

Now I changed the first function and added a putStrLn:

readCsvContents :: String -> IO ( String )
readCsvContents fileName = do
     withFile fileName ReadMode (\handle -> do
          contents <- hGetContents handle
          putStrLn ("hGetContents gave " ++ show (length contents) ++ " Bytes of input data.")
          return contents)

and the result is

hGetContents gave 3479360 Bytes of input data.
Read 3479360 Bytes input data.

WTF ??? Well, I know, Haskell is lazy. But I didn't know I had to kick it in the butt like this.


You're right, this is a pain. Avoid the old standard file IO module (the lazy, handle-based functions in System.IO) for this reason, except when you simply want to read an entire file that won't change, as you did; readFile does that just fine.

readCsvContents :: FilePath -> IO String
readCsvContents fileName = do
   contents <- readFile fileName
   return contents

Note that, by the monad laws, this is exactly the same [1] as

readCsvContents = readFile
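
To spell out the monad-law step, here is a small sketch of the rewriting chain. The four function names are hypothetical and exist only to give each stage its own name; all of them should behave identically.

```haskell
-- Stage 0: the do-block version from above.
readCsvDo :: FilePath -> IO String
readCsvDo fileName = do
  contents <- readFile fileName
  return contents

-- Stage 1: the same do-block with the notation desugared to (>>=).
readCsvBind :: FilePath -> IO String
readCsvBind fileName = readFile fileName >>= \contents -> return contents

-- Stage 2: (\contents -> return contents) is just return, and the
-- right-identity law says m >>= return == m, so the bind disappears.
readCsvIdentity :: FilePath -> IO String
readCsvIdentity fileName = readFile fileName

-- Stage 3: eta-reduce away the fileName argument.
readCsvPointFree :: FilePath -> IO String
readCsvPointFree = readFile
```

Only the right-identity law and eta-reduction are used here; nothing about the laziness of readFile changes along the way.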

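The problem with what you tried is that withFile closes the handle unconditionally as soon as its action returns, without checking whether lazy evaluation of contents has actually forced the file to be read. hGetContents only hands back a lazy string, so in your first version nothing had been read yet when the handle was closed, and length later saw an empty string. In your second version, the putStrLn inside withFile forced length contents while the handle was still open, so the whole file got read in time. That is of course horrible; I would never bother to use handles myself. readFile avoids the problem by tying the closing of the handle to the lazy string itself: the handle is closed once the contents have been fully consumed or the result is garbage-collected. This isn't altogether nice either, but it often works quite well.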
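
If you do want to keep withFile, one way around the early close is to force the string before the action returns. The following is a minimal sketch, not the only fix: readCsvContentsStrict is a hypothetical name, and it assumes the file fits comfortably in memory.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Hypothetical strict variant of readCsvContents: the whole file is
-- read while the handle is still open, so closing it afterwards is safe.
readCsvContentsStrict :: FilePath -> IO String
readCsvContentsStrict fileName =
  withFile fileName ReadMode $ \handle -> do
    contents <- hGetContents handle
    -- length has to walk the entire list, which forces every character
    -- to be read from the file; evaluate makes that happen right here.
    _ <- evaluate (length contents)
    return contents
```

This does explicitly what the extra putStrLn in the question did by accident: it demands the full length of contents inside withFile, before the handle is closed.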

For proper work with file IO, check out either the conduit or pipes library. The former focuses a bit more on performance, the latter more on elegance (but really, the difference isn't that big).

[1] And your first try is the same as readCsvContents fn = withFile fn ReadMode hGetContents.


