How to Make ChatGPT Go Around in Circles (with GHC and Haskell)

As part of my research for the recently released (and free!) Setup.hs course, I wanted to explore the different kinds of compilation commands you can run with GHC outside the context of a build system.

I wanted to know…

Can I use GHC to compile a Haskell module without its dependent source files?

The answer, obviously, should be yes. When you use Stack or Cabal to get dependencies from Hackage, you aren't downloading and recompiling all the source files for those libraries.

And I eventually managed to do it. It doesn't seem hard once you know the commands already:

$ mkdir bin
$ ghc src/MyStrings.hs src/MyFunction.hs -hidir ./bin -odir ./bin
$ ghc -c app/Main.hs -i./bin -hidir ./bin -odir ./bin
$ ghc bin/Main.o ./bin/MyStrings.o ./bin/MyFunction.o -o ./bin/hello
$ ./bin/hello
...

But, being unfamiliar with the inner workings of GHC, I struggled for a while to find this exact combination of commands, especially with their arguments.

So, like I did last week, I turned to the latest tool in the developer's toolkit: ChatGPT. But once again, everyone's new favorite pair programmer had some struggles of its own on the topic! So let's start by defining exactly the problem we're trying to solve.

The Setup

Let's start with a quick look at our initial file tree.

├── app
│   └── Main.hs
└── src
    ├── MyFunction.hs
    └── MyStrings.hs

This is meant to look the way I would organize my code in a Stack project. We have two "library" modules in the src directory, and one executable module in the app directory that will depend on the library modules. These files are all very simple:

-- src/MyStrings.hs
module MyStrings where

greeting :: String
greeting = "Hello"

-- src/MyFunction.hs
module MyFunction where

modifyString :: String -> String
modifyString x = base <> " " <> base
  where
    base = tail x <> [head x]

-- app/Main.hs
module Main where

import MyStrings (greeting)
import MyFunction (modifyString)

main :: IO ()
main = putStrLn (modifyString greeting)

Our goal is to compile and run the executable with two constraints:

1. Use only GHC (no Stack or Cabal involved)

2. Compile the library separately, so that the Main module could be compiled using only the library's build artifacts, and not the source files.

Trying to Compile

Now, there are two easy ways to compile this code if we're willing to violate our constraints (particularly the second one). If all three files are in the same directory, GHC can immediately find the modules we're importing, so we can just call ghc Main.hs.

└── src
    ├── Main.hs
    ├── MyFunction.hs
    └── MyStrings.hs

...

$ cd src
$ ghc Main.hs
$ ./Main
elloH elloH

If we keep the original file structure, it's still pretty easy if we're willing to violate constraint 2 by using the library source files in the ghc command:

$ ghc app/Main.hs src/MyStrings.hs src/MyFunction.hs
$ ./Main
elloH elloH

However, with my extra constraint of not using the library source files directly, things got a little more difficult. I originally thought "Hey, let's just include the build artifacts in the GHC call." Each library module generates an interface file (.hi) and an object file (.o). So let's try including those in the GHC call:

$ ghc src/MyStrings.hs src/MyFunction.hs
$ ls src/
MyFunction.hi MyFunction.hs MyFunction.o MyStrings.hi MyStrings.hs MyStrings.o
$ ghc app/Main.hs src/MyFunction.hi src/MyFunction.o src/MyStrings.hi src/MyStrings.o
...
Could not find module 'MyStrings'
...
Could not find module 'MyFunction'

As we can see, it doesn't work! Even though we're including the build artifacts as direct arguments of ghc, this doesn't help our main module to find them.

So, feeling a little too lazy to dig deep through GHC documentation, and feeling like this problem might have too many arbitrary constraints to phrase in a Google query, I decided to try our aforementioned new favorite tool.

Enter: ChatGPT

The First Session

(Conversation has been abbreviated for relevance)

Me

How do I link existing .o files with a new source module using the 'ghc' command?

ChatGPT

To link existing object (.o) files with a new source module using the ghc command, you need to include the object files in the list of input files passed to the ghc command.

Assuming you have an existing .o file called example.o and a new source module called MyModule.hs, you can link them together into an executable program using the following command:

$ ghc -o MyProgram MyModule.hs example.o

This command tells ghc to compile and link the MyModule.hs source file with the example.o object file and create an executable program called MyProgram. The -o flag specifies the output file name.

Me

Is it possible to use a ghc command to link a Haskell object file (.o) with a Haskell source file that imports the other module?

ChatGPT

Yes, it is possible to use the ghc command to link a Haskell object file (.o) with a Haskell source file that imports the other module. Here's an example:

Suppose you have two files, MyModule.hs and OtherModule.o, and MyModule.hs imports a module that is defined in OtherModule.o. To link the two files into an executable program, you can use the following command:

$ ghc -o MyProgram MyModule.hs OtherModule.o

This command tells ghc to compile and link the MyModule.hs source file with the OtherModule.o object file, and create an executable program called MyProgram.

Me

Hmmm I don't think this works. I created the following file OtherModule.hs:

module OtherModule where

greeting :: String
greeting = "Hello"

And I'm trying to include it in another file:

module Main where

import OtherModule (greeting)

main :: IO ()
main = putStrLn greeting

I started by compiling OtherModule:

$ ghc src/OtherModule.hs

Then I moved OtherModule.o to the app directory (with MyModule.hs), and tried running this command:

$ cd app
$ ghc -o MyProgram MyModule.hs OtherModule.o

I get the error:

"Could not find module 'OtherModule'"

ChatGPT

I apologize for the confusion in my previous answer. You are correct that simply linking the object file with the executable does not work in this case.

When GHC compiles a module, it generates an interface file (.hi) that contains information about the module's name, type signatures, and other details. The compiled object file (.o) contains the compiled machine code for the functions defined in the module.

When you try to import a module in a source file, GHC looks for the interface file (OtherModule.hi) to find information about the module's name and type signatures. If the interface file is not found, GHC will raise an error.

In your case, when you compile OtherModule.hs, GHC generates an interface file called OtherModule.hi. To compile MyModule.hs, GHC needs to be able to find OtherModule.hi in order to resolve the import of OtherModule. Simply linking the OtherModule.o object file is not enough.

To make this work, you need to compile OtherModule.hs and MyModule.hs together, like this:

ghc -o MyProgram MyModule.hs src/OtherModule.hs

This command tells GHC to compile MyModule.hs and OtherModule.hs together, and create an executable program called MyProgram.

I hope this clears up the confusion. Again, I apologize for the confusion in my previous answer.

Me

Ah, so I need to include both source files, and it can't be done by linking an object file with another source file?

ChatGPT

Yes, that's correct. When you import a module in a source file, GHC needs to be able to find the interface file (.hi) for that module in order to resolve the import. If the interface file is not found, GHC will raise an error.

When you compile a source file with GHC, it generates both an object file (.o) and an interface file (.hi). The object file contains the compiled machine code for the functions defined in the module, while the interface file contains information about the module's name, type signatures, and other details.

When you compile a source file that imports another module, GHC needs to be able to find the interface file for that module in order to compile the importing module. If the interface file is not found, GHC will raise an error.

So in your case, you need to compile both MyModule.hs and OtherModule.hs together so that GHC can generate the necessary interface files and link the resulting object files together to produce the executable program.

It is not possible to directly link an object file with a source file that imports it, as GHC needs the interface file to be present in order to compile the importing module.

Analysis

Suspiciously, ChatGPT suggests a less detailed command than the previous one that failed for me!

$ ghc -o MyProgram MyModule.hs OtherModule.o

It includes only the object file, instead of both the object and interface files. And of course, this fails, as I showed above.

Now there's an interesting statement that will come into play later, though I didn't realize it at the time!

Yes, that's correct. When you import a module in a source file, GHC needs to be able to find the interface file (.hi) for that module in order to resolve the import. If the interface file is not found, GHC will raise an error.

I assumed that to "find" the interface file meant providing it on the command line (as in my original command). So with that idea having failed, I fixated on one part of the final statement:

It is not possible to directly link an object file with a source file that imports it...

And concluded that getting the right information from ChatGPT might be a lost cause. It seemed like it was sure this was possible, and then just as sure that it was impossible. But details matter, as we'll see!

Working Around It

So feeling slightly defeated, I decided to poke around the directories where compiled packages are actually stored to see if there was a pattern I could follow. And I managed to cobble together a convoluted workaround.

What I observed was that GHC could use any module from a library registered in its package database. On my system, the global database lives inside the GHC installation that GHCup manages:

$ cd ~/.ghcup/ghc/9.2.5/lib/ghc-9.2.5/
$ ls
array-0.5.4.0/
base-4.16.4.0/
...
package.conf.d/
$ ls package.conf.d
array-0.5.4.0.conf
base-4.16.4.0.conf
...

And indeed, the package directories did not have source files, only various forms of compiled object and interface files. And each package had a corresponding .conf file in the package.conf.d directory.

So I could...

  1. Make my own directory in the package database (MyPackage-0.1.0.0)
  2. Place the .hi and .o files in there
  3. Create a phony MyPackage-0.1.0.0.conf and put it in package.conf.d. (Base it on an existing .conf file but make some substitutions)

And this...technically...works. But manually manipulating a package database wasn't the solution I had in mind. Surely there's something easier, right?

The Second Session

After a week of sitting on this, I decided to come back to ChatGPT and see if it had any other ideas. I decided to frame my question in a different way, stating a lot more details up front. Lo and behold, it had a different answer this time!

Me

Suppose I have a Haskell module called MyStrings:

module MyStrings where

greeting :: String
greeting = "Hello"

I can compile this module:

$ ghc MyStrings.hs

This generates an interface file, MyStrings.hi, and an object file, MyStrings.o.

Now suppose I have an executable file, Main.hs, that imports MyStrings:

module Main where

import MyStrings (greeting)

main :: IO ()
main = putStrLn greeting

Can I use ghc to compile this Main file by using the .hi and .o files, without including the imported source file, MyStrings.hs in the command?

ChatGPT

Yes, you can use the .hi and .o files generated by compiling MyStrings.hs to compile Main.hs without including MyStrings.hs in the command. To do so, you can use the -c option to tell GHC to compile Main.hs to an object file, and then use the -o option to link the object files together into an executable:

$ ghc -c Main.hs
$ ghc Main.o MyStrings.o -o myprogram

The first command compiles Main.hs to an object file Main.o, without attempting to link it. The second command links Main.o with MyStrings.o and creates an executable called myprogram. The .hi files are not needed for linking, so you don't need to include them in the command.

Note that in order for this to work, the interface file MyStrings.hi must be present in the same directory as Main.hs when you compile it. If you move MyStrings.hi to a different directory, you'll need to use the -i option to tell GHC where to find it:

$ ghc -i/path/to/MyStrings -c Main.hs
$ ghc Main.o /path/to/MyStrings/MyStrings.o -o myprogram

Here, -i/path/to/MyStrings tells GHC to look for MyStrings.hi in the directory /path/to/MyStrings.

Analysis

So there are two key ingredients that weren't mentioned at all in our first pass: the -c option and the -i option.

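Using -c performs compilation without linking. It also switches GHC from its default --make mode, which wants the source file for every imported module, to one-shot mode, which looks for each imported module's interface file instead. So GHC is able to produce an object file for our Main module without needing its dependent objects, and .o files are unnecessary at this stage.

We still need the .hi files though. But instead of listing them on the command line, we use the -i argument to name the directory that contains them, which adds that directory to GHC's import search path. It's an odd argument, because we put the path right after the i, without any space.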

After we're done with the first phase, then we can link all our object files together.

Solving It

And sure enough, this approach works!

$ ghc src/MyStrings.hs src/MyFunction.hs
$ ghc -c app/Main.hs -i./src
$ ghc app/Main.o ./src/MyStrings.o ./src/MyFunction.o -o hello
$ ./hello
elloH elloH

And if we want to be a little cleaner about putting our artifacts in a single location, we can use the -hidir and -odir arguments for storing everything in a bin directory.

$ mkdir bin
$ ghc src/MyStrings.hs src/MyFunction.hs -hidir ./bin -odir ./bin
$ ghc -c app/Main.hs -i./bin -hidir ./bin -odir ./bin
$ ghc bin/Main.o ./bin/MyStrings.o ./bin/MyFunction.o -o ./bin/hello
$ ./bin/hello
elloH elloH

And we're done! Our program is compiling as we wanted it to, without our "Main" compilation command directly using the library source files.

Conclusion

So with that fun little adventure concluded, what can we learn from this? Well, first of all, prompts matter a great deal when you're using a chatbot. The more detailed your prompt, and the more you spell out your assumptions, the more likely you'll get the answer you're looking for. My second prompt was waaay more detailed than my first prompt, and the solution was much better as a result.

But a more pertinent lesson for Haskellers might be that using GHC by itself can be a big pain. So if you're a beginner, you might be asking:

What's the normal way to build Haskell code?

You can learn all about building and running your Haskell code in our new free course, Setup.hs. This course will teach you the easy steps to set up your Haskell toolchain, and show you how to build and run your code using Stack, Haskell's most popular build system. You'll even learn how to get Haskell integrations in several popular code editors so you can learn from your mistakes much more quickly. Learn more about it on the course page.

And if you subscribe to our monthly newsletter, you'll get a code for 20% off any of our paid courses until May 1st! So don't miss out on that offer!
