James Bowen 4/17/23

This is How to Build Haskell with GNU Make (and why it's worth trying)

In a previous article I showed the GHC commands you need to compile a basic Haskell executable without explicitly using the source files from its dependencies. But when you're writing your own Haskell code, 99% of the time you want to be using a Haskell build system like Stack or Cabal for your compilation needs instead of writing your own GHC commands. (And you can learn how to use Stack in my new free course, Setup.hs).

But part of my motivation for solving that problem was that I wanted to try an interesting experiment:

How can I build my Haskell code using GNU Make?

GNU Make is a generic build system that allows you to specify components of your project, map out their dependencies, and dictate how your build artifacts are generated and run.

I wanted to structure my source code the same way I would in a Cabal-style application, but rely on GNU Make to chain together the necessary GHC compilation commands. I did this to help gain a deeper understanding of how a Haskell build system could work under the hood.

In a Haskell project, we map out our project structure in the .cabal file. When we use GNU Make, our project is mapped out in the makefile. Here's the Makefile we'll ultimately be constructing:

GHC = ~/.ghcup/ghc/9.2.5/bin/ghc
BIN = ./bin
EXE = ${BIN}/hello

LIB_DIR = ${BIN}/lib
SRCS = $(wildcard src/*.hs)
LIB_OBJS = $(wildcard ${LIB_DIR}/*.o)

library: ${SRCS}
  @mkdir -p ${LIB_DIR}
  @${GHC} ${SRCS} -hidir ${LIB_DIR} -odir ${LIB_DIR}

generate_run: app/Main.hs library
  @mkdir -p ${BIN}
  @cp ${LIB_DIR}/*.hi ${BIN}
  @${GHC} -i${BIN} -c app/Main.hs -hidir ${BIN} -odir ${BIN}
  @${GHC} ${BIN}/Main.o ${LIB_OBJS} -o ${EXE}

run: generate_run
  @${EXE}

TEST_DIR = ${BIN}/test
TEST_EXE = ${TEST_DIR}/run_test

generate_test: test/Spec.hs library
  @mkdir -p ${TEST_DIR}
  @cp ${LIB_DIR}/*.hi ${TEST_DIR}
  @${GHC} -i${TEST_DIR} -c test/Spec.hs -hidir ${TEST_DIR} -odir ${TEST_DIR}
  @${GHC} ${TEST_DIR}/Main.o ${LIB_OBJS} -o ${TEST_EXE}

test: generate_test
  @${TEST_EXE}

clean:
  rm -rf ./bin

Over the course of this article, we'll build up this solution piece-by-piece. But first, let's understand exactly what Haskell code we're trying to build.

Our Source Code

We want to lay out our files like this, separating our source code (/src directory), from our executable code (/app) and our testing code (/test):

.
├── app
│   └── Main.hs
├── makefile
├── src
│   ├── MyFunction.hs
│   └── MyStrings.hs
└── test
    └── Spec.hs

Here's the source code for our three primary files:

-- src/MyStrings.hs
module MyStrings where

greeting :: String
greeting = "Hello"

-- src/MyFunction.hs
module MyFunction where

modifyString :: String -> String
modifyString x = base <> " " <> base
  where
    base = tail x <> [head x]

-- app/Main.hs
module Main where

import MyStrings (greeting)
import MyFunction (modifyString)

main :: IO ()
main = putStrLn (modifyString greeting)
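One caveat about modifyString before moving on: head and tail are partial functions, so calling it on an empty string raises an exception at runtime. That doesn't matter for this build experiment, but here's a minimal sketch of a total variant using pattern matching. The name modifyStringSafe is hypothetical and isn't used anywhere else in this article.

```haskell
module Main where

-- Hypothetical total variant of modifyString: the empty string is
-- returned unchanged instead of crashing on `head []`.
modifyStringSafe :: String -> String
modifyStringSafe [] = []
modifyStringSafe (c : rest) = base <> " " <> base
  where
    -- Rotate the first character to the end, as in the original.
    base = rest <> [c]

main :: IO ()
main = do
  putStrLn (modifyStringSafe "Hello") -- prints: elloH elloH
  print (modifyStringSafe "")         -- prints: ""
```

For non-empty input this behaves the same as the original definition, so the rest of the article would work with either version.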

And here's what our simple "Spec" test looks like. It doesn't use a testing library; it just prints different messages depending on whether or not we get the expected output from modifyString.

-- test/Spec.hs
module Main where

import MyFunction (modifyString)

main :: IO ()
main = do
  test "abcd" "bcda bcda"
  test "Hello" "elloH elloH"

test :: String -> String -> IO ()
test input expected = do
  let actual = modifyString input
  putStr $ "Testing case: " <> input <> ": "
  if expected /= actual
    then putStrLn $ "Incorrect result! Expected: " <> expected <> " Actual: " <> actual
    else putStrLn "Correct!"

The files are laid out the way we would expect for a basic Haskell application. We have our "library" code in the src directory. We have a single "executable" in the app directory. And we have a single "test suite" in the test directory. Instead of having a Project.cabal file at the root of our project, we'll have our makefile. (At the end, we'll actually compare our Makefile with an equivalent .cabal file).

But what does the Makefile look like? Well it would be overwhelming to construct it all at once. Let's begin slowly by treating our executable as a single file application.

Running a Single File Application

So for now, let's adjust Main.hs so it's an independent file without any dependencies on our library modules:

-- app/Main.hs
module Main where

main :: IO ()
main = putStrLn "Hello"

The simplest way to run this file is runghc. So let's create our first makefile rule that will do this. A rule has a name, a set of prerequisites, and then a set of commands to run. We'll call our rule run, and have it use runghc on app/Main.hs. We'll also include the app/Main.hs as a prerequisite, since the rule will run differently if that file changes.

run: app/Main.hs
  runghc app/Main.hs

And now we can run this rule using make run, and it will work!

$ make run
runghc app/Main.hs
Hello

Notice that it prints the command we're running. We can change this by using the @ symbol in front of the command in our Makefile. We'll do this with almost all our commands:

run: app/Main.hs
  @runghc app/Main.hs

And it now runs our application without printing the command.

Using runghc is convenient, but if we want to use dependencies from different directories, we'll eventually need to use multiple stages of compilation. So we'll want to create two distinct rules: one that generates the executable using ghc, and another that actually runs the generated executable.

So let's create a generate_run rule that will produce the build artifacts, and then run will use them.

generate_run: app/Main.hs
  @ghc app/Main.hs

run: generate_run
  @./app/Main

Notice that run can depend on generate_run as a prerequisite, instead of the source file now. This also generates three build artifacts directly in our app directory: the interface file Main.hi, the object file Main.o, and the executable Main.

It's bad practice to mix build artifacts with source files, so let's use GHC's arguments (-hidir, -odir and -o) to store these artifacts in their own directory called bin.

generate_run: app/Main.hs
  @mkdir -p ./bin
  @ghc app/Main.hs -hidir ./bin -odir ./bin -o ./bin/hello

run: generate_run
  @./bin/hello

We can then add a third rule to "clean" our project. This would remove all binary files so that we can do a fresh recompilation if we want.

clean:
  rm -rf ./bin

For one final flourish in this section, we can use some variables. We can make one for the GHC compiler, referencing its absolute path instead of a symlink. This would make it easy to switch out the version if we wanted. We'll also add a variable for our bin directory and the hello executable, since these are used multiple times.

# Could easily switch versions if desired
# e.g. GHC = ~/.ghcup/ghc/9.4.4/bin/ghc
GHC = ~/.ghcup/ghc/9.2.5/bin/ghc
BIN = ./bin
EXE = ${BIN}/hello

generate_run: app/Main.hs
  @mkdir -p ${BIN}
  @${GHC} app/Main.hs -hidir ${BIN} -odir ${BIN} -o ${EXE}

run: generate_run
  @${EXE}

clean:
  rm -rf ./bin

And all this still works as expected!

$ make generate_run
[1 of 1] Compiling Main (app/Main.hs, bin/Main.o)
Linking ./bin/hello
$ make run
Hello
$ make clean
rm -rf ./bin

So we have some basic rules for our executable. But remember our goal is to depend on a library. So let's add a new rule to generate the library objects.

Generating a Library

For this step, we would like to compile src/MyStrings.hs and src/MyFunction.hs. Each of these will generate an interface file (.hi) and an object file (.o). We want to place these artifacts in a specific library directory within our bin folder.

We'll do this by means of a new rule, library, which will use our two source files as its prerequisites. It will start by creating the library artifacts directory:

LIB_DIR = ${BIN}/lib

library: src/MyStrings.hs src/MyFunction.hs
  @mkdir -p ${LIB_DIR}
  ...

But now the only thing we have to do is use GHC on both of our source files, using LIB_DIR as the destination point.

LIB_DIR = ${BIN}/lib

library: src/MyStrings.hs src/MyFunction.hs
  @mkdir -p ${LIB_DIR}
  @ghc src/MyStrings.hs src/MyFunction.hs -hidir ${LIB_DIR} -odir ${LIB_DIR}

Now when we run the target, we'll see that it produces the desired files:

$ make library
$ ls ./bin/lib
MyFunction.hi MyFunction.o MyStrings.hi MyStrings.o

Right now though, if we added a new source file, we'd have to modify the rule in two places. We can fix this by adding a variable that uses wildcard to match all our source files in the directory (src/*.hs).

LIB_DIR = ${BIN}/lib
SRCS = $(wildcard src/*.hs)

library: ${SRCS}
  @mkdir -p ${LIB_DIR}
  @${GHC} ${SRCS} -hidir ${LIB_DIR} -odir ${LIB_DIR}

While we're learning about wildcard, let's make another variable to capture all the produced object files. We'll use this in the next section.

LIB_OBJS = $(wildcard ${LIB_DIR}/*.o)
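If you're curious what $(wildcard src/*.hs) amounts to, here's a rough Haskell equivalent. This is only a sketch: the names onlyHaskell and haskellSources are hypothetical, and it relies on the directory and filepath packages that ship with GHC. The results are sorted here just to keep the output stable.

```haskell
module Main where

import Data.List (sort)
import System.Directory (listDirectory)
import System.FilePath (takeExtension, (</>))

-- Pure part: keep only the .hs entries and prefix them with the directory.
onlyHaskell :: FilePath -> [FilePath] -> [FilePath]
onlyHaskell dir entries =
  [dir </> f | f <- sort entries, takeExtension f == ".hs"]

-- IO part: list the directory, then filter. Assumes the directory exists.
haskellSources :: FilePath -> IO [FilePath]
haskellSources dir = onlyHaskell dir <$> listDirectory dir

main :: IO ()
main = haskellSources "src" >>= mapM_ putStrLn
```

Like the Make variable, this picks up new source files automatically, with no list to maintain by hand.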

So great! We're producing our library artifacts. How do we use them?

Linking the Library

In this section, we'll link our library code with our executable. We'll begin by assuming our Main file has gone back to its original form with imports, instead of the simplified form:

-- app/Main.hs
module Main where

import MyStrings (greeting)
import MyFunction (modifyString)

main :: IO ()
main = putStrLn (modifyString greeting)

When we try to run make generate_run, compilation fails because it cannot find the modules we're trying to import:

$ make generate_run
...
Could not find module 'MyStrings'
...
Could not find module 'MyFunction'

As we went over in the previous article, the general approach to compiling the Main module with its dependencies has two steps:

1. Compile with the -c option (to stop before the linking stage) using -i to point to a directory containing the interface files.

2. Compile the generated Main.o object file together with the library .o files to produce the executable.

So we'll be modifying our generate_run rule with some extra steps. First of course, it must now depend on the library rule. Then our first new command will be to copy the .hi files from the lib directory into the top-level bin directory.

generate_run: app/Main.hs library
  @mkdir -p ./bin
  @cp ${LIB_DIR}/*.hi ${BIN}
  ...

We could have avoided this step by generating the library artifacts in bin directly. I wanted to have a separate location for all of them though. And while there may be some way to direct the next command to find the interface files in the lib directory, none of the obvious ways worked for me.

Regardless, our next step will be to modify the ghc call in this rule to use the -c and -i arguments. The rest stays the same:

generate_run: app/Main.hs library
  @mkdir -p ./bin
  @cp ${LIB_DIR}/*.hi ${BIN}
  @${GHC} -i${BIN} -c app/Main.hs -hidir ${BIN} -odir ${BIN}
  ...

Finally, we invoke our final ghc call, linking the .o files together. At the command line, this would look like:

$ ghc ./bin/Main.o ./bin/lib/MyStrings.o ./bin/lib/MyFunction.o -o ./bin/hello

Recalling our LIB_OBJS variable from up above, we can fill in the rule in our Makefile like so:

LIB_OBJS = $(wildcard ${LIB_DIR}/*.o)

generate_run: app/Main.hs library
  @mkdir -p ./bin
  @cp ${LIB_DIR}/*.hi ${BIN}
  @${GHC} -i${BIN} -c app/Main.hs -hidir ${BIN} -odir ${BIN}
  @${GHC} ${BIN}/Main.o ${LIB_OBJS} -o ${EXE}

And now our program will work as expected! We can clean it and jump straight to the make run rule, since this will run its prerequisites make library and make generate_run automatically.

$ make clean
rm -rf ./bin
$ make run
[1 of 2] Compiling MyFunction (src/MyFunction.hs, bin/lib/MyFunction.o)
[2 of 2] Compiling MyStrings (src/MyStrings.hs, bin/lib/MyStrings.o)
elloH elloH
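To make the idea of "running prerequisites automatically" concrete, here's a toy model of it in Haskell. This is a sketch under heavy assumptions, not how GNU Make is actually implemented: it ignores timestamps and source-file prerequisites, it assumes the rules contain no cycles, and the names Rules and buildOrder are hypothetical.

```haskell
module Main where

import Data.List (foldl')
import Data.Maybe (fromMaybe)

-- Each target is paired with the targets it depends on.
type Rules = [(String, [String])]

-- The rule graph from our Makefile (source files omitted).
rules :: Rules
rules =
  [ ("library", [])
  , ("generate_run", ["library"])
  , ("run", ["generate_run"])
  , ("generate_test", ["library"])
  , ("test", ["generate_test"])
  ]

-- Depth-first walk: schedule every prerequisite before the target
-- itself, skipping anything that is already scheduled.
buildOrder :: Rules -> String -> [String]
buildOrder rs goal = reverse (go [] goal)
  where
    go done target
      | target `elem` done = done
      | otherwise =
          let prereqs = fromMaybe [] (lookup target rs)
           in target : foldl' go done prereqs

main :: IO ()
main = mapM_ putStrLn (buildOrder rules "run")
-- prints library, generate_run, run (one per line)
```

Asking for "run" yields library first, then generate_run, then run, which matches the order we see in the make run output above.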

So we've covered the library and an executable, but most Haskell projects have at least one test suite. So how would we implement that?

Adding a Test Suite

Well, a test suite is basically just a special executable. So we'll make another pair of rules, generate_test and test, that will mimic generate_run and run. Very little changes, except that we'll make another special directory within bin for our test artifacts.

TEST_DIR = ${BIN}/test
TEST_EXE = ${TEST_DIR}/run_test

generate_test: test/Spec.hs library
  @mkdir -p ${TEST_DIR}
  @cp ${LIB_DIR}/*.hi ${TEST_DIR}
  @${GHC} -i${TEST_DIR} -c test/Spec.hs -hidir ${TEST_DIR} -odir ${TEST_DIR}
  @${GHC} ${TEST_DIR}/Main.o ${LIB_OBJS} -o ${TEST_EXE}

test: generate_test
  @${TEST_EXE}

Of note here is that at the final step, we're still using Main.o instead of Spec.o. Since it's an executable module, it also compiles as Main.

But we can then use this to run our tests!

$ make clean
$ make test
[1 of 2] Compiling MyFunction (src/MyFunction.hs, bin/lib/MyFunction.o)
[2 of 2] Compiling MyStrings (src/MyStrings.hs, bin/lib/MyStrings.o)
Testing case: abcd: Correct!
Testing case: Hello: Correct!
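One limitation of this Spec program is that it only reports failures by printing, so the process exits successfully either way. Make judges a recipe by its exit status, so make test would still appear to pass. Here's a minimal sketch of a variant that exits with a failure code when any case fails. The check helper is a hypothetical name, and modifyString is copied inline only to keep the sketch self-contained.

```haskell
module Main where

import Control.Monad (unless)
import System.Exit (exitFailure)

-- Copied from src/MyFunction.hs so this sketch compiles on its own.
modifyString :: String -> String
modifyString x = base <> " " <> base
  where
    base = tail x <> [head x]

-- Hypothetical helper: prints the same messages as before and
-- returns True when the case passes.
check :: String -> String -> IO Bool
check input expected = do
  let actual = modifyString input
      ok = actual == expected
  putStr $ "Testing case: " <> input <> ": "
  if ok
    then putStrLn "Correct!"
    else putStrLn $ "Incorrect result! Expected: " <> expected <> " Actual: " <> actual
  pure ok

main :: IO ()
main = do
  results <- sequence [check "abcd" "bcda bcda", check "Hello" "elloH elloH"]
  -- A non-zero exit status is what tells make that the recipe failed.
  unless (and results) exitFailure
```

With this version, a failing case would cause make test to stop with an error instead of finishing quietly.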

So now we have all the different components we'd expect in a normal Haskell project. So it's interesting to consider how our makefile definition would compare against an equivalent .cabal file for this project.

Comparing to a Cabal File

Suppose we want to call our project HaskellMake and store its configuration in HaskellMake.cabal. We'd start our Cabal file with four metadata lines:

cabal-version: 1.12
name: HaskellMake
version: 0.1.0.0
build-type: Simple

Now our library would expose its two modules, using the src directory as its root. The only "dependency" is the Haskell base package. Finally, default-language is a required field.

library
  exposed-modules:
      MyStrings
    , MyFunction
  hs-source-dirs:
      src
  build-depends:
      base
  default-language: Haskell2010

The executable would similarly describe where the files are located and state a base dependency as well as a dependency on the library itself.

executable hello
  main-is: Main.hs
  hs-source-dirs:
      app
  build-depends:
      base
    , HaskellMake
  default-language: Haskell2010

Finally, our test suite would look very similar to the executable, just with a different directory and filename.

test-suite make-test
  type: exitcode-stdio-1.0
  main-is: Spec.hs
  hs-source-dirs:
      test
  build-depends:
      base
    , HaskellMake
  default-language: Haskell2010

And, if we add a bit more boilerplate, we could actually then compile our code with Stack! First we need a stack.yaml specifying the resolver and the package location:

# stack.yaml
resolver: lts-20.12
packages:
  - .

Then we need Setup.hs:

-- Setup.hs

import Distribution.Simple
main = defaultMain

And now we could actually run our code!

$ stack build
$ stack exec hello
elloH elloH
$ stack test
Testing case: abcd: Correct!
Testing case: Hello: Correct!

Now observant viewers will note that we don't use any Hackage dependencies in our code - only base, which GHC always knows how to find. It would require a lot of work for us to replicate dependency management. We could download a .zip file with curl easily enough, but tracking the whole dependency tree would be extremely difficult.

And indeed, many engineers have spent a lot of time getting this process to work well with Stack and Cabal! So while it would be a useful exercise to try to do this manually with a simple dependency, I'll leave that for a future article.

Comparing the two file definitions, the .cabal definition is undoubtedly more concise and human readable, but it hides a lot of implementation details. Most of the time, this is a good thing! This is exactly what we expect from tools in general; they should allow us to work more quickly without having to worry about details.

But there are times when we might, on our own time, want to try out a more adventurous path like we've done in this article, one that avoids relying too much on modern tooling. So why was this article a "useful exercise"™?

What's the Point?

So obviously, there's no chance this Makefile approach is suddenly going to supplant Cabal and Stack for building Haskell projects. Stack and Cabal are "better" for Haskell precisely because they account for the intricacies of Haskell development. In fact, by their design, GHC and Cabal both already incorporate some key ideas and features from GNU Make, especially with avoiding re-work through dependency calculation.
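The core of that dependency calculation, for targets that are real files, is a timestamp comparison: rebuild when the target is missing or older than one of its prerequisites. Here's a minimal Haskell sketch of that check. It is a simplification rather than Make's actual implementation; the names staleBy and isStale are hypothetical, and it uses the directory package that ships with GHC.

```haskell
module Main where

import System.Directory (doesFileExist, getModificationTime)

-- Pure core of the check: a missing target (Nothing) is always stale;
-- otherwise it is stale if any prerequisite is newer than it.
staleBy :: Ord t => Maybe t -> [t] -> Bool
staleBy Nothing _ = True
staleBy (Just targetTime) prereqTimes = any (> targetTime) prereqTimes

-- Read the timestamps from disk. Assumes every prerequisite exists;
-- getModificationTime throws an IOError otherwise.
isStale :: FilePath -> [FilePath] -> IO Bool
isStale target prereqs = do
  exists <- doesFileExist target
  targetTime <-
    if exists
      then Just <$> getModificationTime target
      else pure Nothing
  prereqTimes <- mapM getModificationTime prereqs
  pure (staleBy targetTime prereqTimes)

main :: IO ()
main = do
  stale <- isStale "bin/hello" ["app/Main.hs"]
  putStrLn (if stale then "rebuild needed" else "up to date")
```

Note that the rules in this article are named after actions (library, run, test) rather than the files they produce, so this kind of check doesn't get a chance to skip them.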

But there's a lot you can learn by trying this kind of exercise.

First of all, we learned about GNU Make. This tool can be very useful if you're constructing a program that combines components from different languages and systems. You could even build your Haskell code with Stack, but combine it with something else in a makefile.

A case in point for this is my recent work with Haskell and AWS. The commands for creating a Docker image, authenticating to AWS and deploying it are lengthy and difficult to remember. A makefile can, at the very least, serve as a rudimentary aliasing tool. You could run make deploy and have it automatically rebuild your changes into a Docker image and deploy that to your server.

But beyond this, it's important to take time to deliberately understand how our tools work. Stack and Cabal are great tools. But if they seem like black magic to you, then it can be a good idea to spend some time understanding what is happening at an internal level - like how GHC is being used under the hood to create our build artifacts.

Most of the fun in programming comes in effectively applying tools to create useful programs quickly. But if you ever want to make good tools in the future, you have to understand what's happening at a deeper level! At least a couple times a year, you should strive to go one level deeper in your understanding of your programming stack.

For me this time, it was understanding just a little more about GHC. Next time I might dive into dependency management, or a different topic like the internal workings of Haskell data structures. These kinds of topics might not seem immediately applicable in your day job, but you'll be surprised at the times when deeper knowledge will pay dividends for you.

Getting Better at Haskell

But enough philosophizing. If you're completely new to Haskell, going "one level deeper" might simply mean the practical ability to use these tools at a basic level. If your knowledge is more intermediate, you might want to explore ways to improve your development process. These thoughts can lead to questions like:

1. What's the best way to set up my Haskell toolchain in 2023?

2. How do I get more efficient and effective as a Haskell programmer?

You can answer these questions by signing up for my new free course Setup.hs! This will teach how to install your Haskell toolchain with GHCup and get you started running and testing your code.

Best of all, it will teach you how to use the Haskell Language Server to get code hints in your editor, which can massively increase your rate of progress. You can read more about the course in this blog post.

If you subscribe to our monthly newsletter, you'll also get an extra bonus - a 20% discount on any of our paid courses. This offer is good for two more weeks (until May 1) so don't miss out!

James Bowen 4/10/23

How to Make ChatGPT Go Around in Circles (with GHC and Haskell)

As part of my research for the recently released (and free!) Setup.hs course, I wanted to explore the different kinds of compilation commands you can run with GHC outside the context of a build system.

I wanted to know…

Can I use GHC to compile a Haskell module without its dependent source files?

The answer, obviously, should be yes. When you use Stack or Cabal to get dependencies from Hackage, you aren't downloading and recompiling all the source files for those libraries.

And I eventually managed to do it. It doesn't seem hard once you know the commands already:

$ mkdir bin
$ ghc src/MyStrings.hs src/MyFunction.hs -hidir ./bin -odir ./bin
$ ghc -c app/Main.hs -i./bin -hidir ./bin -odir ./bin
$ ghc bin/Main.o ./bin/MyStrings.o ./bin/MyFunction.o -o ./bin/hello
$ ./bin/hello
...

But, being unfamiliar with the inner workings of GHC, I struggled for a while to find this exact combination of commands, especially with their arguments.

So, like I did last week, I turned to the latest tool in the developer's toolkit: ChatGPT. But once again, everyone's new favorite pair programmer had some struggles of its own on the topic! So let's start by defining exactly the problem we're trying to solve.

The Setup

Let's start with a quick look at our initial file tree.

├── app
│   └── Main.hs
└── src
    ├── MyFunction.hs
    └── MyStrings.hs

This is meant to look the way I would organize my code in a Stack project. We have two "library" modules in the src directory, and one executable module in the app directory that will depend on the library modules. These files are all very simple:

-- src/MyStrings.hs
module MyStrings where

greeting :: String
greeting = "Hello"

-- src/MyFunction.hs
module MyFunction where

modifyString :: String -> String
modifyString x = base <> " " <> base
  where
    base = tail x <> [head x]

-- app/Main.hs
module Main where

import MyStrings (greeting)
import MyFunction (modifyString)

main :: IO ()
main = putStrLn (modifyString greeting)

Our goal is to compile and run the executable with two constraints:

1. Use only GHC (no Stack or Cabal involved)

2. Compile the library separately, so that the Main module could be compiled using only the library's build artifacts, and not the source files.

Trying to Compile

Now, there are two easy ways to compile this code if we're willing to violate our constraints (particularly the second one). If all three files are in the same directory, GHC can immediately find the modules we're importing, so we can just call ghc Main.hs.

└── src
    ├── Main.hs
    ├── MyFunction.hs
    └── MyStrings.hs

...

$ cd src
$ ghc Main.hs
$ ./Main
elloH elloH

If we keep the original file structure, it's still pretty easy if we're willing to violate constraint 2 by using the library source files in the ghc command:

$ ghc app/Main.hs src/MyStrings.hs src/MyFunction.hs
$ ./Main
elloH elloH

However, with my extra constraint of not using the library source files directly, things got a little more difficult. I originally thought "Hey, let's just include the build artifacts in the GHC call." Each library module generates an interface file (.hi) and an object file (.o). So let's try including those in the GHC call:

$ ghc src/MyStrings.hs src/MyFunction.hs
$ ls src/
MyFunction.hi MyFunction.hs MyFunction.o MyStrings.hi MyStrings.hs MyStrings.o
$ ghc app/Main.hs src/MyFunction.hi src/MyFunction.o src/MyStrings.hi src/MyStrings.o
...
Could not find module 'MyStrings'
...
Could not find module 'MyFunction'

As we can see, it doesn't work! Even though we're including the build artifacts as direct arguments of ghc, this doesn't help our main module to find them.

So, feeling a little too lazy to dig deep through GHC documentation, and feeling like this problem might have too many arbitrary constraints to phrase in a Google query, I decided to try our aforementioned new favorite tool.

Enter: ChatGPT

The First Session

(Conversation has been abbreviated for relevance)

How do I link existing .o files with a new source module using the 'ghc' command?

Chat GPT

To link existing object (.o) files with a new source module using the ghc command, you need to include the object files in the list of input files passed to the ghc command.

Assuming you have an existing .o file called example.o and a new source module called MyModule.hs, you can link them together into an executable program using the following command:

$ ghc -o MyProgram MyModule.hs example.o

This command tells ghc to compile and link the MyModule.hs source file with the example.o object file and create an executable program called MyProgram. The -o flag specifies the output file name.

Is it possible to use a ghc command to link a Haskell object file (.o) with a Haskell source file that imports the other module?

Chat GPT

Yes, it is possible to use the ghc command to link a Haskell object file (.o) with a Haskell source file that imports the other module. Here's an example:

Suppose you have two files, MyModule.hs and OtherModule.o, and MyModule.hs imports a module that is defined in OtherModule.o. To link the two files into an executable program, you can use the following command:

$ ghc -o MyProgram MyModule.hs OtherModule.o

This command tells ghc to compile and link the MyModule.hs source file with the OtherModule.o object file, and create an executable program called MyProgram.

Hmmm I don't think this works. I created the following file OtherModule.hs:

module OtherModule where

greeting :: String
greeting = "Hello"

And I'm trying to include it in another file:

module Main where

import OtherModule (greeting)

main :: IO ()
main = putStrLn greeting

I started by compiling OtherModule:

$ ghc src/OtherModule.hs

Then I moved OtherModule.o to the app directory (with MyModule.hs), and tried running this command:

$ cd app
$ ghc -o MyProgram MyModule.hs OtherModule.o

I get the error:

"Could not find module 'OtherModule'"

Chat GPT

I apologize for the confusion in my previous answer. You are correct that simply linking the object file with the executable does not work in this case.

When GHC compiles a module, it generates an interface file (.hi) that contains information about the module's name, type signatures, and other details. The compiled object file (.o) contains the compiled machine code for the functions defined in the module.

When you try to import a module in a source file, GHC looks for the interface file (OtherModule.hi) to find information about the module's name and type signatures. If the interface file is not found, GHC will raise an error.

In your case, when you compile OtherModule.hs, GHC generates an interface file called OtherModule.hi. To compile MyModule.hs, GHC needs to be able to find OtherModule.hi in order to resolve the import of OtherModule. Simply linking the OtherModule.o object file is not enough.

To make this work, you need to compile OtherModule.hs and MyModule.hs together, like this:

ghc -o MyProgram MyModule.hs src/OtherModule.hs

This command tells GHC to compile MyModule.hs and OtherModule.hs together, and create an executable program called MyProgram.

I hope this clears up the confusion. Again, I apologize for the confusion in my previous answer.

Ah, so I need to include both source files, and it can't be done by linking an object file with another source file?

Chat GPT

Yes, that's correct. When you import a module in a source file, GHC needs to be able to find the interface file (.hi) for that module in order to resolve the import. If the interface file is not found, GHC will raise an error.

When you compile a source file with GHC, it generates both an object file (.o) and an interface file (.hi). The object file contains the compiled machine code for the functions defined in the module, while the interface file contains information about the module's name, type signatures, and other details.

When you compile a source file that imports another module, GHC needs to be able to find the interface file for that module in order to compile the importing module. If the interface file is not found, GHC will raise an error.

So in your case, you need to compile both MyModule.hs and OtherModule.hs together so that GHC can generate the necessary interface files and link the resulting object files together to produce the executable program.

It is not possible to directly link an object file with a source file that imports it, as GHC needs the interface file to be present in order to compile the importing module.

Analysis

Suspiciously, ChatGPT suggests a less detailed command than the previous one that failed for me!

$ ghc -o MyProgram MyModule.hs OtherModule.o

It includes only the object file, instead of the object and interface file. And, of course this fails, like I showed above.

Now there's an interesting statement that will come into play later, though I didn't realize it!

Yes, that's correct. When you import a module in a source file, GHC needs to be able to find the interface file (.hi) for that module in order to resolve the import. If the interface file is not found, GHC will raise an error.

I assumed that to "find" the interface file meant providing it on the command line (as in my original command). So with that idea having failed, I fixated on one part of the final statement:

It is not possible to directly link an object file with a source file that imports it...

And concluded that getting the right information from ChatGPT might be a lost cause. It seemed like it was sure this was possible, and then just as sure that it was impossible. But details matter, as we'll see!

Working Around It

So feeling slightly defeated, I decided to poke around the directories where Cabal actually saves packages to see if there was a pattern I could follow. And I managed to cobble together a convoluted workaround.

What I observed was that GHC could use any module from a library in the Cabal package database. On my system's setup, this could be found in the .ghcup cache:

$ cd ~/.ghcup/ghc/9.2.5/lib/ghc-9.2.5/
$ ls
array-0.5.4.0/
base-4.16.4.0/
...
package.conf.d/
$ ls package.conf.d
array-0.5.4.0.conf
base-4.16.4.0.conf
...

And indeed, the package directories did not have source files, only various forms of compiled object and interface files. And each package had a corresponding .conf file in the package.conf.d directory.

So I could...

1. Make my own directory in the package database (MyPackage-0.1.0.0)
2. Place the .hi and .o files in there
3. Create a phony MyPackage-0.1.0.0.conf and put it in package.conf.d. (Base it on an existing .conf file but make some substitutions)

And this...technically...works.

But manually manipulating a package database wasn't the solution I had in mind. Surely there's something easier, right?

The Second Session

After a week of sitting on this, I decided to come back to ChatGPT and see if it had any other ideas. I decided to frame my question in a different way, stating a lot more details up front. Lo and behold, it had a different answer this time!

Suppose I have a Haskell module called MyStrings:

module MyStrings where

greeting :: String
greeting = "Hello"

I can compile this module:

$ ghc MyStrings.hs

This generates an interface file, MyStrings.hi, and an object file, MyStrings.o.

Now suppose I have an executable file, Main.hs, that imports MyStrings:

module Main where

import MyStrings (greeting)

main :: IO ()
main = putStrLn greeting

Can I use ghc to compile this Main file by using the .hi and .o files, without including the imported source file, MyStrings.hs in the command?

Chat GPT

Yes, you can use the .hi and .o files generated by compiling MyStrings.hs to compile Main.hs without including MyStrings.hs in the command. To do so, you can use the -c option to tell GHC to compile Main.hs to an object file, and then use the -o option to link the object files together into an executable:

$ ghc -c Main.hs
$ ghc Main.o MyStrings.o -o myprogram

The first command compiles Main.hs to an object file Main.o, without attempting to link it. The second command links Main.o with MyStrings.o and creates an executable called myprogram. The .hi files are not needed for linking, so you don't need to include them in the command.

Note that in order for this to work, the interface file MyStrings.hi must be present in the same directory as Main.hs when you compile it. If you move MyStrings.hi to a different directory, you'll need to use the -i option to tell GHC where to find it:

$ ghc -i/path/to/MyStrings -c Main.hs
$ ghc Main.o /path/to/MyStrings/MyStrings.o -o myprogram

Here, -i/path/to/MyStrings tells GHC to look for MyStrings.hi in the directory /path/to/MyStrings.

Analysis

So there are two key ingredients that weren't mentioned at all in our first pass: the -c option and the -i option.

Using -c performs compilation without linking. With this option, GHC is able to produce an object file for our Main module without needing its dependent objects. So .o files are unnecessary in this stage.

We still need the .hi files though. But instead of providing them on the command line, we use the -i argument to supply them. It's an odd argument, because we put the path right after the i, without any spacing.

After we're done with the first phase, then we can link all our object files together.

Solving It

And sure enough, this approach works!

$ ghc src/MyStrings.hs src/MyFunction.hs
$ ghc -c app/Main.hs -i./src
$ ghc app/Main.o ./src/MyStrings.o ./src/MyFunction.o -o hello
$ ./hello
elloH elloH

And if we want to be a little cleaner about putting our artifacts in a single location, we can use the -hidir and -odir arguments for storing everything in a bin directory.

$ mkdir bin
$ ghc src/MyStrings.hs src/MyFunction.hs -hidir ./bin -odir ./bin
$ ghc -c app/Main.hs -i./bin -hidir ./bin -odir ./bin
$ ghc bin/Main.o ./bin/MyStrings.o ./bin/MyFunction.o -o ./bin/hello
$ ./bin/hello
elloH elloH

And we're done! Our program is compiling as we wanted it to, without our "Main" compilation command directly using the library source files.
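For what it's worth, the last two commands can also be driven from Haskell itself. Here's a minimal sketch that builds the same argument lists and hands them to ghc with callProcess from the process package (which ships with GHC). It assumes ghc is on your PATH and that the library artifacts are already in ./bin; the names compileArgs and linkArgs are hypothetical.

```haskell
module Main where

import System.Process (callProcess)

-- Stage 1: compile the main module only (-c, no linking), reading
-- the library's .hi files from artifactDir via -i.
compileArgs :: FilePath -> FilePath -> [String]
compileArgs artifactDir mainSrc =
  ["-c", mainSrc, "-i" <> artifactDir, "-hidir", artifactDir, "-odir", artifactDir]

-- Stage 2: link the object files into an executable.
linkArgs :: [FilePath] -> FilePath -> [String]
linkArgs objs exe = objs <> ["-o", exe]

main :: IO ()
main = do
  -- callProcess throws an exception if ghc exits with a failure code.
  callProcess "ghc" (compileArgs "./bin" "app/Main.hs")
  callProcess "ghc" (linkArgs ["bin/Main.o", "./bin/MyStrings.o", "./bin/MyFunction.o"] "./bin/hello")
```

Keeping the argument lists as pure functions makes it easy to see that -i is glued directly to its path while -hidir and -odir take theirs as separate arguments.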

Conclusion

So with that fun little adventure concluded, what can we learn from this? Well first of all, prompts matter a great deal when you're using a chatbot. The more detailed your prompt, and the more you spell out your assumptions, the more likely you'll get the answer you're looking for. My second prompt was waaay more detailed than my first prompt, and the solution was much better as a result.

But a more pertinent lesson for Haskellers might be that using GHC by itself can be a big pain. So if you're a beginner, you might be asking:

What's the normal way to build Haskell Code?

You can learn all about building and running your Haskell code in our new free course, Setup.hs. This course will teach you the easy steps to set up your Haskell toolchain, and show you how to build and run your code using Stack, Haskell's most popular build system. You'll even learn how to get Haskell integrations in several popular code editors so you can learn from your mistakes much more quickly. Learn more about it on the course page.

And if you subscribe to our monthly newsletter, you'll get a code for 20% off any of our paid courses until May 1st! So don't miss out on that offer!