Hpack: A Simpler Package Format

In the last few weeks, we've gone through the basics of Cabal and Stack. These are two of the most common package managers for Haskell. Both of these programs help us manage dependencies for our code and compile it. Both programs use the .cabal file format, as we explored in an earlier article.

But .cabal files have some weaknesses, as we'll explore. Luckily, there's another tool out there called Hpack. With this tool, we'll use a different file format for our project, in a file called package.yaml. We'll run the hpack command, which will read the package file and generate the .cabal file. In this article, we'll explore how this program works.

In our free Stack mini-course, you'll learn how to use Stack as well as Hpack! If you're new to Haskell, you can also read our Liftoff series to brush up on your skills!

Cabal File Issues

One of the first weaknesses with the .cabal file is that it uses its own unique format. It doesn't use something more common like XML, JSON, YAML, or Markdown. So there's a small learning curve when it comes to questions of format. For instance, what are good indentation practices? What is the "correct" way to make a list of things? When are commas necessary, or not? And if, for whatever reason, you want to parse whatever is in your package file, you'll need a custom parser.

When using Hpack, we'll still have a package file, package.yaml. But this file uses a YAML format. So if your previous work has involved YAML files, that knowledge is more transferable. And if you haven't yet, it's likely you will use YAML at some point in the future. Plus every major language can parse YAML with ease.

If you're making a project with many executables and tests, you'll also find your .cabal file has a lot of duplication. You'll need to repeat certain fields for each section. Different executables could have the same GHC options and language extensions. The different sections will also tend to have a lot of dependencies in common.

In the rest of this article, we'll see how Hpack solves these problems. But first, we need to get it up and running.

Installing and Using Hpack

The Hpack program is an executable you can get from Stack. Within your project directory, you just need to run this command:

stack install hpack

After this, you should be able to run the hpack command anywhere on your system. If you run it in any directory containing a package.yaml file, the command will use that to generate the .cabal file. We'll explore this package file format in the next section.

When using Hpack, you generally do not commit your .cabal file to the GitHub repository. Instead, put it in .gitignore. Your README should clarify that users need to run hpack the first time they clone the repository.

As an extra note, Hpack is so well thought of that the default Stack template will include package.yaml in your starter project! This saves you from having to write it from scratch.

Package File

But how is this file organized? We haven't eliminated the work of writing a package file; we've just moved it from the .cabal file to the package.yaml file. The new file has a very similar structure to the .cabal file, with a few simplifications, as we'll see. To start, we have a metadata section at the top which is almost identical to that in the .cabal file.

name: MyHpackProject
version: 0.1.0.0
github: jhb563/MyHpackProject
license: MIT
author: "James Test"
maintainer: "james@test.com"
copyright: "Monday Morning Haskell 2020"

extra-source-files:
  - README.md

These lines get translated almost exactly. Various other fields get default values. One exception is that the github repository name will give us a couple extra links for free in the .cabal file.

-- Generated automatically in MyHpackProject.cabal!
homepage: https://github.com/jhb563/MyHpackProject#readme
bug-reports: https://github.com/jhb563/MyHpackProject/issues

source-repository head
  type: git
  location: https://github.com/jhb563/MyHpackProject

After the metadata, we have a separate section for global items. These include things such as dependencies and GHC options. We write these as top level keys in the YAML. We'll see how these factor into our generated file later!

dependencies:
  - base >=4.9 && <4.10

ghc-options:
  - -Wall

Now we get into individual sections for the different elements of our package. But these sections can be much shorter than they are in the .cabal file! For the library portion, we can get away with only listing the source directory!

library:
  source-dirs: src

This simple description gets translated into the library section of the .cabal file:

library
  exposed-modules:
      Lib
  other-modules:
      Paths_MyHpackProject
  hs-source-dirs:
      src
  build-depends:
      base >=4.9 && <4.10
  default-language: Haskell2010

Note that Paths_MyHpackProject is an auto-generated module. Cabal creates it during the build process (Stack builds through Cabal under the hood). This is one of a few different parts of this section that Hpack generates for us.

Executables are a bit different in that we group them all together in a single key. We use the top level key executables and then have a separate sub-key for each different binary. These can have their own dependencies and GHC options.

executables:
  run-project-1:
    main: Run1.hs
    source-dirs: app
    ghc-options:
      - -threaded
    dependencies:
      - MyHpackProject
  run-project-2:
    main: Run2.hs
    source-dirs: app
    dependencies:
      - MyHpackProject

From this, we'll get two different executable sections in our .cabal file! Note that these inherit the "global" dependency on base and the GHC option -Wall.

executable run-project-1
  main-is: Run1.hs
  other-modules:
      Paths_MyHpackProject
  hs-source-dirs:
      app
  build-depends:
      MyHpackProject
    , base >=4.9 && <4.10
  ghc-options: -Wall -threaded
  default-language: Haskell2010

executable run-project-2
  ...

Test suites function in much the same way as executables. You'll just want a separate top-level tests key after your executables, with one sub-key per test suite.
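
As a rough sketch of what that could look like, here is a tests key for the project above. The suite name MyHpackProject-test, the test directory, and the Spec.hs entry point are assumptions made for illustration (they follow the layout Stack's default template tends to use), not something defined earlier in this article.

```yaml
# Hypothetical test suite; the name, directory, and main file are assumed.
tests:
  MyHpackProject-test:
    main: Spec.hs
    source-dirs: test
    ghc-options:
      - -threaded
    dependencies:
      - MyHpackProject
```

Like the executables, this suite should pick up the global base dependency and the -Wall flag, and Hpack should emit a matching test-suite section in the .cabal file.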

Module Inference

So far we've saved ourselves from writing a bit of non-intuitive boilerplate. But there are more gains to be had! One annoyance of the .cabal file is that you will see errors or warnings if any of your modules aren't listed. So when you make a new module, you always have to update the .cabal file!

Hpack fixes this issue for us by inferring the layout of our modules! Notice how we made no mention of the individual modules in package.yaml above. But they still appeared in the .cabal file. If we don't specify, Hpack will search our source directory for all Haskell source files. It will assume they all go under exposed-modules. So even if we have a few more files, everything gets listed with the same basic description of our library.

-- Files in source directory
-- src/Lib.hs
-- src/Parser.hs
-- src/Router.hs
-- src/Internal/Helpers.hs

...
-- Hpack Library Section
library:
  source-dirs: src

-- Cabal File Library Section
library
  exposed-modules:
      Internal.Helpers
      Lib
      Parser
      Router
  other-modules:
      Paths_MyHpackProject
  ...

Hpack also takes care of alphabetizing our modules!

There are, of course, times when we don't want to expose all our modules. In this case, we can list the modules that should remain as "other" in our package file. The rest still appear under exposed-modules.

-- Package File
library:
  source-dirs: src
  other-modules:
    - Internal.Helpers

-- Cabal File
library
  exposed-modules:
      Lib
      Parser
      Router
  other-modules:
      Internal.Helpers
  ...

If you want the external API to be more limited, you can also explicitly list the exposed modules. Hpack infers that the rest fall under "other".

-- Package File
library:
  source-dirs: src
  exposed-modules:
    - Lib

Remember that you still need to run the hpack command when you add a new module! Otherwise there's no update to the .cabal file. This habit takes a little while to learn but it's still easier than editing the file each time!

Reducing Duplication

There's one more area where we can get some de-duplication of effort. This is in the use of "global" values for dependencies and compiler flags.

Normally, the library, executables and test suites must each list all their dependencies and the options they need. So for example, we might find that all our elements use a particular version of the base and aeson libraries, as well as the -Wall flag.

library
  ghc-options: -Wall
  build-depends:
      base >=4.9 && <4.10
    , aeson
  ...

executable run-project-1
  ghc-options: -Wall -threaded
  build-depends:
      MyHpackProject
    , aeson
    , base >=4.9 && <4.10
  ...

With Hpack, we can simplify this by creating global values for these. We'll add dependencies and ghc-options as top level keys in our package file. Then each element can include its own dependencies and options as needed. The following will produce the same .cabal file output as above.

dependencies:
  - base >=4.9 && <4.10
  - aeson

ghc-options:
  - -Wall

library:
  source-dirs: src

executables:
  run-project-1:
    ghc-options:
      - -threaded
    dependencies:
      - MyHpackProject
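
Language extensions, mentioned earlier as another source of repetition, can be shared the same way. This is a minimal sketch assuming Hpack's default-extensions field; the two extensions listed are picked purely for illustration.

```yaml
# Top-level key: applies to the library and every executable and test suite.
# The extensions named here are only examples.
default-extensions:
  - OverloadedStrings
  - ScopedTypeVariables

library:
  source-dirs: src
```

Each generated section should then carry its own default-extensions entry, so individual source files no longer need the matching LANGUAGE pragmas.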

Conclusion

Hpack isn't a cure-all. We've effectively replaced our .cabal file with the package.yaml file, so at the end of the day we still have to put some effort into our package management process. Still, Hpack saves a good amount of the duplicated and manual work we would need to do if we were using the .cabal file by itself. The one catch is that you need to remember when to run hpack, or it can get frustrating. Whenever you make a change that would alter the .cabal file, such as adding a new module or build dependency, you need to re-run the command!

Next week, we'll start looking at Nix, another popular package manager among Haskellers!
