Contributing to GHC 4: Real Issues

In this series, we’ve taken a good look at GHC. We started by looking at the steps we would need to prepare our local machine for GHC development. This was an especially difficult process on Windows, so we focused there. After that, we looked at the basic way of creating a development cycle for ourselves. We validated that by changing an error message and seeing how it appeared in the compiler. In part 3, we made some more complicated changes. In this final part, we’re going to wrap this series up by looking at some basic ways of making contributions.

As a reminder, there's a wide world of fun Haskell outside of GHC as well! Download our Production Checklist and read our Haskell Web Series for some more ideas!

Documentation

Documentation is a tricky thing on any software project. At any given moment, most of the effort is going into making sure the program works as it ought to. When you understand the code already, you don’t need to look at the documentation. So the temptation is to not change any of the comments. This means documentation is always likely to fall out of date. Haskell, if anything, is more prone to this kind of lapse. We look for issues by making changes, compiling, and seeing what breaks. And documentation never breaks!

Experienced developers will remember to change documentation more. Still though, it’s inevitable that something will slip through the cracks. But there's good news for us as newcomers to the GHC code base! We’re in the best position to find holes in the documentation, since we’re the ones who need to read it most! This is how I found the first contribution I could make.

While exploring the lexing types, I found a comment that didn’t quite make sense. At the top of compiler/basicTypes/BasicTypes.hs, it states:

-- There is considerable overlap between the logic here and the logic
-- in Lexer.x, but sadly there seems to be way to merge them.

That doesn’t quite read right. From the context, it seems pretty clear that the author intended to write “there seems to be no way to merge them”. Great, so let’s submit a pull request for this! We’ll fork the repository, clone our fork, make the fix on a new branch, and open a pull request against master.

Now there’s a somewhat annoying issue with the fact that the CI builds don’t actually seem to be passing right now. But hopefully this PR will get merged in at some point.

Issue Tracking with Trac

Of course, there are also much more complicated issues at stake with GHC. There are the real features we want to add to the codebase, and the bugs we want to fix! To take a look at what’s going on there, you’ll need to look at the issue tracker. GHC uses Trac for this, and you can observe all the issues on that list. They have labels based on what release they’re for, and how important they are.

It can be quite an overwhelming list. I scrolled through many different tickets and wasn’t sure what I could actually help with. So how can you find something to start out with? First, you can subscribe to the GHC devs mailing list. Conversations there will help you find what people are working on. Second, you can log onto Freenode and get onto the #ghc channel. You can ask anyone what’s going on and where you might help. Luckily, there is also a tag for “newcomers” on the list of issues. These are issues that the GHC devs have flagged as approachable for people new to the codebase. Let’s take a look at one of these issues.

Looking at a Real Issue: Infix Patterns

From this hunt, I found this ticket, related to the infix value of (->). The ticket claims that the stated infix level of 0 for the arrow operator is actually incorrect. Let’s take a look at what they mean.

As a reminder, the infix level states an operator's priority when determining order of operations. For instance, the multiplication operator (*) has a higher infix level than the addition operator (+). We can confirm this information with a quick ghci session by using the :info command on each of these.

>> :i (+)
...
infixl 6 +
>> :i (*)
...
infixl 7 *
>> 5 + 2 * 3
11 -- Would be 21 if addition were higher precedence

Now, when two operators have the same infix level, we look at the direction of the infix declaration, also called its associativity. As an example, we can compare subtraction to addition. We’ll find it's also infixl 6. Since it’s infixl (as opposed to infixr), we give the left side operation priority. Here’s an example.

>> :i (-)
...
infixl 6 -
>> 5 - 2 + 18
21 -- Not (-15)
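As a rough illustration of both rules, here is a small sketch that declares its own operators with explicit infix levels. The names (|+|), (|*|) and (|-|) are made up for this example and are not part of any library. The first two swap the usual precedence of addition and multiplication, and the third is a right-associative subtraction.

```haskell
module Main where

-- Hypothetical operators, defined only for this illustration.
-- (|+|) is addition, but declared to bind tighter than (|*|).
infixl 7 |+|
infixl 6 |*|

-- Subtraction with the direction flipped from infixl to infixr.
infixr 6 |-|

(|+|) :: Int -> Int -> Int
(|+|) = (+)

(|*|) :: Int -> Int -> Int
(|*|) = (*)

(|-|) :: Int -> Int -> Int
(|-|) = (-)

main :: IO ()
main = do
  print (5 + 2 * 3)      -- 11: (*) is level 7, so it groups first
  print (5 |+| 2 |*| 3)  -- 21: here the addition is level 7 instead
  print (10 - 4 - 3)     -- 3:  infixl groups as (10 - 4) - 3
  print (10 |-| 4 |-| 3) -- 9:  infixr groups as 10 - (4 - 3)
```

Note that the sketch never mixes (|-|) with (-) or (+) in one expression. Those operators share level 6 but have different directions, and GHC rejects such an expression as ambiguous.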

So let’s look at our arrow operator, which we use when defining our type signatures:

>> :i (->)
data (->) (a :: TYPE q) (b :: TYPE r) -- Defined in `GHC.Prim`
infixr 0 `(->)`
...

This suggests an infix level of 0 for this operator, and that we should prioritize things on the right. However, the person filing the bug suggests the following code:

{-# LANGUAGE TypeOperators #-}

module Bug where

import Data.Type.Equality

type (~>) = (->)
infixr 0 ~>

f :: (a ~> b -> c) :~: (a ~> (b -> c))
f = Refl

There’s a lot going on here with some higher level concepts, so let’s break it all down. First, (->) is a type operator: it is itself a type constructor that takes two types and produces the function type between them. Thus we can create a type synonym for it called (~>). Then we can assign this new operator whatever infix level we like. In this case, we’ll choose the same stated infix level as we have for the original operator, infixr 0.

The next part creates an expression f. Its type signature uses the (:~:) operator, which expresses propositional equality between types. Its only constructor is Refl, and Refl only typechecks when both sides of (:~:) are the same type. The only thing you need to understand is that each of our arrow patterns ((a ~> b -> c) and (a ~> (b -> c))) is a type. And this code should only compile if those types are the same.
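To get a feel for Refl in isolation, here is a minimal sketch using only the plain (->) arrow. The names arrowProof and assocProof are invented for this example. Both compile because the two sides of (:~:) really are the same type, and castWith (from Data.Type.Equality) shows that such a proof can be used to convert a value from one side to the other.

```haskell
{-# LANGUAGE TypeOperators #-}

module Main where

import Data.Type.Equality ((:~:) (Refl), castWith)

-- The infix arrow and the prefix arrow are the same type,
-- so Refl is accepted here.
arrowProof :: (Int -> Bool) :~: ((->) Int Bool)
arrowProof = Refl

-- (->) groups to the right, so these two spellings are also
-- the same type.
assocProof :: (a -> b -> c) :~: (a -> (b -> c))
assocProof = Refl

-- By contrast, this would be rejected by the type checker,
-- because the two sides are different types:
--
-- badProof :: (a -> b -> c) :~: ((a -> b) -> c)
-- badProof = Refl

main :: IO ()
main = print (castWith arrowProof even 4)
```

The commented-out badProof is the same kind of mismatch that the ticket's example relies on to expose how the compiler groups the arrows.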

And on the face of it, these types should be the same. After all, both operators purport to be infixr 0, meaning the way we parenthesize it on the right side of (:~:) should match how it is naturally ordered. But the code does not compile!

>> ghci
>> :l Bug.hs
Bug.hs:11:5: error:
    * Couldn't match type `a` with `a ~> b`
      `a` is a rigid type variable bound by
        f :: forall a b c. ((a ~> b) -> c) :~: (a ~> (b -> c))
        at Bug.hs:10:1-38
      Expected type: ((a ~> b) -> c) :~: (a ~> (b -> c))
        Actual type: ((a ~> b) -> c) :~: ((a ~> b) -> c)
    * In the expression: Refl
      In an equation for `f`: f = Refl
    * Relevant bindings include
      f :: ((a ~> b) -> c) :~: (a ~> (b -> c))
        (bound at Bug.hs:11:1)
   |
11 | f = Refl
   |

We can see on the “Actual type” line how the compiler interprets (a ~> b -> c). It gives priority to the left, not the right. Indeed, if we change the type signature to reflect priority given to (~>), our code will compile:

f :: (a ~> b -> c) :~: ((a ~> b) -> c)
f = Refl
...
>> ghci
>> :l Bug.hs
Ok, one module loaded.

The Fix

The fix, luckily for us, has already been proposed in the ticket. The compiler represents the infix level of our operators using the Fixity type. We can see a particular location where we’ve defined the level for some of our built-in operators:

negateFixity, funTyFixity :: Fixity
negateFixity = Fixity NoSourceText 6 InfixL -- Fixity of unary negate
funTyFixity = Fixity NoSourceText 0 InfixR -- Fixity of `->`

We want to change the fixity of the function type operator. Instead of it appearing to be 0, we should make it appear to be -1, showing the lower precedence of this operator. Note that this code only affects how we report the fixity. The actual reasons why the arrow ends up having lower priority are more complicated. But let’s make that change:

funTyFixity = Fixity NoSourceText (-1) InfixR
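To see why -1 is the honest number to report, here is a simplified model of how two fixities decide the grouping of an expression like x op1 y op2 z. This is only a sketch: the Fixity type below drops the source-text field, and the Grouping type and the grouping function are invented for this example rather than taken from GHC's source.

```haskell
module Main where

-- Simplified stand-ins for the compiler's fixity types (illustrative only).
data Direction = InfixL | InfixR | InfixN
  deriving (Eq, Show)

data Fixity = Fixity Int Direction
  deriving (Eq, Show)

-- How `x op1 y op2 z` ends up parenthesized.
data Grouping = GroupLeft | GroupRight | Ambiguous
  deriving (Eq, Show)

-- Compare the fixity of op1 (first argument) with op2 (second argument).
grouping :: Fixity -> Fixity -> Grouping
grouping (Fixity p1 d1) (Fixity p2 d2) =
  case compare p1 p2 of
    GT -> GroupLeft   -- op1 binds tighter: (x op1 y) op2 z
    LT -> GroupRight  -- op2 binds tighter: x op1 (y op2 z)
    EQ -> case (d1, d2) of
      (InfixL, InfixL) -> GroupLeft
      (InfixR, InfixR) -> GroupRight
      _                -> Ambiguous

-- The user's (~>) operator from the ticket.
userArrow :: Fixity
userArrow = Fixity 0 InfixR

-- What :info used to claim for (->).
reportedArrow :: Fixity
reportedArrow = Fixity 0 InfixR

-- What the observed behavior suggests for (->).
effectiveArrow :: Fixity
effectiveArrow = Fixity (-1) InfixR

main :: IO ()
main = do
  -- a ~> b -> c, if (->) really were infixr 0: a ~> (b -> c)
  print (grouping userArrow reportedArrow)
  -- a ~> b -> c, with (->) below every user operator: (a ~> b) -> c
  print (grouping userArrow effectiveArrow)
```

Under this model, the first call yields GroupRight and the second yields GroupLeft. The second matches the “Actual type” line in the error message above, which is why reporting -1 describes the real behavior better than 0.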

Testing Our Change

This seems like it should be a simple change to test. First, we’ll rebuild our code with make. Then we’ll boot up GHCi and ask for info on (->). But this doesn’t appear to work when we try it!

> make
> ghci
...
>> :i (->)
data (->) (a :: TYPE q) (b :: TYPE r) -- Defined in `GHC.Prim`
infixr 0 `(->)`
...

The issue here is that rebuilding does not cause the ghci command to use our new locally built version of GHC. Even when using ghci.exe from within the ghc/inplace/bin directory, it still doesn’t account for this change. The way around this is that instead of using ghci, we can pass the --interactive flag to a normal call to ghc. So we’ll want something like this:

~/ghc/inplace/bin/ghc-stage2.exe -o prog --interactive Main.hs

This will bring up a GHCi prompt that loads our main module. And now when we go ahead and get info, we’ll see that it works!

> ~/ghc/inplace/bin/ghc-stage2.exe -o prog --interactive Main.hs
...
>> :i (->)
data (->) (a :: TYPE q) (b :: TYPE r) -- Defined in `GHC.Prim`
infixr -1 `(->)`
...

So I’ll now make a simple pull request addressing this bug. You can follow the progress here. I’ll update this post as it moves further along in the process.

Conclusion

This wraps up our series on contributing to GHC! There are a lot of bugs out there, so don’t be afraid to take a look at anything labeled as newcomer. Just make sure to take a look at the discussion that’s occurred already on the ticket!

To learn more about Haskell, you can read our Liftoff Series (for beginners) or our Haskell Web Series if you’re already familiar with the language. You can also download our Haskell Beginners Checklist to get started! Or you can look at our Production Checklist if you want some ideas for more advanced projects.