2023-07-30T12:00:00

Software I Love: Gerrit


Like almost all software engineers a large proportion of my time is spent reviewing other people's code. Perhaps because I spent so much of my work life - and therefore my life in general - on reviews I have some pretty strong opinions on them. Those will have to wait for another post however, today I want to talk about code review tools and specifically Gerrit.

Over my so far relatively short career I've used a few different tools for code review: Review Board, GitLab, Github, and Gerrit.

All but the first one with Git as the version control system backing it. I hope this gives me a fair overview of the tools available for code review. Gerrit really stands out to me because it encourages you to do things which are considered best practice - atomic commits, detailed commit messages, breaking tasks up etc.

You have probably guessed that I do not think that is true of the other tools I list, so let us start there. Given the Github style review system is the one most people are familiar with I will focus on that, but as far as I remember it applies equally to GitLab.

A code review in Github

You have just finished explaining the ins and outs of your codebase to a new engineer and they're ready to submit their first small fix. A first review is a really important point in the on-boarding process: it sets standards - whether they are code style, quality or politeness - and it is probably the first piece of satisfaction they will have in their new job.

It is important then to hit the ground running, then. You get them to run git config --global user.email and ask them to open a pull request.

The pull request screen

The first thing you are hit with when opening a pull request is a massive text box where you are invited to describe your changes. I've so far had two experiences of the PR model and they are:

  1. 70% (a completely made up figure) of PRs have nothing written here as it is accurately described in the ticket or in the commit message.
  2. I'm forced to go through a massive template with various check boxes, description fields and questions that as a reviewer I don't bother to read because looking at the code is almost always quicker.

Perhaps I've just been unlucky but I wouldn't be surprised if this was a common occurrence. I think a big problem with the PR model is it encourages either contributions that have no context or contributions with too much context. It might be my suspicion of vendor lock-in but ideally the context for your changes should live in either the commit message or in documentation around your code - not in some other company's database!

First review

Like most or all services you get a side-by-side view of the changes and comments can be added by the little blue plus that appears on hover. Little being an operative word because when I first started using GitHub it took me a while to find it. It also won't stick around for a screenshot so I'm I afraid I can't show you it.

In any case let's say the reviewer finds a couple of issues they want fixing and drops comments saying as much.

Responding to the review

When doing reviews there are usually a couple of options on how to commit after making a change:

  1. Amend the original commit(s) and force push
  2. Add an extra commit with a title along the lines of "addressed review comments".

The former would be my preferred choice but unfortunately it's a terrible experience for the reviewer on GitHub. When you do a force push it is helpfully called out by the UI and you at least some of the time get a "See changes" button. Unfortunately I've found this feature to be very hit and miss, often seeing 404 style pages. Not only that but there is no easy way to see the history - i.e. a list of force pushes. What if the reviewer decides the old code was better and we want to go back? Hopefully you still have it in your reflog but enjoying working out which entry to go back to!

So that leaves extra commits. These provide a nicer experience for the reviewer and arguably the reviewee too. Unfortunately in the long-term they're pretty bad for your repository's health. You either merge all those commits in and have the history peppered with tiny formatting fixes, or you squash all the commits and lose any separation.

A code review in Gerrit

No software is without problems, so the first step of using Gerrit is probably a senior having to explain a tonne of stuff. I do think this is generally down to the ideas of Gerrit being a bit trickier to grasp, but a lot of it is due to unfamiliarity. Almost all engineers will have used to GitHub or a similar service, I doubt 5% will have used Gerrit.

The good news is that opening a request for a review (a PR if you like) is just a single git push. Granted, that git push is a bit weird:

git push HEAD:refs/for/master

but you can usually ignore the strange bit tacked on the end. I think the lack of a UI and having to fill in a description here is good. It encourages you to include the context required to either be in the commit message or in the documentation that lives in your repository alongside the code - which I'm sure we all have ;)

First review

One of my annoying habits - at least to people who watch me use a computer - is when reading I'll click around and select text as I read. If you do this on a diff in Gerrit you will get the helpful suggestion to press 'c' to comment. This highlights two nice things about Gerrit:

  1. The keyboard shortcuts are second to none.
  2. You can comment on anything you can select.

Number two really is great - it means it is obvious how to comment on multiple lines as well as pick a single word - e.g. if it is a misspelt word in a comment.

The first "file" in the changed list is the commit message. This reinforces the idea that the git history of your repository is something worth consulting and can be used as an educational tool. The best codebases I've worked on or with have been ones where if I am confused about why a change was made I can do a git log and find out.

Responding to the review

When responding to a review it is possible to add a new commit. Gerrit strongly encourages - both through documentation and the UI - that you should edit your commit. This encourages you to keep your history clean and to make atomic commits. Previously when I have worked with GitHub I've had no incentive to make my commits make sense as atomic units - I've known the feature branch will be squashed and merged so why make the effort?

In Gerrit however each commit is an individual thing, and if you push a series of commits then it becomes a "relation chain"; sometimes called a commit stack elsewhere. Each commit is merged separately which now means you have a much greater reason to break things up: sure, you may not get the entire relation chain in at once - or at all - but you might get a few in which overall improve the product before the next release.

Completing the review

Once you have made your changes you do the same git push as before. Gerrit requires your commit messages to have a Change-Id trailer, which is how it tells you are updating a review rather than pushing a new one. Unlike GitHub this isn't a bad experience for the reviewer, quite the opposite! For each commit in the chain you can select between which version - called a patchset - you want to review. You can even select which versions you diff between meaning if the code was good and you just wanted to make sure your comments were addressed you can see what changes in the last push.

Finally Gerrit gives the reviews a bit more granularity when it comes to reviews. A reviewer can give a change a -2, -1, 0, +1, or +2. The negative numbers mean "please don't submit this" and "I'd prefer if you didn't submit this" respectively. Whereas the positive numbers mean "looks good to me, but I would like someone else to review" and "looks good, can submit". This helps to get sign-off for example if you are working on API changes that a consumer library need to make sure works for them but they don't have the authority to approve a commit to the repo.

Different philosophies

Overall I feel like although there are features that GitHub is missing compared to Gerrit the reason for this is they just have two different philosophies. GitHub is geared to making it easy to contribute. Gerrit is geared to making it easy to make sure your code is high quality. This can be seen from the fact that GitHub is a service, and Gerrit is a piece of software you have to run yourself.

I think ultimately most companies would benefit from Gerrit's approach. If you're happy to invest the time in your employees - and I hope people are - then a learning curve is worth it if it makes the codebase better. Obviously if you are in an open-source company the decision leans more in GitHub's favour, but there are plenty of companies who open-source there software but don't get many contributions. In these situations Gerrit still feels like a good fit.

I'd encourage everyone to give Gerrit a whirl. The starting cost is quite high so if you just wish to dip your toes in it may be worth trying out something like reviewable which tries to improve the GitHub reviewing experience with some of the ideas in Gerrit that I've discussed here.