Writing Git Hooks using Rust
Replacing a shell script with strongly typed Rust code

29 August 2017

Have you heard of Git hooks before? Git hooks are a part of the Git version control system that allows you to extend it. If you open up a Git repository on your computer and look in the .git/hooks folder, you’ll find files with cryptic names like pre-commit.sample.

By default, when you create a repository, it won’t have any active hooks, but the hooks folder is filled with examples. If you were to rename pre-commit.sample to pre-commit (removing the .sample suffix), it would run before any commit you make.

If the pre-commit hook fails, then your commit is rejected. Pre-commit hooks can be used to run checks against the commit before you make it. For example, you might use it to ensure that the code you’re committing actually compiles.
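Here, "fails" means the hook process exits with a non-zero status. A minimal sketch of that contract follows. The rule it enforces (refusing to commit while a file called DO_NOT_COMMIT exists) and the function name commit_allowed are made up for illustration. It relies on Git running hooks from the root of the working tree in a non-bare repository.

```rust
use std::path::Path;
use std::process;

// Hypothetical rule for this sketch: a commit is allowed only when the
// marker file is absent.
fn commit_allowed(marker_exists: bool) -> bool {
    !marker_exists
}

fn main() {
    // Relative paths resolve against the root of the working tree,
    // because that is where Git runs the hook from.
    let marker_exists = Path::new("DO_NOT_COMMIT").exists();

    if !commit_allowed(marker_exists) {
        // Anything written to stderr is shown to the person committing.
        eprintln!("pre-commit: DO_NOT_COMMIT is present, rejecting commit");
        // A non-zero exit status is what tells Git to abort the commit.
        process::exit(1);
    }
    // Falling off the end of main exits with status 0, so the commit proceeds.
}
```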

There are hooks before and after many Git actions, such as when Git is initializing the file that you write your commit message into (prepare-commit-msg), before you upload your changes to another server (pre-push), or whenever you change branches (post-checkout).

Git hooks aren’t all on your development machine either. You can trigger a Git hook on a server when someone is trying to push new changes (pre-receive) to prevent pushing problematic branches, or after new changes have been pushed (post-receive) for Git-based deployments, triggering continuous integration builds, or server-side processing of commits. I’ve been on projects where commits were linked to the task tracking system by a post-receive hook that looked for certain lines in the commit message.
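The pre-receive and post-receive hooks get their input on standard input: one line per pushed ref, containing the old object ID, the new object ID, and the ref name, separated by spaces. Below is a minimal sketch of reading those lines. The RefUpdate struct and the parse_ref_update function are names invented for this example, and a real hook would go on to inspect the commits rather than just print them.

```rust
use std::io::{self, BufRead};

/// One pushed ref, as described by a single line of hook input.
#[derive(Debug, PartialEq)]
struct RefUpdate {
    old_id: String,
    new_id: String,
    ref_name: String,
}

/// Parses "<old id> <new id> <ref name>". Returns None for any line that
/// does not have exactly three whitespace-separated fields.
fn parse_ref_update(line: &str) -> Option<RefUpdate> {
    let mut parts = line.split_whitespace();
    let old_id = parts.next()?.to_string();
    let new_id = parts.next()?.to_string();
    let ref_name = parts.next()?.to_string();
    if parts.next().is_some() {
        return None;
    }
    Some(RefUpdate { old_id, new_id, ref_name })
}

fn main() {
    let stdin = io::stdin();
    for line in stdin.lock().lines() {
        let line = match line {
            Ok(line) => line,
            Err(e) => {
                eprintln!("Failed to read hook input. {}", e);
                std::process::exit(1);
            }
        };
        match parse_ref_update(&line) {
            Some(update) => println!(
                "{}: {} -> {}",
                update.ref_name, update.old_id, update.new_id
            ),
            None => eprintln!("Skipping unexpected line: {:?}", line),
        }
    }
}
```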

If you look at the example hooks, they are all shell scripts, but they don’t have to be. You can write a Git hook in any programming language you want, as long as the executable is in the right folder with the right name.

This week, I’ve been experimenting with writing Git hooks using the Rust programming language.

The full source code of my git hook repository is available on GitHub.

Project structure

Rust’s default project structure for binaries (as opposed to libraries) has a file called src/main.rs, with a main function in it. When you compile, this produces the executable target/debug/your_crate_name. There is also another way, which lets you have multiple executables in one project.

If you create src/bin/pre-commit.rs, and give it a main function, then when you compile you’ll get target/debug/pre-commit. You can have as many executables in src/bin/ as you want, each with their own main function, each producing their own executable. I used this approach to create multiple hooks in the same codebase.

Letting Git Know About the Hooks

To let Git know you want it to use a hook, you need to put it in .git/hooks with the correct filename. Hooks are explicitly not shared when you push or pull repos, since arbitrary code execution can be a security risk. If you’re in a team, or just jump between computers, you’ll need to add them to each machine individually.

After building, copy your executable from the build folder to the .git/hooks folder.
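If you’d rather not do the copy by hand, it can be scripted. The sketch below is one way to do it in Rust, under a few assumptions: a Unix-like machine, a debug build in target/debug, and a program run from the root of the repository that should receive the hook. The install_hook function is a name made up for this example.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Copies a built hook from `build_dir` into `<repo_root>/.git/hooks`,
/// keeping the same file name, and returns the destination path.
fn install_hook(build_dir: &Path, repo_root: &Path, hook_name: &str) -> io::Result<PathBuf> {
    let source = build_dir.join(hook_name);
    let destination = repo_root.join(".git").join("hooks").join(hook_name);

    // fs::copy also copies the permission bits, so the installed hook
    // stays executable.
    fs::copy(&source, &destination)?;
    Ok(destination)
}

fn main() {
    // Assumed locations: a debug build, installed into the current repository.
    match install_hook(Path::new("target/debug"), Path::new("."), "pre-commit") {
        Ok(path) => println!("Installed {}", path.display()),
        Err(e) => {
            eprintln!("Failed to install hook. {}", e);
            std::process::exit(1);
        }
    }
}
```

Since each file in src/bin/ produces its own executable, this installer could itself live alongside the hooks as another file in that folder.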

Examples

I’ve only written a few hooks so far, but they should give you a good idea of what hooks look like and how to write your own.

Just Logging

My first example doesn’t do anything useful, but it is a good first step in the development process. The basic inputs to Git hooks come through two channels: command line arguments passed to your program, and the program’s standard input.

This hook will log both of them.

use std::env;
use std::io::{stdin, BufRead};

fn main() {
    log();
}

// This consumes stdin. Do not call this if you need to use stdin.
fn log() {
    let name_arg = env::args().nth(0).unwrap_or_else(|| String::from("unknown"));
    let args: Vec<_> = env::args().skip(1).collect();
    println!("{} called with {:?}", name_arg, args);

    println!("BEGIN STDIN");
    let stdin = stdin();
    for line in stdin.lock().lines() {
        println!("{:?}", line);
    }
    println!("END STDIN");
}

Inserting Branch Names into Commit Messages

One of my current projects requires me to start every commit message with the current branch name. The branch name includes the ID from our issue tracking system, so it gives a convenient way to track a commit back to which use case it was solving.

To get the current branch name, I’ve included a Rust crate with bindings to libgit2. This allows me to query the Git repository for information like the name of the current branch, with all of the checks in Rust’s type system to help me. If you were writing this in a shell script, you’d probably call the Git executable and parse its output.

extern crate git2;
use git2::Repository;

use std::fs::File;
use std::io::{Read, Write};
use std::process;
use std::env;

fn main() {
    let commit_filename = env::args().nth(1);

    // the commit source will be filled with labels like 'merge'
    // to say how you got to this point. I only want to handle the
    // case where it's empty, meaning it's a normal new commit with no
    // special message-related arguments (not -m)
    let commit_source = env::args().nth(2);

    let current_branch = get_current_branch();

    match (current_branch, commit_filename, commit_source) {
        (Ok(branch), Some(filename), None) => {
            let write_result = prepend_branch_name(branch, filename);
            match write_result {
                Ok(_) => {},
                Err(e) => {
                    eprintln!("Failed to prepend message. {}", e);
                    process::exit(2);
                }
            };
        },
        (_, _, Some(_)) => {
            // do nothing silently. This comes up on merge commits,
            // amendment commits, if a message was specified on the
            // cli.
        }
        (Err(e), _, _) => {
            eprintln!("Failed to find current branch. {}", e);
            process::exit(1);
        },
        (_, None, _) => {
            eprintln!("Commit file was not provided");
            process::exit(2);
        }
    }
}

fn get_current_branch() -> Result<String, git2::Error> {
    let git_repo = Repository::discover("./")?;
    let head = git_repo.head()?;
    let head_name =  head.shorthand();
    match head_name {
        Some(name) => Ok(name.to_string()),
        None => Err(git2::Error::from_str("No branch name found"))
    }
}

fn prepend_branch_name(branch_name: String,
                       commit_filename: String) -> Result<(), std::io::Error> {
    // It turns out that prepending a string to a file is not an
    // obvious action. You can only write to the end of a file :(
    //
    // The solution is to read the existing contents, then write a new
    // file starting with the branch name, and then writing the rest
    // of the file.

    let mut read_commit_file = File::open(commit_filename.clone())?;
    let mut current_message = String::new();
    read_commit_file.read_to_string(&mut current_message)?;

    let mut commit_file = File::create(commit_filename)?;

    writeln!(commit_file, "{}:", branch_name)?;
    write!(commit_file, "{}", current_message)
}

If you were writing this in a shell script, it would be much more concise. Getting the branch name can be done in one line, and prepending the name to a file can be done in one more line using tools like sed.

If you’re interested in seeing shell script approaches, there are a number of them on this Stack Overflow thread.

However, I find that these are much more brittle. The normal way of using variables is a simple text substitution.

sed -i "1s/^/$branchName: \n/" $1

If you happen to have special characters like / in your branch name, then it will conflict with the sed expression and you’ll get strange results. I actually had this problem, where my shell script based Git hook suddenly stopped working and stopped me from committing when faced with an unexpected branch name. My Rust hook is much more explicit about its error handling, thanks to Rust’s error handling system making it explicit when errors should be expected.

Ensuring Unit Tests Pass Before Committing

This last example is something that you can do in Rust, but if this is all you’re doing it’s complete overkill. This hook will compile your project, run the unit tests, and abort the commit if they don’t pass.

use std::process;
use std::process::{Command, Stdio};

fn main() {
    let command = Command::new("cargo")
        .arg("test")
        .stdout(Stdio::inherit())
        .stderr(Stdio::inherit())
        .output()
        .expect("failed to execute Cargo");

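    // code() is None when Cargo was killed by a signal. Treat that as a
    // failure so that an interrupted test run cannot let the commit through.
    process::exit(command.status.code().unwrap_or(1));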
}

Using Rust here may be useful if you want to handle any special cases around the build and the commit, or if you want to have your hook check other things as well. In its current state, this Rust hook is functionally identical to this shell script:

#!/bin/sh
cargo test

If all that your hook is doing is calling another program, don’t bother to do the call in Rust.

The Good

As usual, Rust shines in its type system giving you compile time checks of your program. I had never used libgit2 before, but once I had something that compiled, it not only worked but also made explicit all of the places where it could fail.

I didn’t have any particularly complicated logic in my Rust code, but if your hook needs to do clever parsing of commits or their messages, then it’s also good to remember that it’s easy to include unit tests in Rust code.
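As a sketch of what that could look like, the message formatting from the branch name hook can be pulled out into a pure function and tested without touching the file system. The function name with_branch_prefix and the sample branch names are made up for this example, and the output format mirrors what the hook above writes.

```rust
/// Builds the new commit message: the branch name and a colon on the first
/// line, followed by whatever Git had already put in the message file.
fn with_branch_prefix(branch_name: &str, current_message: &str) -> String {
    format!("{}:\n{}", branch_name, current_message)
}

fn main() {
    print!("{}", with_branch_prefix("feature/ABC-123", "\n# Please enter the commit message\n"));
}

#[cfg(test)]
mod message_tests {
    use super::with_branch_prefix;

    #[test]
    fn branch_name_goes_on_the_first_line() {
        assert_eq!(with_branch_prefix("ABC-123", "existing\n"), "ABC-123:\nexisting\n");
    }

    #[test]
    fn slashes_in_branch_names_are_left_alone() {
        // The kind of branch name that trips up a sed-based hook.
        assert_eq!(with_branch_prefix("feature/ABC-123", ""), "feature/ABC-123:\n");
    }
}
```

Running cargo test picks up tests inside files under src/bin/ as well, so the hook and its tests can live in the same file.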

The Bad

In the case of my example hook that called Cargo, it was larger than the shell script equivalent, needed to be compiled before including, and didn’t add anything that the shell script wasn’t already doing. Rust can be the right tool for many jobs, but this particular use case was not one of them.

If the script had needed to take the output from Cargo’s test and parse it in some complicated way, then Rust would become a good fit again.
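To give a feel for what that might involve, here is a rough sketch that captures Cargo’s output and counts failed tests from the "test result:" summary lines. It assumes the default human-readable output of the test harness, which is not a stable interface, so treat the parsing as illustrative. The function name count_failed_tests is made up for this example.

```rust
use std::process::{self, Command};

/// Sums the "N failed" counts from every "test result:" summary line.
/// Returns None if no summary line was found at all.
fn count_failed_tests(output: &str) -> Option<u32> {
    let mut total: Option<u32> = None;
    for line in output.lines() {
        let line = line.trim();
        if !line.starts_with("test result:") {
            continue;
        }
        // Fields look like " 2 failed", separated by semicolons.
        for field in line.split(';') {
            let mut words = field.split_whitespace().rev();
            if words.next() == Some("failed") {
                if let Some(n) = words.next().and_then(|w| w.parse::<u32>().ok()) {
                    total = Some(total.unwrap_or(0) + n);
                }
            }
        }
    }
    total
}

fn main() {
    // output() captures stdout and stderr instead of passing them through.
    let output = Command::new("cargo")
        .arg("test")
        .output()
        .expect("failed to execute Cargo");
    let stdout = String::from_utf8_lossy(&output.stdout);

    match count_failed_tests(&stdout) {
        Some(0) if output.status.success() => {}
        Some(n) if n > 0 => {
            eprintln!("{} test(s) failed, aborting commit", n);
            process::exit(1);
        }
        _ => {
            // No summary found, or Cargo failed for another reason
            // (for example, a compile error).
            eprintln!("cargo test did not finish cleanly, aborting commit");
            process::exit(1);
        }
    }
}
```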

Something else to consider is that shell scripts, since they’re typically farming out the heavy lifting to other applications, can have ridiculously small file sizes. A shell script only calling Cargo is 21 bytes! The equivalent Rust program is 318 bytes of source, but compiles to a 3.5 megabyte binary, as it statically links in parts of the Rust standard library and the other Rust crates you use. This size shouldn’t be a problem unless you’re planning on committing the compiled binaries to your Git repository and you have slow Internet connections, but there are other problems with committing the binaries.

The Ugly

It turns out that sharing compiled binaries between different computers can be a bit of a pain. If you compile a Rust program on one Linux machine, then copy just the executable to another Linux machine, you might need to spend some time making sure you have all of the expected native libraries installed in the right place. For my branch name inserting hook, I was using libgit2, which depended on OpenSSL. When I copied it to a different machine, I needed to make sure that I also had the appropriate OpenSSL libraries installed there before it would work.

The Rust code itself is portable, and most of the difficulties in cross-platform support are handled at compile time. Compile time is also when all of the most useful error messages around missing dependencies come up. That’s great, just as long as you’re compiling on your target machine. It does mean, however, that everyone needs to have the Rust toolchain installed.

On the plus side, this means that you can share Git hooks between Windows, Mac and Linux machines, as long as you’re willing to compile them separately on each machine. Depending on what you’re trying to do, this might be easier than writing a multi-platform shell script.

Should I Write Git Hooks in Rust?

Maybe.

It depends on what you’re trying to solve with your hook. Certainly if you have more complicated logic, then Rust will make your hook easier to maintain, provide better runtime checks, and make the conditions under which your hook rejects commits much more explicit.

These benefits come at the cost of extra complexity in compiling and installing your hooks, so before you run off and rewrite your shell scripts in Rust, make sure that your hook is doing something more complex than calling a single other program.