Git hooks for fun & profit

Guy Waldman

Well, my friends, it's been a while since my last blog post but it's good to be back. Last one was a bit bird's eye and hand-wavy, so on this occasion let's get practical.

Note: This one goes out to Or Bendel from my team, with whom we discussed this useful feature 😊

As with every one of my blog posts, if you want the tl;dr feel free to go over the orange "Key Takeaways" sections.

Today, I wanted to share a quick workflow that I think can help developers (especially ones who aren't prone to scrutinizing their changes before committing) make sure no undesired changes are pushed.
I will be addressing a common pain point for developers - how do I avoid accidentally committing or checking in so-called DNC (Do Not Commit) comments/phrases in source code?
Leaving unintended snippets of code in code reviews can not only require redundant code review rounds and add "noise" to the code review process, but also cause unintentional changes to be checked in at times (I remember the awkwardness of slipping in a naive "Test!" console.log earlier in my career).

So, let's start with the question - what are DNC messages anyway? These may appear as comments that you add while developing to remember to go back and change something.

For example:

```js
function doStuff() {
  console.count("doStuff was called"); // DNC
  // ...
}
```

I often use "TODO(PR)" or "TODO(DNC)" as a reminder to myself to go back and resolve that thing before publishing a pull request (or, God forbid, before checking in with TFS), but in general this may be any arbitrary VCS that you use.
Personally, I prefer to use uncommon expressions so that I can quickly search the code to make sure I don't actually introduce them in my changes.

In general, when I asked various developers about this issue this is how the conversation went:

  1. Me: Do you ever add temporary comments for yourself before committing for changes you don't want to introduce to the pull request?
  2. Dev: Yeah, I use <X> or <Y>.
  3. Me: And how do you make sure you don't accidentally commit them?
  4. Dev: I diff my changes.
  5. Me: So you do it manually?
  6. Dev: Yep.
  7. Me: And nothing ever slipped past your scrutiny?
  8. Dev: Well...

So, we have a problem that is usually solved manually, using an error-prone process that trips up even the most thorough and diligent developers. Well, what do productive (see: bored) humans do for problems that require a routine manual step? They automate it, of course!

Obligatory xkcd (how could I not?)

...Oh, how to automate it, you ask? Well, git hooks might be a good option!

I will note that this is a good thing to enforce in your CI (Azure Pipelines, GitHub Actions, TravisCI, etc.). In fact, Dell have a GitHub Action that forbids certain words (in this specific case, non-inclusive language) which is a great use-case for a CI gate! However, in this case these DNC messages may be "personal" by design, as you don't want to mix up your DNCs with general TODOs in the codebase: you want to be able to search for them easily to spot your own changes. For example, I could use "TODO(PR)" for my DNC phrases and you could use "DO NOT COMMIT".

Key takeaways
  • Developers sometimes add temporary comments to code changes which may be accidentally introduced in the code review
  • Catching these comments manually is error-prone, but the check can be easily automated with git hooks

git hooks

What are git hooks?

git hooks are a way to "hook" into stages of the git version control lifecycle, and give the ability to run scripts at certain events, for example before a commit, after a commit or before a push.

Some examples of common hooks are:

  • pre-commit (before committing): often used for linting/formatting
  • commit-msg (after submitting the commit message): often used for enforcing styles on commit messages
  • post-commit (after committing): often used to tag commits

An exhaustive list can be found here.

There are server-side hooks which can run on the VCS server, but we'll be focusing on client-side scripts, which are triggered by local events in your local git repository.

Atlassian have a great tutorial about git hooks, which goes much more in-depth than this humble blog post.

Where do the hooks come from?

In general, there are two ways I know of to run hooks:

  1. Local hooks which reside in the .git/hooks directory in the local git repository.
    These are scoped to your local repository.

    These can be added automatically if you define a so-called git Template Directory. Whenever you define a Template Directory, it will be used as the "boilerplate" when you run git init. By default, when you git init, your local repo should contain sample commit hooks in the .git/hooks directory, each with a .sample file extension. The .sample extension prevents git from registering them as real hooks (once the extension is removed, they should run normally).

  2. Global hooks which you can configure in one place which git will search for, regardless of the local repo you're in (useful for a central location of all your git hooks)

    Notes:
    • From my experiments, these seem to override the local hooks. i.e., instead of looking at the local hooks, git will only use the global ones.
    • core.hooksPath (the setting used to configure global hooks) was introduced in git 2.9, so check whether your git version supports it (run git --version)
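The version requirement in the note above can also be checked from a script. Below is a minimal sketch in bash; the function name `supports_hooks_path` is hypothetical, and it assumes the version string starts with `major.minor`.

```bash
#!/usr/bin/env bash

# Hypothetical helper: succeeds if the given git version (e.g. "2.39.1")
# is at least 2.9, the release that introduced core.hooksPath.
supports_hooks_path() {
	local version="$1" major minor
	# Split on dots; anything after the minor version is ignored.
	IFS=. read -r major minor _ <<< "$version"
	(( major > 2 || (major == 2 && minor >= 9) ))
}

# Usage: "git --version" prints something like "git version 2.39.1",
# so the third field is the version number.
# supports_hooks_path "$(git --version | awk '{print $3}')" && echo "core.hooksPath is supported"
```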

There is also the option of using local hooks with symbolic links (symlinks) to the centralized hooks location. As mentioned, using Template Directories you can define the skeleton for new git repos, and you could theoretically define symlinks to your global hooks. This is a more "explicit" approach, which may be preferable to some.
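For the symlink approach, a rough sketch could look like the following. The helper name `link_hook` and all paths are made up for illustration, and it assumes the repository already has a regular .git/hooks directory.

```bash
#!/usr/bin/env bash

# Hypothetical helper: link one hook from a central hooks directory into a
# repository's .git/hooks directory.
#   $1 - central hooks directory (e.g. a folder in your dotfiles)
#   $2 - root of the repository
#   $3 - hook name (e.g. pre-commit)
link_hook() {
	local central_dir="$1" repo_dir="$2" hook_name="$3"
	# -s creates a symbolic link, -f replaces an existing hook of the same name.
	ln -sf "$central_dir/$hook_name" "$repo_dir/.git/hooks/$hook_name"
}

# Usage (paths are placeholders):
# link_hook "$HOME/dotfiles/git-hooks" "$HOME/code/my-project" pre-commit
```

Since the link lives inside the repository's own hooks directory, it is easy to see (with ls -l) which repos opted in to the shared hook.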

In essence, we can boil the options down to:

(Figure: diagram summarizing the options for where hooks can live.)

I chose option C (global hooks) for my workflows, because I have a dotfiles folder containing all the shell stuff I like to configure for my different machines (I have a MacBook, a Lenovo laptop, occasionally a VM/desktop) and GitHub Codespaces (did I mention Microsoft is the greatest company in the world?).
Within my dotfiles repo, which I take "with me" across machines and reference through symbolic links, I have a directory for git hooks. I assume that the repos I work in have no project-specific hooks, since the global hooks would otherwise take precedence over them.

In general, I would advise that things you wish to enforce for changes happen in your CI, for example enforcement of linting, style guides etc., as git hooks are a "client" mechanism and in general any developer can override those (for example with git commit --no-verify).
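As an illustration of what such a CI-side check might look like for team-wide phrases, here is a minimal sketch. The function name `scan_for_phrases` is hypothetical, and it scans the whole tree rather than only the changed lines.

```bash
#!/usr/bin/env bash

# Hypothetical CI check: print every occurrence of the given phrases under a
# directory and fail if at least one is found.
#   $1  - directory to scan
#   $2+ - phrases to forbid (matched literally, not as regular expressions)
scan_for_phrases() {
	local dir="$1" phrase found=0
	shift
	for phrase in "$@"; do
		# -r recurse, -n show line numbers, -F literal match;
		# --exclude-dir skips git's own metadata.
		if grep -rnF --exclude-dir=.git -- "$phrase" "$dir"; then
			found=1
		fi
	done
	# A non-zero status is what makes the CI step fail.
	return "$found"
}

# Usage in a CI step, from the repository root:
# scan_for_phrases . "DO NOT COMMIT" "TODO(PR)"
```

Note that a script like this would flag its own source if it lives inside the scanned tree, so the phrase list is better kept in the CI configuration or the script excluded from the scan.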

Setting up global git hooks

The git configuration core.hooksPath controls the directory git uses for finding the hook to run.

For configuring the global hooks directory (I chose /etc/git/hooks for this example) you can run:

```bash
git config --global core.hooksPath /etc/git/hooks
```

Note: on UNIX systems you'll need to make the script executable (e.g., chmod +x /etc/git/hooks/pre-commit)
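Putting the steps above together, an end-to-end setup might look roughly like this. It is only a sketch: the helper name `setup_global_hooks` and the directory in the usage line are placeholders, and the hook it writes is a stand-in that approves every commit.

```bash
#!/usr/bin/env bash

# Hypothetical helper: create a global hooks directory containing a
# placeholder pre-commit hook and point git at it.
#   $1 - directory to use for global hooks
setup_global_hooks() {
	local hooks_dir="$1"
	mkdir -p "$hooks_dir"

	# A placeholder hook that lets every commit through.
	printf '%s\n' '#!/usr/bin/env bash' 'exit 0' > "$hooks_dir/pre-commit"

	# Hooks must be executable, otherwise git ignores them.
	chmod +x "$hooks_dir/pre-commit"

	# Tell git (for the current user) where to look for hooks.
	git config --global core.hooksPath "$hooks_dir"
}

# Usage (the path is a placeholder):
# setup_global_hooks "$HOME/.githooks"
# git config --global --get core.hooksPath   # prints the configured directory
```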

Key takeaways
  • Git hooks are a mechanism for triggering a script to run at certain points of your git workflows (before a commit, before a push, etc.)
  • There are client-side hooks (which run on your machine) and server-side hooks (which run on the machine hosting your repository)
  • There are local hooks which are scoped to your local repository and global hooks which you could leverage for hooks you want to run from any repo
  • I chose global hooks for this blog post since they were the ones I found most suitable for my workflows
  • You can configure the global git hooks using the core.hooksPath configuration

Pre-commit hook to forbid DNC messages

So, going back to our original problem - how do we use git hooks to forbid "Do Not Commit" TODOs?

What I did was add a pre-commit shell script to my global hooks folder (which in the example above was /etc/git/hooks, but in my case was my dotfiles repo).

I have prepared a BASH version (which I'll go over in more detail) and a PowerShell version. Both are available in a GitHub Gist I prepared for your convenience: link to Gist

Note that you could even use Python if you wanted to.
e.g., if you were so inclined, your pre-commit could look like this (in UNIX systems):

```python
#!/usr/bin/env python

print("Hello!")
```

Also note that the scripts aren't perfect - they may leave much to be desired, and I welcome contributions (tweet at me!).

Pre-commit hook (BASH)

The following is the content of my pre-commit script.

```bash
#!/usr/bin/env bash

# Forbidden phrases.
FORBIDDEN_PHRASES=("DNC" "DO NOT COMMIT" "TODO(PR)")

# ANSI color codes.
CLEAR="\033[0m"
RED="\033[00;31m"
BLUE="\033[00;34m"

violation_output=""

# Go over the staged changes, and if any of the forbidden phrases is used,
# add an appropriate message to the output.
for forbidden_word in "${FORBIDDEN_PHRASES[@]}"; do
	changed_file_names=$(git diff --cached --name-only)
	for changed_file_name in $changed_file_names; do
		# Only look at the added lines ("+") of the staged diff for this file.
		changed_file_content=$(git diff --cached --no-ext-diff -U0 -a --no-prefix -- "$changed_file_name" | grep -E "^\+")
		# -F matches the phrase literally rather than as a regular expression.
		if echo "$changed_file_content" | grep -qF -- "$forbidden_word"; then
			violation_output+="${CLEAR} • ${BLUE}${changed_file_name}${CLEAR} contains ${RED}\"$forbidden_word\"${CLEAR}\n"
		fi
	done
done

# If there are any violations, print the output and exit with an error code.
if [[ -n $violation_output ]]; then
	printf "COMMIT REJECTED (DNC violation): see below for details\n"
	# %b interprets the color codes and newlines collected above.
	printf "%b" "$violation_output"
	exit 1
fi

# If there are no violations, exit without an error code.
exit 0
```

Let's go over it:

```bash
#!/usr/bin/env bash

FORBIDDEN_PHRASES=('DNC' 'DO NOT COMMIT' 'TODO(PR)')

# ...(OMITTED)
```

👆 Here we define the list of forbidden phrases - customize these to your heart's desire.
You could extract these to a separate file if you were so inclined.
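If you did want to move the phrases to a separate file, one possible approach is sketched below. The helper name `load_forbidden_phrases`, the one-phrase-per-line format with "#" comments, and the file path are all assumptions of mine and not part of the script above.

```bash
#!/usr/bin/env bash

# Hypothetical helper: load forbidden phrases (one per line) from a file
# into the FORBIDDEN_PHRASES array, skipping blank lines and "#" comments.
load_forbidden_phrases() {
	local phrases_file="$1" line
	FORBIDDEN_PHRASES=()
	# The "|| [[ -n $line ]]" part keeps a last line that lacks a newline.
	while IFS= read -r line || [[ -n $line ]]; do
		if [[ -z $line || $line == \#* ]]; then
			continue
		fi
		FORBIDDEN_PHRASES+=("$line")
	done < "$phrases_file"
}

# Usage (the path is a placeholder):
# load_forbidden_phrases "$HOME/.config/git/forbidden-phrases.txt"
```

Keeping the list outside the hook means each developer can have their own phrases while sharing the same script.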

```bash
# ...(OMITTED)

# ANSI color codes.
CLEAR="\033[0m"
RED="\033[00;31m"
BLUE="\033[00;34m"

# ...(OMITTED)
```

👆 Here we define the ANSI color codes, to get beautiful colored output.

```bash
# ...(OMITTED)

violation_output=""

# Go over the staged changes, and if any of the forbidden phrases is used,
# add an appropriate message to the output.
for forbidden_word in "${FORBIDDEN_PHRASES[@]}"; do
	changed_file_names=$(git diff --cached --name-only)
	for changed_file_name in $changed_file_names; do
		# Only look at the added lines ("+") of the staged diff for this file.
		changed_file_content=$(git diff --cached --no-ext-diff -U0 -a --no-prefix -- "$changed_file_name" | grep -E "^\+")
		# -F matches the phrase literally rather than as a regular expression.
		if echo "$changed_file_content" | grep -qF -- "$forbidden_word"; then
			violation_output+="${CLEAR} • ${BLUE}${changed_file_name}${CLEAR} contains ${RED}\"$forbidden_word\"${CLEAR}\n"
		fi
	done
done

# ...(OMITTED)
```

👆 Here we go over the new code changes and check if they contain any of the aforementioned DNC phrases.
If so, we collect the violations into violation_output.
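To make the "only added lines count" idea a bit more concrete, here is a small standalone sketch of that check. The function name `added_lines_contain` is hypothetical; unlike the loop above, it also drops the "+++" file header of the diff so that a phrase in a file name is not reported.

```bash
#!/usr/bin/env bash

# Hypothetical helper: read a unified diff on stdin and succeed if any added
# line contains the given phrase.
#   $1 - phrase to look for (matched literally)
added_lines_contain() {
	local phrase="$1"
	# Keep lines starting with "+", drop the "+++ <file>" header,
	# then search what is left for the literal phrase.
	grep -E '^\+' | grep -Ev '^\+\+\+ ' | grep -qF -- "$phrase"
}

# Usage inside a hook (the file name is a placeholder):
# git diff --cached -U0 -- src/app.js | added_lines_contain "TODO(PR)"
```

One known gap in this sketch: a rare added line whose own text begins with "++ " would be skipped together with the header.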

```bash
# ...(OMITTED)

# If there are any violations, print the output and exit with an error code.
if [[ -n $violation_output ]]; then
	printf "COMMIT REJECTED (DNC violation): see below for details\n"
	# %b interprets the color codes and newlines collected above.
	printf "%b" "$violation_output"
	exit 1
fi

# If there are no violations, exit without an error code.
exit 0
```

👆 Here we check output to determine if the commit can be approved or not (and if not, we print out the violations).

Note that git uses the hook's exit code to decide whether the commit goes through: a non-zero exit code from pre-commit aborts the commit.

Pre-commit hook (PowerShell)

Below is a PowerShell (pwsh, AKA PowerShell 7+) version. It is similar in structure to the BASH version, so I will avoid going into detail with this one.

Note that:

  • On Windows, you'll need to name this file pre-commit.ps1 (with the .ps1 extension), though I admit I haven't tried on a Windows system
  • On Linux/macOS (yes, PowerShell Core is cross-platform, how awesome is that?), you'll need to add a shebang (e.g., #!/usr/bin/env pwsh)

```powershell
# Forbidden phrases.
$FORBIDDEN_PHRASES = "DNC", "DO NOT COMMIT", "TODO(PR)"

$violations = ""

# Go over the staged changes, and if any of the forbidden phrases is used,
# add an appropriate message to the output.
foreach ($forbiddenWord in $FORBIDDEN_PHRASES) {
	# No surrounding quotes here: they would join all file names into one string.
	$changedFileNames = git diff --cached --name-only
	foreach ($changedFileName in $changedFileNames) {
		$changedFileContent = $(git diff --cached --no-ext-diff -U0 -a --no-prefix -- $changedFileName | Select-String "^\+")
		foreach ($line in $changedFileContent) {
			# Escape the phrase so characters like "(" are matched literally.
			if ($line -Match ([regex]::Escape($forbiddenWord))) {
				$violations += "$($PSStyle.Reset) • "
				$violations += "$($PSStyle.Foreground.Blue)${changedFileName} contains: $($PSStyle.Reset)"
				$violations += "$($PSStyle.Foreground.Red)${forbiddenWord}$($PSStyle.Reset)"
				$violations += "`n" # PowerShell newline. Yes... I know ¯\_(ツ)_/¯
			}
		}
	}
}

# If there are any violations, print the output and exit with an error code.
if ($violations) {
	Write-Host "COMMIT REJECTED (DNC violation): see below for details"
	Write-Host $violations
	exit 1
}

# If there are no violations, exit without an error code.
exit 0
```

Key takeaways
  • You can add your own pre-commit scripts which go over your changes and check for DNC violations
  • On UNIX systems, you'll need to chmod +x
  • The scripts can be BASH, PowerShell (which requires a .ps1 extension on Windows), Python or what have you. On UNIX systems, specify your interpreter with a shebang (#!)
  • I've supplied a BASH script and a PowerShell script - you can see both above or on this GitHub Gist

Conclusion

So, my friends, we have seen how we can leverage git hooks to spare our code reviewers, and add a simple automation by leveraging a nice (and not that well-known, surprisingly) feature in our daily VCS tool.

See you at the next one!

P.S. Recently added an RSS feed to this blog, if it's useful to anyone.

πŸ™Credits

Thanks to the awesome Michael Kuritzky Bakman for reviewing this blog post!