Last Updated: February 25, 2016 · mattboehm

A year of git diff stats

I wanted to review the changes that I made to a codebase in the past year, so I did the following:

First, I got all the hashes of my commits from the last year:

git log --author=mboehm --since=1/1/2013 --until=12/31/2013 --cherry --pretty=format:"%h"

If you want, you can add %s to the format to print each commit's subject line and confirm that the list looks okay.

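Note that I'm using --cherry to skip merges. It is shorthand for --right-only --cherry-mark --no-merges, and the --no-merges part is what drops the merge commits here, so --no-merges on its own would work as well.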
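As a quick sanity check before computing any stats, the listing step can be wrapped in a small function. This is only a sketch: the function name list_commits and its three positional parameters are made up for illustration, and it uses --no-merges directly instead of --cherry.

```shell
# Hypothetical helper (not from the original post): list the non-merge
# commits by one author in a date range, with subjects for eyeballing.
#   $1 = author pattern, $2 = start date, $3 = end date
list_commits() {
    # %h is the abbreviated hash, %s is the commit subject line.
    git log --author="$1" --since="$2" --until="$3" \
        --no-merges --pretty=format:'%h %s'
}

# Example call, to be run inside a git repository:
# list_commits mboehm 1/1/2013 12/31/2013
```

Once the list looks right, dropping the %s leaves bare hashes that are safe to loop over.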

Then I looped over the hashes, and for each one, I computed the diff stat:

for HASH in $(git log --author=mboehm --since=1/1/2013 --until=12/31/2013 --cherry --pretty=format:"%h"); do git diff --stat=250,999 "$HASH~1" "$HASH"; done

The --stat=250,999 takes the form --stat=<width>,<name-width>: the total output width is up to 250 columns and the filename part may be up to 999 characters wide. This is a hack I employ, because without it, git will try to abbreviate the filenames in the output.
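The loop can also be written as a function that takes the hashes as arguments. The sketch below assumes every commit has a parent; the name stat_each_commit is invented for this example. It passes the parent first, so the +/- counts describe what the commit itself changed rather than the reverse.

```shell
# Hypothetical helper: print the diffstat of each commit given as an argument.
stat_each_commit() {
    for hash in "$@"; do
        # git diff OLD NEW: the parent goes first, the commit second.
        # A root commit has no parent, so "$hash~1" would fail for it.
        git diff --stat=250,999 "$hash~1" "$hash"
    done
}

# Example call, to be run inside a git repository:
# stat_each_commit $(git log --author=mboehm --no-merges --pretty=format:%h)
```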

This output contains summary lines and changes to binary files. Let's remove those by piping this to:

sed -e '/ Bin /d' -e '/^ [0-9]\+ file/d'
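To see what this filter does, here it is applied to a few made-up diffstat lines. The sample text and the function name strip_noise are illustrative only. The pattern is spelled [0-9][0-9]* rather than [0-9]\+ because \+ is a GNU sed extension.

```shell
# Made-up diffstat output for one commit: a text file, a binary file,
# and the summary line that git prints at the end.
sample=' src/app.py   | 12 ++++++------
 img/logo.png | Bin 0 -> 2048 bytes
 2 files changed, 6 insertions(+), 6 deletions(-)'

# Delete binary-file lines and per-commit summary lines.
strip_noise() {
    sed -e '/ Bin /d' -e '/^ [0-9][0-9]* file/d'
}

printf '%s\n' "$sample" | strip_noise
# prints only the src/app.py line
```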

I also decided to normalize the number of spaces before the pipe separator:

sed -e 's/ \+|/    |/'
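Each commit's diffstat pads filenames to its own column width, so the same file can appear with different amounts of padding. A small sketch of the normalization on made-up lines (the function name normalize_separator is hypothetical, and "  *" is the portable spelling of " \+"):

```shell
# Replace the run of spaces before the first "|" with exactly four spaces.
normalize_separator() {
    sed -e 's/  *|/    |/'
}

printf '%s\n' \
    ' a.py          | 3 +++' \
    ' lib/b.py | 1 -' |
    normalize_separator
# both lines now have exactly four spaces before the "|"
```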

Then I piped the output through sort so that it would be grouped by file (some may prefer to leave it chronological).
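Once the lines are grouped by file, one possible next step is to total the changed-line counts per file. This goes slightly beyond the steps described here, so treat it as an optional sketch. It assumes the binary and summary lines are already removed, that no filename contains a "|", and the name total_per_file is invented for the example.

```shell
# Sum the changed-line counts per file from diffstat lines on stdin.
# Expects lines shaped like " path    | 12 ++++++------".
total_per_file() {
    awk -F'|' '
        {
            name = $1
            gsub(/^ +| +$/, "", name)   # trim the padding around the filename
            split($2, parts, " ")       # parts[1] is the changed-line count
            total[name] += parts[1]
        }
        END { for (name in total) print total[name], name }
    ' | sort -rn                        # largest totals first
}

printf '%s\n' \
    ' src/app.py    | 12 ++++++------' \
    ' src/util.py    | 4 ++--' \
    ' src/app.py    | 3 +++' |
    sort | total_per_file
```

With the made-up input above, src/app.py comes out at 15 changed lines and src/util.py at 4.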

Let's put all that together:

for HASH in $(git log --author=mboehm --since=1/1/2013 --until=12/31/2013 --cherry --pretty=format:"%h"); do git diff --stat=250,999 "$HASH~1" "$HASH"; done | sed -e '/ Bin /d' -e '/^ [0-9]\+ file/d' -e 's/ \+|/    |/' | sort | less

You may want to go through this output and delete the lines for any third-party libraries you added, as they can be quite noisy. Luckily, since the output is sorted, these are usually grouped together.
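If the third-party code lives under known directories, the cleanup can be done with a filter instead of by hand. In this sketch the directory names vendor/ and node_modules/ are assumptions, as is the function name drop_third_party; substitute whatever paths apply to your codebase.

```shell
# Drop diffstat lines whose path starts with an assumed third-party directory.
# Each diffstat line begins with a single space, followed by the path.
drop_third_party() {
    grep -v -E '^ (vendor|node_modules)/'
}

printf '%s\n' \
    ' src/app.py    | 3 +++' \
    ' vendor/jquery.js    | 900 ++++++++++' |
    drop_third_party
# prints only the src/app.py line
```

This slots in anywhere after the sed step of the pipeline, before or after sort.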