Find Your Most Prolific Contributors
Open source projects can grow large quickly, and keeping track of who wrote what only gets harder as you add contributors. Git makes it fairly easy to find out who your most frequent committers are…
$ git shortlog -ns
121 Pieter van de Bruggen
21 Daniel Sauble
18 Randall Hansen
1 Matt Robinson
…but finding out who has authored the greatest number of lines of your project is a little bit trickier. It's still possible, though.
$ cat git-authorship
git ls-files -z |
xargs -0 -n1 -E'\n' -J {} git blame --date short -wCMcp '{}' |
perl -pe 's/^.*?\((.*?) +\d{4}-\d{2}-\d{2} +\d+\).*/\1/' |
sort |
uniq -c |
sort -rn
$ bash git-authorship
1397 Pieter van de Bruggen
59 Matt Robinson
54 Daniel Sauble
It's worth noting that this invocation of xargs is specific to BSD xargs (e.g. on OS X); for other xargs implementations (GNU xargs, for instance), the flags will be a little different. I'm also less than perfectly happy with relying on Perl for this, but hey -- I'm pragmatic.
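If the xargs differences or the Perl dependency bother you, here is a rough sketch of a variant that avoids both. It is not the script above: it swaps the regex for git blame's --line-porcelain output, which prints an "author <name>" header for every blamed line, and it uses only the -0 and -n1 flags that BSD and GNU xargs both accept. The function name authorship_by_lines is made up for illustration.

```shell
# Hypothetical helper: count surviving lines per author in the current checkout.
authorship_by_lines() {
  git ls-files -z |
    # -0 and -n1 are accepted by both BSD and GNU xargs.
    xargs -0 -n1 git blame -w -C -M --line-porcelain -- |
    # --line-porcelain repeats the full header for every blamed line,
    # so each line of a file contributes exactly one "author <name>" row.
    sed -n 's/^author //p' |
    sort |
    uniq -c |
    sort -rn
}
```

Define the function in your shell (or paste it into a script) and call authorship_by_lines from inside a working tree. The output has the same count-then-name shape as the listing above, though the totals may differ slightly since the parsing is different.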
Neither of these metrics is a good indicator of productivity; like any other measurable artifact, they can be gamed fairly easily. Be very careful about using these metrics punitively, or you'll get what you deserve.
Written by Pieter van de Bruggen
2 Responses
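Great. Here it is as a Git alias (the backslashes are doubled because Git's config file format treats a single backslash as an escape character):
authorship = "!git ls-files -z | xargs -0 -n1 -E'\\n' -J {} git blame --date short -wCMcp '{}' | perl -pe 's/^.*?\\((.*?) +\\d{4}-\\d{2}-\\d{2} +\\d+\\).*/\\1/' | sort | uniq -c | sort -rn"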
This only works if your file structure has never changed, since git ls-files only lists the files that exist in the current checkout. So if someone made a lot of changes to a file that was subsequently deleted, those changes won't show up. Even worse, if the file was just renamed, all the changes will be attributed to the person who renamed it! I wonder how hard it would be to write a version of this that uses git's pickaxe feature, which avoids these shortcomings.
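One way to sidestep the deleted-file problem is to count across the whole history instead of blaming the current checkout. The sketch below is not a pickaxe-based version; it swaps in git log --numstat and sums the lines each author has ever added, with -M so that a pure rename is reported as zero lines added rather than a full rewrite. The function name lines_added_by_author is made up for illustration. Note that this measures a different thing (lines ever added, not lines that survive today), so the numbers won't match the blame-based output.

```shell
# Hypothetical helper: total lines ever added per author, across all history.
lines_added_by_author() {
  git log --no-merges -M --numstat --format='author %aN' |
    awk '
      # Header row emitted by --format for each commit.
      /^author / { sub(/^author /, ""); author = $0; next }
      # numstat rows look like: added<TAB>deleted<TAB>path
      # (binary files show "-" instead of a number and are skipped here).
      NF >= 3 && $1 ~ /^[0-9]+$/ { added[author] += $1 }
      END {
        for (a in added)
          if (added[a] > 0) printf "%7d %s\n", added[a], a
      }
    ' |
    sort -rn
}
```

Run lines_added_by_author inside a repository; authors whose files have since been deleted still show up, which is exactly what the blame-based approach misses.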