Last Updated: January 28, 2019 · pvande

Find Your Most Prolific Contributors

Open source projects can get large fast, and keeping track of who has contributed what only gets harder as you add contributors. Git makes it fairly easy to find out who your most frequent committers are…

$ git shortlog -ns
   121  Pieter van de Bruggen
    21  Daniel Sauble
    18  Randall Hansen
     1  Matt Robinson

…but finding out who has authored the greatest number of lines of your project is a little bit trickier. It's still possible, though.

$ cat git-authorship
git ls-files -z |
xargs -0 -n1 -E'\n' -J {} git blame --date short -wCMcp '{}' |
perl -pe 's/^.*?\((.*?) +\d{4}-\d{2}-\d{2} +\d+\).*/\1/' |
sort |
uniq -c |
sort -rn

$ bash git-authorship
  1397  Pieter van de Bruggen
    59  Matt Robinson
    54  Daniel Sauble

It's worth noting that this invocation of xargs is specific to BSD xargs (e.g. on OS X); GNU xargs, as found on most Linux systems, has no -J flag, so the flags will be a little different there. I'm also less than perfectly happy with relying on Perl for this, but hey, I'm pragmatic.
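For what it's worth, both caveats can probably be avoided. The following is a rough sketch, not a drop-in replacement: it sticks to xargs flags that BSD and GNU xargs share (-0 and -n1), and it reads author names from git blame --line-porcelain with sed instead of parsing the human-readable output with Perl. The function name git_authorship is hypothetical, and the sketch assumes it runs inside a Git work tree with at least one tracked file.

```shell
#!/usr/bin/env bash
# Sketch: count the lines each author is blamed for in the current checkout.
# git_authorship is a hypothetical name; run it inside a Git work tree.
git_authorship() {
  git ls-files -z |
    # Blame one file per invocation. --line-porcelain repeats the full
    # commit header (including an "author <name>" line) for every blamed
    # line; -w ignores whitespace changes, -M and -C follow moved and
    # copied lines.
    xargs -0 -n1 git blame -w -M -C --line-porcelain -- |
    # File content is prefixed with a tab in porcelain output, so only
    # real header lines can start with "author ".
    sed -n 's/^author //p' |
    sort |
    uniq -c |
    sort -rn
}
```

Calling git_authorship from the top of a repository should give the same kind of count-per-author listing as the original script.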

Neither of these metrics is a good indicator of productivity; like any other measurable artifact, they can be gamed fairly easily. Be very careful about using these metrics punitively, or you'll get what you deserve.

2 Responses

Great. Here it is as a Git alias:
authorship = "!git ls-files -z|xargs -0 -n1 -E'\n' -J {} git blame --date short -wCMcp '{}'| perl -pe 's/^.*?\((.*?) +\d{4}-\d{2}-\d{2} +\d+\).*/\1/'| sort | uniq -c | sort -rn"

over 1 year ago ·

This only works if your file structure has never changed, as git ls-files only shows those files that exist in the current checkout. So if someone made a lot of changes to a file, but that file was subsequently deleted, it won't show up. Even worse, if the file was just renamed, all the changes will show up as from the person who renamed the file! I wonder how hard it would be to write a version of this that uses git's pickaxe feature, which avoids these shortcomings.

over 1 year ago ·
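
The concern in the last response, that only files in the current checkout get counted, can be approached from the other direction: sum the lines each author added over the whole history rather than blaming the current tree. The sketch below does that with git log --numstat, which is a different technique from the pickaxe the response mentions, and it measures a different thing (lines ever added, not lines that survive today). The function name git_lines_added is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: total lines added per author across all of history, including
# files that were deleted later. Uses `git log --numstat`, not pickaxe.
# git_lines_added is a hypothetical name; run it inside a Git work tree.
git_lines_added() {
  # Each commit prints "author:<name>", then one tab-separated numstat
  # line per changed file: <added> <deleted> <path>.
  git log --no-merges --numstat --format='author:%aN' |
    awk -F'\t' '
      /^author:/ { author = substr($0, 8); next }   # start of a new commit
      NF == 3 && $1 != "-" { added[author] += $1 }  # "-" marks binary files
      END { for (a in added) printf "%7d  %s\n", added[a], a }
    ' |
    sort -rn
}
```

Like the other two metrics, this one is easy to game (a single vendored library can dwarf everything else), so the same caution about reading productivity into it applies.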