Last Updated: April 27, 2022 · errordeveloper

Git can do JSON (almost)

If you wish to do some processing of git logs using JSON as an intermediate format, you can add this to the [alias] section of your ~/.gitconfig:

json = log --format='{ \"hashes\":{ \"commit\":\"%H\", \"tree\":\"%T\", \"parents\":\"%P\" }, \"author\":{ \"date\": \"%ai\", \"name\": \"%an\", \"email\":\"%ae\" }, \"committer\":{ \"date\": \"%ci\", \"name\": \"%cn\", \"email\":\"%ce\" } }'

This will give you one object per line, with no commas between them. It will not contain the commit message subject or body, since this output is not meant for humans to read (if you need it, add \"message\":\"%B\" at your own risk of dealing with \ns).

To get the JSON output serialised into an array, do this:

git --no-pager json | awk '
  BEGIN { print "[" }
  # print each object followed by a comma, except the last one
  NR > 1 { print prev "," }
  { prev = $0 }
  END { if (NR > 0) print prev; print "]" }'
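If the consumer is a script rather than another shell tool, the array step can be skipped, because each line the alias prints is already a complete JSON object. Here is a minimal Python sketch of that idea. It assumes the `json` alias from this post is installed and that no name or email contains a double quote or backslash (git does not escape them). `parse_json_lines` and `read_commits` are hypothetical helper names, not part of git.

```python
import json
import subprocess


def parse_json_lines(text):
    """Decode one JSON object per non-empty line into a list of dicts."""
    commits = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        commit = json.loads(line)
        # %P is a space-separated list of parent hashes; split it into a list.
        commit["hashes"]["parents"] = commit["hashes"]["parents"].split()
        commits.append(commit)
    return commits


def read_commits():
    """Run the alias in the current repository and parse its output.

    Assumes the `json` alias from this post is defined in ~/.gitconfig.
    """
    result = subprocess.run(
        ["git", "--no-pager", "json"],
        capture_output=True, text=True, check=True,
    )
    return parse_json_lines(result.stdout)
```

Calling `read_commits()` inside a repository should return a list of dictionaries shaped like the alias format, with `parents` already split into a list (empty for a root commit).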

The man page doesn't say whether there is a placeholder for metrics such as lines of code or files added/deleted, though there must be some hacky way of getting them out. Let me know if you do find one!
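One route that seems workable, offered as a rough sketch rather than a definitive answer: `git log --shortstat` prints a summary line such as ` 2 files changed, 5 insertions(+), 1 deletion(-)` after each commit that has changes, and that line can be parsed separately and joined back to the commit by hash. The Python below assumes the log was produced with `git --no-pager log --shortstat --format='commit:%H'`. The `commit:` prefix and the function name `parse_shortstat_log` are my own illustrative choices.

```python
import re

# Matches e.g. " 2 files changed, 5 insertions(+), 1 deletion(-)".
# The insertions and deletions parts are each optional in git's output.
STAT_RE = re.compile(
    r"(\d+) files? changed"
    r"(?:, (\d+) insertions?\(\+\))?"
    r"(?:, (\d+) deletions?\(-\))?"
)

MARKER = "commit:"  # hypothetical prefix, set via --format='commit:%H'


def parse_shortstat_log(text):
    """Return {commit_hash: {"files": n, "insertions": n, "deletions": n}}."""
    stats = {}
    current = None
    for line in text.splitlines():
        if line.startswith(MARKER):
            current = line[len(MARKER):].strip()
            # Default to zeros: some commits (e.g. merges) print no stat line.
            stats[current] = {"files": 0, "insertions": 0, "deletions": 0}
            continue
        match = STAT_RE.search(line)
        if match and current is not None:
            files, insertions, deletions = match.groups()
            stats[current] = {
                "files": int(files),
                "insertions": int(insertions or 0),
                "deletions": int(deletions or 0),
            }
    return stats
```

Keying the result by hash means a commit without a stat line simply keeps its zero counts instead of shifting every following record.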

4 Responses

That was a good take!

The most straightforward way to output stats (through --shortstat) is cumbersome, as the stats get printed on a new line. Since some commits don't have any stats, the line pattern breaks, making it difficult to parse the data reliably. To compensate for that, I had to use several bash commands besides git log (sed, tr, paste), just as you found it necessary to use awk...

My solution involves two scripts and is a bit too lengthy to post here, so I'm posting a link instead:

https://github.com/dreamyguy/gitlogg

Some of Gitlogg's features are:

  • Parse the git log of multiple repositories into one JSON file.
  • Introduced repository key/value.
  • Introduced files changed, insertions and deletions keys/values.
  • Introduced impact key/value, which represents the cumulative changes for the commit (insertions - deletions).
  • Sanitise double quotes " by converting them to single quotes ' on all values that allow or are created by user input, like subject.
  • Nearly all the pretty=format: placeholders are available.
  • Easily include / exclude which keys/values will be parsed to JSON by commenting out/uncommenting the available ones.
  • Easy to read code that's thoroughly commented.
  • Script execution feedback on console.
  • Error handling (since path to repositories needs to be set correctly).
over 1 year ago ·

You can use \"subject\":\"%s\" for the first line of the commit message and not have to handle newlines.
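A subject can still contain double quotes or backslashes, which would break the hand-written JSON. One way around that, sketched here under stated assumptions, is to have git separate fields with control characters via the `%x1f` and `%x1e` hex placeholders and leave the escaping to a JSON library. The format string, the field list and the name `records_to_json` are hypothetical choices for illustration.

```python
import json

FIELD_SEP = "\x1f"   # byte git emits for %x1f
RECORD_SEP = "\x1e"  # byte git emits for %x1e

# Hypothetical format for: git --no-pager log --format='%H%x1f%an%x1f%s%x1e'
FIELDS = ("commit", "author", "subject")


def records_to_json(raw):
    """Turn separator-delimited git log output into a JSON array string."""
    commits = []
    for record in raw.split(RECORD_SEP):
        # git ends each record with a newline after the %x1e byte.
        record = record.strip("\n")
        if not record:
            continue
        values = record.split(FIELD_SEP)
        commits.append(dict(zip(FIELDS, values)))
    # json.dumps escapes quotes, backslashes and control characters.
    return json.dumps(commits)
```

The separators are unlikely to occur in names or subjects, so no quote sanitising is needed on the git side.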

over 1 year ago ·

JC now has a git log parser that supports various pretty formats and --stat and --shortstat options.

https://kellyjonbrazil.github.io/jc/docs/parsers/git_log

over 1 year ago ·