Last Updated: February 25, 2016

Mongolian Summary

Lately we've been moving our data processing towards a more stream-based approach. Unfortunately, some of the data we need consists of collection-wide aggregates like min/max/sum/mean.

Running the aggregates through the Mongo aggregation framework or map/reduce is just too slow for us, and frankly seems like a real waste. The obvious solution is to maintain a summary table, but doing that on a per-case basis is a bit of a PITA.

So I wrote Mongolian Summary (https://npmjs.org/package/mongosum), a wrapper around Mongolian Deadbeef (https://npmjs.org/package/mongolian) that automatically handles the painful business of maintaining that sort of aggregate on your data.

It's a rudimentary tool for now, but we're using it in (almost) production code, so we can add features as we need them. First on the list, I think, is summarizing only selected collections. Also, adding tests!

Basically, whenever data is inserted/updated/deleted, this module tracks the types of all fields and the sum, min, and max of the numeric fields. This way, there is no cost (on retrieval) for getting this data. Just call collection.getSummary(callback) and you have the data!
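To make the idea concrete, here is a minimal sketch of the bookkeeping on insert, in plain JavaScript with no database involved. This is not the mongosum source or its API: updateSummary, mean, and the shape of the summary object are hypothetical names made up for illustration.

```javascript
// Hypothetical helper (not part of mongosum): fold one document into a
// running summary object keyed by field name.
function updateSummary(summary, doc) {
  for (const [field, value] of Object.entries(doc)) {
    // Record the type the first time a field is seen.
    const entry = summary[field] || (summary[field] = { type: typeof value });
    // Only numeric fields get sum/min/max tracking.
    if (typeof value !== "number" || Number.isNaN(value)) continue;
    entry.count = (entry.count || 0) + 1;
    entry.sum = (entry.sum || 0) + value;
    entry.min = entry.min === undefined ? value : Math.min(entry.min, value);
    entry.max = entry.max === undefined ? value : Math.max(entry.max, value);
  }
  return summary;
}

// Hypothetical helper: the mean falls out of the stored sum and count.
function mean(summary, field) {
  const entry = summary[field];
  return entry && entry.count ? entry.sum / entry.count : undefined;
}

// Example: summarize three documents as they are "inserted".
const summary = {};
const docs = [
  { name: "a", price: 4 },
  { name: "b", price: 10 },
  { name: "c", price: 1 },
];
docs.forEach((doc) => updateSummary(summary, doc));

console.log(summary.price);          // running count, sum, min, max for "price"
console.log(mean(summary, "price")); // sum / count
```

Each insert costs a constant amount of work per field, and reading the summary never touches the collection. Deletes and updates are trickier in a sketch like this: sum and count can be adjusted in place, but removing the current min or max would generally require looking at the remaining documents.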

So, download today and get "free" sum, min, max, and mean on every numerical field in your collection!