Deserializing Complex Data from Redis with Ruby
I recently used Redis to memoize some heavy SQL results. The challenge was serializing/deserializing the data while keeping the Ruby code clean and readable.
A quick note about Redis
Redis is a data store that can be used to cache complex or costly data. It can be used to store data that is accessible across multiple threads and can even survive system restarts. For instance, my app has a dashboard that shows a pie chart of how many Accounts are in each status (such as invited, confirmed, banned, etc.). There are over 34 million records. Counting is a trivial ActiveRecord task, but it's also expensive. There are 6 groups total, taking roughly 6 seconds each. Blocking the web thread for over half a minute each time the dashboard is loaded or refreshed is a recipe for DDoS. So the counting is put into a worker, the results are "set" in Redis by the worker, and the web thread can "get" them when needed.
Redis returns nil if the key you are looking for is empty or hasn't been set yet. Fine, I can deal.
Redis only stores string values. Fine, I can... wait a minute. I have an array of hashes I need to store.
Turns out the easiest way (for my purposes) to serialize/deserialize it is to and from JSON. So to store it in Redis, I do this:
$redis.set('account-stats', results.to_json)
...where results is the array of hashes created elsewhere by ActiveRecord.
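One thing worth knowing before storing data this way: a JSON round trip does not hand back exactly the same Ruby objects. Here is a minimal sketch, assuming results is an array of hashes with symbol keys. The sample rows are made up for illustration, and no Redis connection is needed to see the effect:

```ruby
require 'json'

# Hypothetical sample standing in for the ActiveRecord results.
results = [
  { status: :invited, count: 120 },
  { status: :confirmed, count: 4500 }
]

# to_json produces a String, which is what $redis.set expects.
payload = results.to_json

# By default, hash keys (and symbol values) come back as Strings.
restored = JSON.parse(payload)

# symbolize_names: true turns the keys back into Symbols; values stay Strings.
symbolized = JSON.parse(payload, symbolize_names: true)
```

So any code that reads the cached value should expect string keys, or parse with symbolize_names: true if it was written against symbol keys.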
To get it back out:
results = $redis.get('account-stats') # NOPE!
As mentioned, Redis stores strings, so results is now a string, not an array. So we can do this:
results = JSON.parse($redis.get('account-stats')) # NOPE!
The problem here is that, as mentioned, Redis could return nil, and JSON.parse raises a TypeError ("no implicit conversion of nil into String") when handed nil. So try:
results = $redis.get('account-stats')
results = JSON.parse(results) if results
Now we're getting somewhere.
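To see both behaviours without a running server, here is a small sketch. FakeStore is a hypothetical stand-in for $redis (it is not part of any gem); it only mimics the fact that get returns nil for a key that has not been set and that values are stored as strings:

```ruby
require 'json'

# Hypothetical in-memory stand-in for $redis, for illustration only.
class FakeStore
  def initialize
    @data = {}
  end

  # Like Redis, returns nil when the key has never been set.
  def get(key)
    @data[key]
  end

  def set(key, value)
    @data[key] = value.to_s
    'OK'
  end
end

store = FakeStore.new

# Parsing a missing key directly raises TypeError.
error = begin
  JSON.parse(store.get('account-stats'))
  nil
rescue TypeError => e
  e
end

# The guarded version simply leaves the value as nil.
missing = store.get('account-stats')
missing = JSON.parse(missing) if missing

# Once the key is set, the same two lines return parsed data.
store.set('account-stats', [{ 'status' => 'invited', 'count' => 1 }].to_json)
found = store.get('account-stats')
found = JSON.parse(found) if found
```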
Circuit Breaker and Memoization
It looks like this now:
def account_statistics
  results = $redis.get('account-stats')
  results = JSON.parse(results) if results
  unless results
    # DO A BUNCH OF WORK IN HERE AND SET results
    $redis.set('account-stats', results.to_json)
  end
  results
end
However, we sure are checking the value of results in a lot of places. I like to use the circuit breaker pattern when memoizing values this way (it is more commonly known as a guard clause or early return): if the method can return early, it should, rather than branching over the rest of the method. It's one facet of the "Tell, Don't Ask" principle, which aims to reduce code branching, measured technically as "cyclomatic complexity". To implement it here, we just return immediately after setting results if it was set. We'll refactor to this:
def account_statistics
  results = $redis.get('account-stats')
  results = JSON.parse(results) if results
  return results if results # <<== circuit breaker eliminates an if block
  # DO A BUNCH OF WORK IN HERE AND SET results
  $redis.set('account-stats', results.to_json)
  results # return the data, not the "OK" that set returns
end
That's better, but we're still checking results in a lot of places. It also looks pretty ridiculous, like we're combining two methods into one: the first checks and returns from Redis and, failing that, the second does a bunch of ActiveRecord counting and storing. My criterion for employing the circuit breaker is to use it only if it can return after a single line.
The Refactor Game
So here we are, at the raggedy edge... and this is where things go to ruby-geekiness level-11. Let's blow it up in an intermediate refactor so we can see all the working parts:
def account_statistics
  results = (results = $redis.get('account-stats') and JSON.parse(results))
  return results if results
  # DO A BUNCH OF WORK IN HERE AND SET results
  $redis.set('account-stats', results.to_json)
  results
end
Whoa! That doesn't even read right. Can you even do that in Ruby? Well, yes, but that question is exactly why I don't like going to 11. However, let's break this down and see how just the first line works. The parentheses matter: "and" binds more loosely than "=", so without them the outer results would just receive the raw string and the parsed value would be thrown away. Let's put brackets around the order of execution:
[results = [[results = $redis.get('account-stats')] and JSON.parse(results)]]
You can see the first part executed is:
results = $redis.get('account-stats')
That should make sense. The second part executed is:
and JSON.parse(results)
If you are not familiar with the differences between "and" and "&&" in Ruby, I encourage you to search the web. They are subtle differences but worth noting and keeping handy in your mental toolbox. The one that matters here is precedence: "&&" binds more tightly than "=", while "and" binds more loosely. In short, we've taken the results we got from Redis, and if not nil, we parse them out of JSON and into POR (plain-ol-ruby) objects. The final part is:
results = ...
Which should be obvious. Here it is again:
results = (results = $redis.get('account-stats') and JSON.parse(results))
return results if results
We're "checking" results in only one place but we're setting it, what, twice? And in the same line?
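Since so much here hinges on how "and" and "&&" differ, a short sketch of the precedence difference may help. The raw string below is a made-up stand-in for what Redis would return; nothing else is assumed:

```ruby
require 'json'

# Made-up stand-in for a value fetched from Redis.
raw = '[{"status":"invited","count":1}]'

# && binds tighter than =, so the whole && expression is assigned.
a = raw && JSON.parse(raw)      # parsed as: a = (raw && JSON.parse(raw))

# and binds looser than =, so the assignment happens first and the
# parsed value on the right is evaluated but discarded.
b = raw and JSON.parse(raw)     # parsed as: (b = raw) and JSON.parse(raw)

# Parentheses make the and expression the assigned value.
c = (raw and JSON.parse(raw))

# With nil on the left, both operators short-circuit and never call JSON.parse.
missing = nil
d = missing && JSON.parse(missing)
```

Here a and c end up as the parsed array, b is still the raw string, and d is nil.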
Back to Sanity
Now that we know what's happening at 11, we can make one more refactor to dial it back closer to 10:
results = $redis.get('account-stats') and return JSON.parse(results)
We know the second part won't be executed if the first part evaluates to nil, so why not move the return from the following line up here instead? It also eliminates that left-most assignment. We made it a bit more readable, at least for some. In English, it says, "If Redis has a value for 'account-stats', return the parsed value; otherwise skip over this bit." Essentially, we've memoized the results for a taxing set of queries. Here is the final refactor in its entirety:
def account_statistics
  results = $redis.get('account-stats') and return JSON.parse(results)
  # DO A BUNCH OF WORK IN HERE AND SET results
  $redis.set('account-stats', results.to_json)
  results # return the data, not the "OK" that set returns
end
As a matter of course, results will be nil if Redis hasn't seen that value, but we don't even need to check it. We've implemented the "Tell, Don't Ask" pattern and almost completely eliminated any branching.
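For anyone who wants to watch the memoization happen without a Redis server, here is a runnable sketch of the same shape. Everything outside account_statistics is an assumption made for illustration: InMemoryStore is a hypothetical stand-in for $redis, count_accounts_by_status is a placeholder for the ActiveRecord counting, and the numbers are invented.

```ruby
require 'json'

# Hypothetical stand-in for the $redis client; get returns nil for
# unknown keys and values are kept as strings, as with Redis.
class InMemoryStore
  def initialize
    @data = {}
  end

  def get(key)
    @data[key]
  end

  def set(key, value)
    @data[key] = value.to_s
    'OK'
  end
end

$store = InMemoryStore.new
$expensive_calls = 0

# Placeholder for the slow counting queries; the figures are made up.
def count_accounts_by_status
  $expensive_calls += 1
  [{ 'status' => 'invited', 'count' => 10 },
   { 'status' => 'confirmed', 'count' => 90 }]
end

def account_statistics
  results = $store.get('account-stats') and return JSON.parse(results)
  results = count_accounts_by_status
  $store.set('account-stats', results.to_json)
  results # hand back the data rather than the "OK" from set
end

first = account_statistics   # cache miss: does the work and stores it
second = account_statistics  # cache hit: parsed straight from the store
```

Both calls return the same data, and the counter shows the expensive work ran only once.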