Filter out Bot Activity for the public_activity Gem
public_activity is a great gem for automatically tracking activity on your model records, and Ryan Bates has just done a RailsCast on it. In a nutshell, public_activity automatically tracks creates, updates, and destroys of model records. You can also use it to track your own events using the `PublicActivity::Model#create_activity` method.
We're using this to track record views so that we can keep track of who's viewing what, which means the numbers can be skewed by bots visiting controller actions that trigger `PublicActivity::Model#create_activity` without authentication.
There are a number of gems that detect bots, but we've found them to be not so great, so we just maintain our own method on our `ApplicationController`. Ours looks like this, but you can substitute your bot detection method of choice:
    def is_bot?(user_agent = nil)
      # Never flag requests in the test environment. A missing user agent is
      # treated as an empty string so that #match is never called on nil.
      !Rails.env.test? && (user_agent || request.user_agent).to_s.match(/asynchttpclient|alexa|bot|butterfly\/\d\.\d|crawl(er|ing)|crowsnest|curl|embedly|eventmachine|facebookexternalhit|feedburner|flipboardproxy|firephp|google web preview|(^java)|LongURL|MetaURI|nagios|(^NING\/\d\.\d$)|news\.me|peerindex|pingdom|postrank|(^python)|rockmeltembedservice|(^ruby$)|slurp|spider|viadeo|yahoo!|yandex/i)
    end
It's fairly easy to maintain but does need reviewing from time to time.
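One way to make those reviews easier is to keep a handful of sample user agents next to the pattern and re-check them whenever it changes. Here is a minimal sketch in plain Ruby, outside Rails. `BOT_PATTERN` is a deliberately shortened, hypothetical subset of the regex above, and `bot_user_agent?`, `SAMPLES`, and the sample strings are illustrative names rather than anything from our controller:

```ruby
# Hypothetical, shortened subset of the full pattern, for illustration only.
BOT_PATTERN = /bot|crawl(er|ing)|curl|slurp|spider|(^ruby$)/i

# Returns true when the user agent string looks like a bot.
# A nil user agent is treated as an empty string, so it is not flagged.
def bot_user_agent?(user_agent)
  !!(user_agent.to_s =~ BOT_PATTERN)
end

# Sample user agents paired with the result we expect for each.
SAMPLES = {
  "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" => true,
  "curl/7.30.0" => true,
  "Ruby" => true,
  "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0" => false,
  nil => false
}

# Report any sample whose result no longer matches the expectation.
SAMPLES.each do |user_agent, expected|
  actual = bot_user_agent?(user_agent)
  status = actual == expected ? "ok" : "MISMATCH"
  puts "#{status}: #{user_agent.inspect} => #{actual}"
end
```

Adding real user agents from your own logs to the sample list, both bots and genuine visitors, should make it more obvious when a new term starts catching real people.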
Once we have that in place, we basically need to override `PublicActivity::Model#public_activity_enabled?` to also call our `#is_bot?` method and ensure the result is false. We could monkey patch `PublicActivity::Model`, but personally I prefer to implement my own module that includes it. Apart from avoiding a monkey patch, this lets me define custom field defaults across my models and define a hook to push activities to Mixpanel. It looks like this:
    module ActiveRecord
      module EventTracking
        extend ActiveSupport::Concern
        include PublicActivity::Model

        # Override to only track if it isn't a bot
        def public_activity_enabled?
          PublicActivity.enabled? && (!PublicActivity.get_controller || !PublicActivity.get_controller.is_bot?)
        end
      end
    end
First we check that we have a controller, as it's entirely possible that our model is instantiated outside the context of a controller (in model unit tests, for example). Then we ask the controller whether the current request is from a bot.
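The guard's behaviour can be sketched in plain Ruby without Rails or the gem. Everything here is a stand-in for illustration: `FakeController` and `tracking_enabled?` are hypothetical names, and the `enabled` and `controller` arguments take the place of `PublicActivity.enabled?` and `PublicActivity.get_controller`.

```ruby
# Hypothetical stand-in for a controller that exposes #is_bot?.
FakeController = Struct.new(:bot) do
  def is_bot?
    bot
  end
end

# Mirrors the condition in public_activity_enabled?: track only when tracking
# is switched on and there is either no controller or a non-bot request.
def tracking_enabled?(enabled, controller)
  enabled && (!controller || !controller.is_bot?)
end

puts tracking_enabled?(true, nil)                         # no controller, e.g. a model unit test
puts tracking_enabled?(true, FakeController.new(false))   # normal visitor
puts tracking_enabled?(true, FakeController.new(true))    # bot request
puts tracking_enabled?(false, FakeController.new(false))  # tracking switched off
```

This prints `true`, `true`, `false`, `false`: activity is recorded when there is no controller or the visitor is not a bot, and skipped for bots or when tracking is disabled.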
Then all that remains is to include `ActiveRecord::EventTracking` instead of `PublicActivity::Model` on your ActiveRecord classes.
I will write up later how I push activities to Mixpanel and define default public_activity values across multiple model classes.
Written by Matthew Spence
1 Response
This is a fairly aggressive regex. I ran it against my order history and had numerous orders from real world people come through with Alexatoolbar and 360Spider in the user agents. Also, you might want to add more terms to catch RSS and ATOM readers, such as feed fetcher and Apple-PubSub.