Last Updated: February 25, 2016
· msaspence

Filter out Bot Activity for the public_activity Gem

public_activity is a great gem for automatically tracking activity on your model records, and Ryan Bates has just done a RailsCast on it. In a nutshell, public_activity allows you to automatically track creates, updates, and destroys of model records. You can also use it to track your own events using the `PublicActivity::Model#create_activity` method.

We're using this to track record views so that we can keep track of who's viewing what. That means the numbers can be skewed by bots visiting controller actions that trigger PublicActivity::Model#create_activity without authentication.

There are a number of gems that detect bots, but we've found them to be not so great, so we just maintain our own method on our ApplicationController. Ours looks like this, but you can substitute your bot detection method of choice:

def is_bot?(user_agent = nil)
  # Skip detection entirely in the test environment
  return false if Rails.env.test?
  # Fall back to the request's user agent; to_s guards against a missing header
  agent = (user_agent || request.user_agent).to_s
  !!agent.match(/asynchttpclient|alexa|bot|butterfly\/\d\.\d|crawl(er|ing)|crowsnest|curl|embedly|eventmachine|facebookexternalhit|feedburner|flipboardproxy|firephp|google web preview|(^java)|LongURL|MetaURI|nagios|(^NING\/\d\.\d$)|news\.me|peerindex|pingdom|postrank|(^python)|rockmeltembedservice|(^ruby$)|slurp|spider|viadeo|yahoo!|yandex/i)
end

It's fairly easy to maintain, but it does need reviewing from time to time.
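
Because the list needs periodic review, one option is to keep the patterns in an array and combine them with `Regexp.union`, so that adding or removing an entry is a one-line change. This is only a minimal sketch in plain Ruby, not what we actually run: the `BotDetector` module and its shortened pattern list are hypothetical names used for illustration.

```ruby
# Hypothetical helper, not part of public_activity or Rails.
module BotDetector
  # Shortened, illustrative subset of user-agent fragments; extend as needed.
  PATTERNS = [
    /bot/i,
    /crawl(er|ing)/i,
    /curl/i,
    /facebookexternalhit/i,
    /pingdom/i,
    /slurp/i,
    /spider/i,
    /\Aruby\z/i
  ].freeze

  # Regexp.union combines the individual patterns into a single alternation.
  BOT_REGEX = Regexp.union(PATTERNS)

  # Returns true when the user agent matches any pattern.
  # A nil or empty user agent is treated as "not a bot" here; that is a
  # design choice you may want to reverse.
  def self.bot?(user_agent)
    !!(user_agent.to_s =~ BOT_REGEX)
  end
end

puts BotDetector.bot?("Googlebot/2.1 (+http://www.google.com/bot.html)")
puts BotDetector.bot?("Mozilla/5.0 (Windows NT 10.0) Chrome/120.0 Safari/537.36")
```

Keeping the patterns separate also makes it easier to unit test individual entries that turn out to be too broad.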

Once we have that in place, we need to override PublicActivity::Model#public_activity_enabled? so that it also calls our #is_bot? method and only tracks when the result is false. We could monkey patch PublicActivity::Model, but personally I prefer to implement my own module that includes it. Apart from avoiding a monkey patch, this lets me define custom field defaults across my models and define a hook to push activities to Mixpanel. It looks like this:

module ActiveRecord
  module EventTracking

    extend ActiveSupport::Concern
    include PublicActivity::Model

    # Override to only track if it isn't a bot
    def public_activity_enabled?
      PublicActivity.enabled? && (!PublicActivity.get_controller || !PublicActivity.get_controller.is_bot?)
    end

  end
end

First we check that we have a controller, as it's entirely possible that our model is instantiated outside the context of a controller (model unit tests, for example). Then we ask the controller whether the current request is from a bot.
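
The guard's cases can be tried in isolation with a plain-Ruby sketch. Everything here is a stand-in: `FakeController` and `tracking_enabled?` are hypothetical names that mimic `PublicActivity.get_controller` and the override above, without loading Rails or the gem.

```ruby
# Hypothetical stand-in for a controller that responds to is_bot?.
FakeController = Struct.new(:bot) do
  def is_bot?
    bot
  end
end

# Mirrors the boolean logic of the override: track when tracking is globally
# enabled and there is either no controller or the controller reports a
# non-bot request.
def tracking_enabled?(globally_enabled, controller)
  globally_enabled && (!controller || !controller.is_bot?)
end

puts tracking_enabled?(true, nil)                         # no controller, e.g. a model unit test
puts tracking_enabled?(true, FakeController.new(false))   # regular visitor
puts tracking_enabled?(true, FakeController.new(true))    # bot request
puts tracking_enabled?(false, FakeController.new(false))  # tracking switched off globally
```

Only the first two calls report tracking as enabled; a bot request or a global switch-off suppresses the activity.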

Then all that remains is to include ActiveRecord::EventTracking instead of PublicActivity::Model on your ActiveRecord classes.

I will write up later how I push activities to Mixpanel and define default public_activity values across multiple model classes.

1 Response

This is a fairly aggressive regex. I ran it against my order history and had numerous orders from real world people come through with Alexatoolbar and 360Spider in the user agents. Also, you might want to add more terms to catch RSS and ATOM readers, such as feed fetcher and Apple-PubSub.

over 1 year ago