muvy9w
Last Updated: October 30, 2017
·
3.578K
· jjmars
Jm

Forget counter cache for many-to-many associations in Rails

I like counter cache in Active Record.

Counting associations represents a performance improvement by avoiding eager loading and N+1 queries for listings -- e.g. the index page for Posts that shows how many Comments each Post has.

However I've had a tough time trying to use counter cache in many-to-many associations, both with has_many through and has_and_belongs_to_many. After trying for a while, I've come to the conclusion that it is best not to use counter cache for these types of associations. It is terrible for performance because it generates several SQL queries that could be cut down to just 4 or less queries.

I'll explain with examples.

The scenario

Let's say we have posts with tags through taggings. Posts keep the count of their tags:

class Tagging
  # FIELDS: post_id, tag_id
  belongs_to :tag
  belongs_to :post, counter_cache: :tags_count # updates tags_count in Post
end

class Tag
  # FIELDS: title
  has_many :taggings
  has_many :posts, through: :taggings, dependent: :destroy
end

class Post
  # FIELDS: content, tags_count
  has_many :taggings
  has_many :tags, through: :taggings, dependent: :destroy
end

Note that dependent: :destroy will destroy taggings, not tags or posts. Now the tags_count field will be automatically updated any time a post is created, updated or deleted.

Bad news: multiple SQL updates

Counter cache works by performing an SQL update each time a new association is created or deleted.

When a new post is created with a few tags (actually 99 tags for the sake of example), Active Record will perform multiple updates:

INSERT INTO posts (content) VALUES ("Lorem ipsum")

INSERT INTO taggings (post_id, tag_id) VALUES (1, 1);
UPDATE posts SET tags_count = tags_count + 1 WHERE posts.id = 1;

INSERT INTO taggings (post_id, tag_id) VALUES (1, 2);
UPDATE posts SET tags_count = tags_count + 1 WHERE posts.id = 1;

...

INSERT INTO taggings (post_id, tag_id) VALUES (1, 99);
UPDATE posts SET tags_count = tags_count + 1 WHERE posts.id = 1;

-- ======================
-- TOTAL QUERIES: 2×N + 1
-- ======================

When destroying a single tag that is associated with many (99) posts:

SELECT * FROM posts INNER JOIN posts_tags ON posts.id = posts_tags.post_id WHERE posts_tags.tag_id = 1

DELETE FROM posts_tags WHERE posts_tags.tag_id = 1 AND posts_tags.post_id = 1
UPDATE posts SET tags_count = tags_count - 1 WHERE posts.id = 1;

DELETE FROM posts_tags WHERE posts_tags.tag_id = 1 AND posts_tags.post_id = 2
UPDATE posts SET tags_count = tags_count - 1 WHERE posts.id = 2;

...

DELETE FROM posts_tags WHERE posts_tags.tag_id = 1 AND posts_tags.post_id = 99
UPDATE posts SET tags_count = tags_count - 1 WHERE posts.id = 99;

DELETE FROM tags WHERE tags.id = 1

-- ======================
-- TOTAL QUERIES: 2×N + 2
-- ======================

This is clearly detrimental for objects with dozens or hundreds of associations.

Replace counter cache with callbacks

I've decided to quit the automatic counter cache for many-to-many associations. Let's use callbacks instead:

class Post
  # FIELDS: content, total_tags
  has_and_belongs_to_many :tags
  before_save :update_total_tags

  def update_total_tags
    self.total_tags = tag_ids.count
  end
end

class Tag
  # FIELDS: title
  has_and_belongs_to_many :posts
  before_destroy :update_posts

  def update_posts
    Post.where(id: post_ids).update_all('total_tags = total_tags - 1')
  end
end

Note that I replaced the taggings table with a posts_tags join table. I also replaced tags_count with total_tags to avoid automatic counter caching due to naming conventions. This automatic behavior is thankfully deprecated and will be gone in Rails 5.

By using before_save we make sure to set the total count before the INSERT statement (thus avoiding a late UPDATE). By counting tag_ids we avoid an additional COUNT query on tags. Let's see how the same examples presented before perform now.

When the user creates a post with many tags:

INSERT INTO posts (content, total_tags) VALUES ("Lorem ipsum", 99)
INSERT INTO posts_tags (post_id, tag_id) VALUES (1, 1);
INSERT INTO posts_tags (post_id, tag_id) VALUES (1, 2);
...
INSERT INTO posts_tags (post_id, tag_id) VALUES (1, 99);
-- ======================
-- TOTAL QUERIES: N + 1
-- ======================

IMPROVE ALERT: this could be further reduced to just 2 queries by bulk-inserting into posts_tags. If anyone out there knows how to do that in Active Record, please comment.

When destroying a single tag associated with many posts:

SELECT posts.id FROM posts INNER JOIN posts_tags ON posts.id = posts_tags.post_id WHERE posts_tags.tag_id = 1
UPDATE posts SET total_tags = total_tags - 1 WHERE posts.id IN (1, 2, ..., 99)
DELETE FROM posts_tags WHERE posts_tags.tag_id = 1
DELETE FROM tags WHERE tags.id = 1
-- ======================
-- TOTAL QUERIES: 4
-- ======================

Conclusion

Please DO use counter cache for one-to-many associations only. It's perfect for these situations!

But after all this it seems pretty obvious that the built-in counter cache feature is not suitable for many-to-many associations (maybe this was already obvious to many developers out there). I decided to write this post anyway to advice others who are considering going down that road. I hope I convinced you guys otherwise :)

Say Thanks
Respond