Last Updated: February 25, 2016

ActiveRecord empty? vs count

During a discussion with a coworker about optimization of queries over rails, I need to prove my theory that count is faster than empty? because on the first method active record its making directly over sql and in the second is verifying the length of the given array by the result.

My Enviroment

ruby 1.9.3-p448
rails 3.2.6

So in order to prove it I've made the next benchmark

Code

Benchmark.bm do |bm|

  bm.report("count") do
    10.times{ User.where('id>0').count<0 }  
  end

  bm.report("empty?") do
    10.times{ User.where('id>0').empty? }
  end

end

here the gist

Console

And my surprise was that both queries use COUNT(*)

count

[DEBUG]    (0.6ms)  SELECT COUNT(*) FROM "users" WHERE "users"."deleted_at" IS NULL AND (id > 0)
[DEBUG]    (0.6ms)  SELECT COUNT(*) FROM "users" WHERE "users"."deleted_at" IS NULL AND (id > 0)
[DEBUG]    (0.4ms)  SELECT COUNT(*) FROM "users" WHERE "users"."deleted_at" IS NULL AND (id > 0)
[DEBUG]    (0.5ms)  SELECT COUNT(*) FROM "users" WHERE "users"."deleted_at" IS NULL AND (id > 0)
[DEBUG]    (0.5ms)  SELECT COUNT(*) FROM "users" WHERE "users"."deleted_at" IS NULL AND (id > 0)
[DEBUG]    (0.4ms)  SELECT COUNT(*) FROM "users" WHERE "users"."deleted_at" IS NULL AND (id > 0)
[DEBUG]    (0.4ms)  SELECT COUNT(*) FROM "users" WHERE "users"."deleted_at" IS NULL AND (id > 0)
[DEBUG]    (0.4ms)  SELECT COUNT(*) FROM "users" WHERE "users"."deleted_at" IS NULL AND (id > 0)
[DEBUG]    (0.4ms)  SELECT COUNT(*) FROM "users" WHERE "users"."deleted_at" IS NULL AND (id > 0)
[DEBUG]    (0.4ms)  SELECT COUNT(*) FROM "users" WHERE "users"."deleted_at" IS NULL AND (id > 0)

empty?

[DEBUG]    (0.4ms)  SELECT COUNT(*) FROM "users" WHERE "users"."deleted_at" IS NULL AND (id>0)
[DEBUG]    (0.5ms)  SELECT COUNT(*) FROM "users" WHERE "users"."deleted_at" IS NULL AND (id>0)
[DEBUG]    (0.4ms)  SELECT COUNT(*) FROM "users" WHERE "users"."deleted_at" IS NULL AND (id>0)
[DEBUG]    (0.3ms)  SELECT COUNT(*) FROM "users" WHERE "users"."deleted_at" IS NULL AND (id>0)
[DEBUG]    (0.4ms)  SELECT COUNT(*) FROM "users" WHERE "users"."deleted_at" IS NULL AND (id>0)
[DEBUG]    (0.3ms)  SELECT COUNT(*) FROM "users" WHERE "users"."deleted_at" IS NULL AND (id>0)
[DEBUG]    (0.4ms)  SELECT COUNT(*) FROM "users" WHERE "users"."deleted_at" IS NULL AND (id>0)
[DEBUG]    (0.3ms)  SELECT COUNT(*) FROM "users" WHERE "users"."deleted_at" IS NULL AND (id>0)
[DEBUG]    (0.3ms)  SELECT COUNT(*) FROM "users" WHERE "users"."deleted_at" IS NULL AND (id>0)
[DEBUG]    (0.4ms)  SELECT COUNT(*) FROM "users" WHERE "users"."deleted_at" IS NULL AND (id>0)

Benchmark Result

          user     system      total      real
count   0.020000   0.000000   0.020000 (  0.023109) 
empty? 0.020000   0.000000   0.020000 (  0.018685)

Conclusion

Empty? method is a little bit faster than count, so don't be afraid to use it, specially if you call it several times. My concern and I hope that someone can answer why is faster and why is calling COUNT(*) because it seems to me that ActiveRecord is doing all the magic.

Update 29/07/2013

In order to confirm my doubts I run the same tests over the stage database and with a large sample for 1000 times. And besides the slight differences empty? still beating count on real time.

          user     system      total      real
count   2.370000   0.100000   2.470000 ( 24.989100)
empty? 2.470000   0.090000   2.560000 ( 24.867949)

#rails

#activerecord

#benchmark

Written by Luis Ezcurdia

Say Thanks

Respond

Related protips

Rails 4: How to partials & AJAX, dead easy

193.9K

Ruby on Rails 4 - Authentication with Facebook and OmniAuth.

101K

Open a rails form with Twitter Bootstrap modals

88.34K

6 Responses

Add your response

job

It'd be nice to know why .empty? is faster. Maybe try increasing your sample size and averaging the result for a more convincing result.

It's always nice to know more about optimisation of AR queries!

over 1 year ago ·

mariochavez

I ran same benchmark few more times (1000) the SQL query was the same, both issue a COUNT(*) on the users table. My setup was Ruby 2.0.0-p195 with Rails 4.0.0 on a Postgresql DB, I was running a Rails console with production env.

Benchmark.bm do |bm|
  bm.report("count") do
    1000.times{ Identity.where('id>0').count }
  end
  bm.report("empty?") do
    1000.times{ Identity.where('id>0').empty? }
  end
end

The difference was really minimal like:

       user     system      total        real
count  0.280000   0.030000   0.310000 (  0.437411)
empty?  0.280000   0.030000   0.310000 (  0.428984)

But after a few runs count was faster then empty? was:

       user     system      total        real
count  0.290000   0.030000   0.320000 (  0.427957)
empty?  0.310000   0.020000   0.330000 (  0.454032)

So I don't think that the difference between them are related to the way they are coded on Rails, but to the database or the system at all.

count is a method on ActiveRecord::Calculations that issue the COUNT query to the database.

https://github.com/rails/rails/blob/master/activerecord/lib/active_record/relation/calculations.rb#L22

empty? is a method on ActiveRecord::Relation, it first check if the relation has been loaded from the database, if it was then #empty? is evaluated against it, there is no query sent to the database; if it wasn't loaded then, guess what? it sends #count message to ActiveRecord::Calculation.

https://github.com/rails/rails/blob/master/activerecord/lib/active_record/relation.rb#L244

So at the end if #empty? needs to go to the database it goes thought a #count.

over 1 year ago ·

miguelmadero

Mario, answering one of your questions, And my surprise was that both queries use COUNT(*), empty? will only do that if you haven't loaded the results.

This is the implementation:

def empty?
  return @records.empty? if loaded?

  c = count
  c.respond_to?(:zero?) ? c.zero? : c.empty?
end

This is good to know, because you could preload them if you know you would use them, then empty? would be faster because it won't result in a DB query. Something else that's important to consider is caching this if you plan to use it more than once per request to minimize the DB calls.

over 1 year ago ·

miguelmadero

Ooops, I just read your last comment and I noticed you found that answer.

over 1 year ago ·

mariochavez

Yes, I did it :)

over 1 year ago ·