Last Updated: February 25, 2016
·
3.344K
· goranhalusa

Random Rows From MySQL

It’s tempting to simply use MySQL’s RAND() function to select random records from a table. Consider the following query:

SELECT FLOOR(7 + (RAND() * 5));

However, when considering overhead, this is a rather expensive method to employ. This becomes glaringly apparent when dealing with large datasets.

One effective solution is to select the first and last ids, and let PHP generate the random integer (id). Then, simply check to see if the record exists using the random integer. If it exists, then add the record to an array of records.

public function getRandomRecords()
{
  // Select the first id in the target table
  $statement = $this->db->prepare("SELECT some_id
  FROM table_name
  ORDER BY some_id ASC LIMIT 1");
  $statement->execute();
  $lowest_id = $statement->fetch(PDO::FETCH_ASSOC);

  // Select the last id in the target table
  $statement = $this->db->prepare("SELECT some_id
  FROM table_name
  ORDER BY some_id DESC LIMIT 1");
  $statement->execute();
  $highest_id = $statement->fetch(PDO::FETCH_ASSOC);

  $records_array = array();

  while(true)
  {
    // Generate a random integer
    $random_id = rand( $lowest_id['some_id'], $highest_id['some_id'] );

    // Check to see if the record exists
    $statement = $this->db->prepare("SELECT col_one, col_two, etc...
    FROM table_name
    WHERE some_id = {$random_id}";
    $statement->execute();
    $result = $statement->fetchAll(PDO::FETCH_ASSOC);

    // If it exists, add it to the array
    if($result) {
      $records_array[] = $record;
    }

    $i++;

    // If the array contains 5 records, stop
    if(count($records_array) == 5) {
      break;
    }
  }

  return $records_array;
}

4 Responses
Add your response

Full table dataset from start to finish? I don't think so. I'm using 'LIMIT 1' on the first two. On the last one, I'm checking to see if one id exists.

However, you may be right in saying I'm over thinking it. I like your idea of pulling it all into an array and using shuffle(). On the other hand, having a large amount of data (e.g 1,000,000 rows) in a PHP array is memory intensive. The point here is, as I've stated, "... when dealing with large datasets." I've built large arrays in PHP, and it's not pretty. Sooner or later, your buffer fills up.

Think about it before posting a semi-ignorant and somewhat rude comment.

over 1 year ago ·

@goranhalusa Looking back over it now, it's actually not that bad of a solution. I don't try to make excuses. You were right, and I was wrong. But I certainly wasn't trying to be rude.

over 1 year ago ·

It's ok David, no offense taken. I've been programming too long to sweat the small stuff.

One note:

You must always take into consideration that the id of the first record may not be 1, and that there may be deletions. Not all scenarios offer table id entries which are auto-incremented values in pure, strict sequential order.

over 1 year ago ·

Hey guys, I missed the point of needing the max and min table id, but if so, why didn't you use 'SELECT MAX(id), MIN(id) from table' ?
Any how to get 5 random rows
'SELECT *, rand() as rand FROM table_name
ORDER BY rand
LIMIT 5'

over 1 year ago ·