MySQL Query Optimization
Many PHP programmers who I have come across are unsure on how to optimize their MySQL queries.
I'm just going to provide a few tips on how to eliminate most causes of slow queries (on basic queries, I will go into more complex queries in another protip)
Always provide exactly what you want to retrieve back from the query, Never just select everything (*)
The reason for this is that if you don't use every bit of information which you are pulling back then it is wasted time and resources on doing this, another reason for doing this is to prevent the DB engine from enumerating the fields itself. It will know exactly what to get from what you have provided.The next thing is (where applicable) always limit your query results to a small set if you can. As you can see in the following example, you will only ever want one row returned as there should only ever be one user with the ID of 1. This again stops MySQL (in this case) from pulling back more data than needed (this is not the case in the below example as, again, there will only ever been one user with the ID of 1)
Unoptimized:
SELECT *
FROM users
WHERE users.id = 1;
Optimized
SELECT username, password, email, first_name, last_name
FROM users
WHERE users.id = 1
LIMIT 1;
I will get into the more complex queries, such as Joins and sub queries next protip.
Written by Richard Clifford
Related protips
3 Responses
The first optimization feels potentially counter-productive. How much time and resources are REALLY being saved by selecting exactly what you need and not using *?
Compare this to the tradeoff: not using * introduces development overhead. If a column is changed, added, or removed, the query will have to be rewritten.
If you're developing for the web, then there are several other performance enhancements I would try before "selectively selecting." Eager loading, query caching, view caching (I come from a Rails background).
However, if you're writing some sort of background process that is fetching lots of data and needs a performance boost, I can see the value of foregoing SELECT *.
@smidwap A database schema once setup correctly shouldn't need to be changed, and if it does then you should be updating the version of the software which you are 'releasing', which means you should update the code to a newer version too.
It is a well known performance issue to not use * as it takes MySQL a while enumerate which columns are in that table and if you are not using all of the data that it brings back then it is wasted time for the server to process.
If I had a table with the following columns in it:
id, username, password, email, sex, dob, city, country, avatar
and I just wanted to check whether the username exists in the database, there would be no need for me to select * when I could just select username/id.
I couldn't agree with you more on the query caching and view caching, however I did state in the start of this 'ProTip' that it was a very basic query optimisation technique and that I will be going deeper into it when I have a moment to write up a good article.
Anyway, thanks for your feedback, much appreciated!
@smidwap for the data between your database -> application it can be a huge difference depending on the datatypes/sizes in each row. And this is compounded if the database -> application aren't on the same machine using sockets, and are using TCP/IP for the transit.
Your point on query caching makes this even worse, since your cache sizes are going to full up much quicker, since you run the risk of caching more data than you actually need/use - which could be considered wasteful.