Article
Optimizing your MySQL Application
How to Pick Columns to Index
One of the most important steps in optimizing is selecting which columns to index. There are two major places you want to consider indexing: columns you reference in the WHERE clause and columns used in join clauses. Look at the following query:
SELECT
age ## no use indexing
FROM
people
WHERE
firstname='Mike' ## consider indexing
AND
lastname='Sullivan' ## consider indexing
This query is a little different from the past ones, but it’s still quite simple. Since “age” is referenced in the SELECT portion, MySQL will not use it to limit the chosen rows. Hence, there is no great need to index it. Here’s a more complex example:
SELECT
people.age, ## no use indexing
town.name ## no use indexing
FROM
people
LEFT JOIN
town
ON
people.townid=town.townid ## consider indexing
## town.townid
WHERE
firstname='Mike' ## consider indexing
AND
lastname='Sullivan' ## consider indexing
The possibility of indexing firstname and lastname carries over as they are again located in the WHERE clause. An additional field you'll want to consider indexing is the townid field from town table (please note that I’m only using the town table as an example of a join) because it is in a join clause.
“So I simply consider indexing every field in the WHERE clause or a join clause?” Almost, but not quite. Next, you need to consider the type of comparisons your doing on the fields. MySQL will only use indexes for '<', '<=', '=', '>', '>=', BETWEEN, IN, and some LIKE operations. These specific LIKE operations are times where the first character is not a wildcard (% or _). SELECT peopleid FROM people WHERE firstname LIKE 'Mich%'; would use an index, but SELECT peopleid FROM people WHERE firstname LIKE '%ike'; wouldn’t.
Analyzing Index Efficiency
You have some ideas on which indexes to use, but you’re not sure which is the most efficient. Well, you’re in luck, because MySQL has a built-in SQL statement to do this, known as EXPLAIN. The general syntax for this is EXPLAIN select statement;. You can find more information in the MySQL documentation. Here’s an example:
EXPLAIN SELECT peopleid FROM people WHERE firstname='Mike'
AND lastname='Sullivan' AND age='17';
This will return a somewhat cryptic result that will look usually look similar to this:
[Note: table split across two rows for readability]
+--------+------+-----------------+-----------------+
| table | type | possible_keys | key |
+--------+------+-----------------+-----------------+ ...
| people | ref | fname_lname_age | fname_lname_age |
+--------+------+-----------------+-----------------+
+---------+-------------------+------+------------+
| key_len | ref | rows | Extra |
... +---------+-------------------+------+------------+
| 102 | const,const,const | 1 | Where used |
+---------+-------------------+------+------------+
Let's break this down column by column.
- table - This is the name of the table. This will become important when you have large joins, as each table will get a row.
- type - The type of the join. Here's what the MySQL documentation has to say about the
reftype:All rows with matching index values will be read from this table for each combination of rows from the previous tables. ref is used if the join uses only a leftmost prefix of the key, or if the key is not
In this case, since our index isn’tUNIQUEor aPRIMARY KEY(in other words, if the join cannot select a single row based on the key value). If the key that is used matches only a few rows, this join type is good.UNIQUE, this is the best join type we can get. In summary, if the join type is listed as “ALL” and you aren’t trying to select most of the rows in the table, then MySQL is doing a full table scan which is usually very bad. You can fix this by adding more indexes. If you want more information, the MySQL manual covers this value with much more depth. - possible_keys - The name of the indexes that could possibly be used. This is where nicknaming your index helps. If you leave the name field blank, the name defaults to the name of the first column in the index (in this case, it would be “firstname”), which isn’t very descriptive.
- key - This shows the name of the index that MySQL actually uses. If this is empty (or
NULL), then MySQL isn’t using an index. - key_len - The length, in bytes, of the parts of the index being used. In this case, it’s 102 because firstname takes 50 bytes, lastname takes 50, and age takes 2. If MySQL were only using the firstname part of the index, this would be 50.
- ref - This shows the name of the columns (or the word “const”) that MySQL will use to select the rows. Here, MySQL references three constants to find the rows.
- rows - The number of rows MySQL thinks it has to go through before knowing it has the correct rows. Obviously, one is the best you can get.
- Extra - There are many different options here, most of which will have an adverse effect on the query. In this case, MySQL is simply reminding us that it used the
WHEREclause to limit the results.