When To Use Indexes In MySQL

Want to support HowtoForge? Become a subscriber!
 
Submitted by tsmonaghan (Contact Author) (Forums) on Thu, 2010-04-08 11:47. :: MySQL

When To Use Indexes In MySQL

This comes up in discussions almost every new project I work on, because it's a very important thing to consider when designing a database.

When deciding when and how to create an index in your MySQL database, it's important to consider how the data is being used.

Let's say you have a database of employees. We will create it like this:

CREATE TABLE employees (
ID INT,
name VARCHAR(60),
salary decimal(10,2),
date hired(date)
)

So you will notice that this table is pretty simplistic, and doesn't really contain all the info you would need to actually manage employees, but its just for the sake of demonstration, and you could always add more later, or even make another table and use joins if you had really complex needs.

For now we will go over these real quick.

The ID is basically just a number (INT) which can hold a very large number. If this were real world I would probably make it unsigned, since you will never have a negative employee ID – but either way, you will never reach the number of employees it would take to get to the number that would fill up an INT.

Even unsigned int will hold values up to 2,147,483,647. So if you have 2 billion employees, you would probably not be a developer anymore ;-).

You might want to consider making the field an auto increment, and primary key, the auto increment depending on how data will be entered into this database.

Name is a simple varchar(60) which should cover most people's names.

Salary is a decimal with 10 total digits, two on the right hand side of the decimal point. This would handle a salary of up to 99,999,999.99 – again, you're not likely to hit this limit.

Date hired will be a date in this format 2010-05-06. YYYY-MM-DD.

So when considering this simple table, where would you expect to need an index?
If we assign ID as a primary key, we don't need one there.

Indexes are best used on columns that are frequently used in where clauses, and in any kind of sorting, such as "order by".

You should also pay attention to whether or not this information will change frequently, because it will slow down your updates and inserts. Since you wont frequently be adding employees, you don't have to worry about the inserts.

Let's say that you will be looking up the employees with a php web interface and the end user will be typing in the employees name to find them, since remembering the employee ID's would be cumbersome.

It sounds like this situation would be good to use an index.

A – You won't be updating the employee's name very often, so you don't have to worry about a performance hit there.

B – You WILL be using the employee in where clauses like this:

select * from employees where name ='smith';

C – You WILL be generating reports, which will probably be alphabetic, like this:

select * from employees order by name asc;

So in this simple example it's easy to see when it would be important to use indexes.

So, you could do it like this:

create index name_index on employees (name);

You might be working on a more complex database, so it's good to remember a few simple rules.

- Indexes slow down inserts and updates, so you want to use them carefully on columns that are FREQUENTLY updated.

- Indexes speed up where clauses and order by.

Remember to think about HOW your data is going to be used when building your tables.

There are a few other things to remember. If your table is very small, i.e., only a few employees, it's worse to use an index than to leave it out and just let it do a table scan. Indexes really only come in handy with tables that have a lot of rows.

So, if Joe’s Pet Shop was using this database, they would probably be able to leave the index off the "name" column.

If Microsoft was using this database (hah!) they might want to throw and index in there.

Another thing to remember, that is a con in the situation of our employees database, is that if the column is a variable length, indexes (as well as most of MySQL) perform much less efficiently.

As you can see there are many things to consider with indexes, even with a very simple table as this.

I would suggest looking at the explain command in MySQL, which I will be writing about in the future.

http://www.tsmonaghan.com


Please do not use the comment function to ask for help! If you need help, please use our forum.
Comments will be published after administrator approval.
Submitted by Sreepal V (not registered) on Thu, 2014-03-06 07:12.
super!
Submitted by A. Dzulfikar Ad... (not registered) on Tue, 2013-09-24 03:41.
I'm searching an article about index in MySQL to optimize my apps query speed. And it's really helpful. Easy to understand
Submitted by ajay gadhavana (not registered) on Wed, 2013-08-21 05:39.
Really very good explanation of indexing
Submitted by gripped (not registered) on Sat, 2013-03-30 15:11.
Precise and good explanation!
Submitted by Michael (not registered) on Fri, 2012-04-06 22:02.

very good article, short and clean

thank you

Submitted by soundarapandian (not registered) on Mon, 2011-06-13 14:22.
This is very useful for me understanding about places where we need index.Thank You :)
Submitted by Anonymous (not registered) on Thu, 2011-02-10 13:30.
Since the speed is relative, I think it would be more comprehensible if the context in which a study was done is specified. Saying that a table is big or small is too vague
Submitted by Mohammad Naji (not registered) on Sat, 2010-10-16 21:39.
I'd always loved the way you publish your articles. Good luck my dear friend.
Submitted by Ahmad Hamadallah (not registered) on Mon, 2010-04-12 21:54.

Thank you for your article.

However, Indexing is only one factor to speed up read, sorting and filtering operation. There are many variables should be included in mind when configuring MySql start up variables like:

key_buffer,  table_cache,  sort_buffer_size, read_buffer_size,  query_cache_limit and query_cache_size.

 Regarding slow down update and insert incase of many indexes:

The slowness is not humanbeeing notacable unless you have loop of insert or update.

Thank you.

Submitted by dinotrac (not registered) on Fri, 2010-04-09 18:52.

Indexes absolutely slow down inserts because of the overhead required to update the index, but an update might not be hurt.

Updates that don't change indexes won't incur that extra overhead.

An update is actually a combination of a read and a write.  If the read uses the index and the update doesn't change it, the update might even be faster instead of slower.


Submitted by Bradley C. Kuszmaul (not registered) on Thu, 2010-04-08 13:52.

It is true that for MyISAM or InnoDB, maintaining indexes can be expensive.  But that's not true for all storage engines, or all situations.

For example, if your employees database fits in main memory then maintaining an index is cheap.

Even if your database doesn't fit in memory, there are ways to maintain indexes. For example, Tokutek's storage engine, TokuDB, can maintain indexes in the face of high insertion or update rates.  (Hundreds of thousands of updates or inserts per second on stock hardware (a single magnetic rotating disk, a multicore processor and not much memory.)


 

Submitted by SeanJA (not registered) on Wed, 2010-04-14 23:12.
Convenient that you work for them no?