Tag Archives: mysql

MySQL and Index Woes

I have often noticed that altering indexes on large tables in MySQL is a very time-consuming process. I’d chalked it up to “just how things are” and accounted for it when doing database migrations. Possibly that’s because I’d usually *add* an index to a table, not drop one.

But apparently there is a design flaw in MySQL 5.0 and earlier in how it handles indexes, which made dropping an index from a table very slow, and adding a new index slower than it needed to be. Gory details here: http://bugs.mysql.com/bug.php?id=2364

But in 5.1, dropping an index, at least, ought to be near instantaneous. Adding an index? Umm… it seems the old behaviour was:

Given the same MyISAM table T having four indexes (ndx1,ndx2,ndx3,ndx4)
and you want to ‘alter table T add index ndx5 (…);’ here is exactly
what happens (from: http://lists.mysql.com/mysql/202489)

1) MySQL copies T.MYD to a temp table, i.e., S.MYD and a zero byte S.MYI.
2) MySQL does ‘alter table S add index ndx1 (…);’
3) MySQL does ‘alter table S add index ndx2 (…);’
4) MySQL does ‘alter table S add index ndx3 (…);’
5) MySQL does ‘alter table S add index ndx4 (…);’
6) MySQL does ‘alter table S add index ndx5 (…);’
7) MySQL deletes T.MYD and deletes T.MYI
8) MySQL renames S.MYD to T.MYD, and renames S.MYI to T.MYI

Note that each of the “alter table” steps involves copying the table to a temp table with no indexes, applying all existing indexes one by one, and then applying the new index… so in the above case:

In fact, for table T with no indexes and you want to add N indexes
MySQL will copy the MYD N times
MySQL will copy the MYI N times
MySQL will run ‘alter table add index’ N(N+1)/2 times if adding an index
MySQL will run ‘alter table drop index’ N(N-1)/2 times if dropping an index
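To make the quadratic blow-up concrete, here is a small back-of-the-envelope sketch (plain Python, nothing MySQL-specific) of how many ‘alter table’ operations the old behaviour implies:

```python
def alters_to_add_nth_index(n):
    """Adding the n-th index first re-applies the n-1 existing indexes,
    then applies the new one: n 'alter table add index' operations."""
    return n

def total_alters_adding(n_indexes):
    """Build up from 0 to N indexes one at a time: 1 + 2 + ... + N."""
    return sum(alters_to_add_nth_index(k) for k in range(1, n_indexes + 1))

# Matches the N(N+1)/2 figure quoted above:
for n in range(1, 6):
    assert total_alters_adding(n) == n * (n + 1) // 2

print(total_alters_adding(4))  # 10 'alter table add index' runs for 4 indexes
```

So a table with ten indexes costs 55 index-build passes, not 10 — which is why the workaround below matters.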

So here is a better way to drop an index from a table (the example drops ndx3 from T; the same trick helps when adding indexes, too):

1) create table T1 like T;
This creates an empty table T1 with indexes ndx1,ndx2,ndx3 and ndx4.
2) alter table T1 drop index ndx3;
This drops index ndx3 on the empty T1, which should be instantaneous.
3) insert into T1 select * from T;
This will populate table T1 and build all three (3) remaining indexes in one pass.
4) drop table T;
5) alter table T1 rename to T;
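Put together, the whole recipe is just a handful of statements (a sketch; the table name T and index name ndx3 are from the example above, adjust to taste):

```sql
-- Clone the table definition, indexes included, but no data.
create table T1 like T;

-- Dropping an index on an empty table is near instantaneous.
alter table T1 drop index ndx3;

-- One pass over the data builds the remaining indexes as rows arrive.
insert into T1 select * from T;

-- Swap the new table into place.
drop table T;
alter table T1 rename to T;
```

Note there is a window between the drop and the rename during which T does not exist; on a live system, a single `rename table T to T_old, T1 to T;` is safer since the multi-table rename is atomic.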


Monitoring, knowledge and teamwork

Jeremy Zawodny, MySQL guru, tells a story that starts like this:

One of the frustrating things about building and running backend infrastructure services is that things break sometimes — often in unexpected ways. And when they do, you can spend untold hours chasing down various ideas, most of which are dead ends, before eventually solving the problem.

He starts with a curious MySQL login failure and ends up with IRQs and boot time kernel options.


Read the story here: part 1, part 2.

Things to get right about MySQL from day #1

There is an excellent article in Linux Magazine titled “MySQL Performance from the Start”. It’s by Jeremy Zawodny, so it comes with some serious authority backing it. He talks about some mistakes people make when developing the typical database-backed web app and how getting some things right from the start will save oodles of pain later. I confess, this touched a nerve. I have made all of these mistakes and experienced the pain that comes along with them.

Here is what Jeremy says, in a nutshell:

  1. What’s MySQL There For? MySQL is there for stuff that you do relational queries on. Don’t store BLOBs (images, videos, excel sheets, word docs, etc.) in the db; store metadata about them, for sure, but store the binary data that you can’t really query in a distributed file system. And don’t log to the database, either. Logs aren’t usually queried often by the front-end and tend to grow unbounded, slowing down the database and adding to the admins’ headaches when they become too big.
  2. Plan for a caching layer between the app and the DB; memcached is a good solution, but there are more. You can do this, for example, by putting all DB access code in one library initially, even when you don’t have caching; later, when it is time to add memcached, you only have one place to modify.
  3. OLTP vs OLAP. The front-end database is transactional in nature: a user session reads/updates many different tables, but always via some index (the user ID, for example). Then there is reporting. It typically wants to look at all the data (“count users who did XXX in the last YYY days”). From day 1, have separate front-end and reporting databases with different schemas, the former optimized for indexed queries and perhaps denormalized a little to optimize SELECTs, and the latter tuned for aggregate queries with complex joins. If you don’t have a reporting db yet, at least don’t convolute the front-end database with reporting-like schema: keep it lean. Also see the next point.
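
Point 2 — the caching layer — is easy to sketch. Here is a minimal cache-aside wrapper in Python (the `get_user`/`load_user_from_db` names, and the dict standing in for memcached, are illustrative assumptions, not any particular API):

```python
# A plain dict stands in for a memcached client here; with a real client
# you would swap in its get/set calls, keeping the same shape.
cache = {}

def load_user_from_db(user_id):
    """Placeholder for the real database query."""
    return {"id": user_id, "name": "user%d" % user_id}

def get_user(user_id):
    """Cache-aside: try the cache first, fall back to the DB, then populate."""
    key = "user:%d" % user_id
    user = cache.get(key)
    if user is None:
        user = load_user_from_db(user_id)
        cache[key] = user
    return user

print(get_user(42))        # loaded from the "db" on first call, then cached
print("user:42" in cache)  # True
```

Because every caller goes through `get_user`, adding real caching later means changing one function, not the whole codebase — which is exactly the point of funnelling DB access through one library.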

Here I want to note down what I was thinking when I made these mistakes, so I can present them as arguments (for myself and others) when someone is about to make the same mistake again.

Storing BLOBs in the DB

When I started out, I did a little research on the performance of retrieving images from the db vs the file-system. I ended up choosing the DB because:

  • Storing images in the DB automatically gave me a distributed repository; at that time I was not aware of any good distributed file-systems, and not having another component in the stack meant there was one less thing to manage
  • I thought that there was no significant performance impact of storing BLOBs in the db

Boy, was I wrong.

First, storing BLOBs in the db meant that some tables became VERY large in physical size. It is now a pain to query those tables, and to back up and trim the db. If I have to remove old, unused BLOBs, the DELETE query takes very long and threatens to bring the DB to its knees. If I had kept only the metadata in the DB and stored the files in the file-system, I could have very quickly removed just the metadata from the db, freeing it up for its normal duties, and removed the files themselves at leisure. So, all in all, it’s worth spending the time to find, implement and maintain the distributed file-system solution in the long run.

Secondly, I can’t take advantage of the highly optimized file-serving capabilities of web servers. When I started out, I was serving out everything from scripts: dynamic data as well as static images. When the load was low, things were fine. When the load went up, I painfully realized that more than 50% of the hits to the app were for static images that could have been served off the file-system by the web-server itself.
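The metadata-in-DB, bytes-on-disk split looks roughly like this (a sketch: sqlite3 and a local directory stand in for MySQL and a distributed file-system, and the table/column names are made up):

```python
import os
import sqlite3
import tempfile

# Stand-ins: a local dir for the (distributed) file store, sqlite3 for MySQL.
store_dir = tempfile.mkdtemp()
db = sqlite3.connect(":memory:")
db.execute("create table images (id integer primary key, path text, size integer)")

def save_image(image_id, data):
    """Write the bytes to the file store; keep only metadata in the DB."""
    path = os.path.join(store_dir, "%d.jpg" % image_id)
    with open(path, "wb") as f:
        f.write(data)
    db.execute("insert into images values (?, ?, ?)", (image_id, path, len(data)))

def delete_image_metadata(image_id):
    """Fast DELETE: the row is tiny; the bytes can be reaped from disk later."""
    db.execute("delete from images where id = ?", (image_id,))

save_image(1, b"\xff\xd8 fake jpeg bytes")
delete_image_metadata(1)  # instant; the file itself is removed at leisure
print(db.execute("select count(*) from images").fetchone()[0])  # 0
```

The DELETE only touches a small metadata row, and the web server can serve the file straight off the file-system, which is the second point below.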

Thirdly, the table storing BLOBs should have stored just the BLOB and an identifier, nothing else. You don’t want to make InnoDB crawl through kilobytes of binary data looking for that 4-byte integer you stashed away there. Thankfully, this is a mistake we did not make. Or, rather, cannot make. Because by now we know that keeping anything in that table that a SELECT might use in a JOIN or WHERE clause will make for a query that will never finish and bring the db down to boot.

Storing Logs in the DB

We needed reports on how often certain pages were being hit and certain actions triggered. We could have logged it to the filesystem and written a separate script to grok the logs, create summaries, and bundle it off to the reporting db. But that was too much work. It seemed just so easy to simply stuff raw log data into the front-end db. We would, we reasoned, one day write a script that would grok the table, create summaries, bundle it off to the reporting db, and trim the log table. (Notice that this is slightly more work.)

Jeremy is right: you never quite get around to writing that “trim table” cron job, and the log table just keeps growing. The management report is generated off that table directly. One day, it will finally become too slow.
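That deferred summarize-and-trim job is only a few lines; here is the shape of it (sqlite3 stands in for the front-end db, and the table/column names are made up):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("create table page_log (page text, hit_day text)")
db.execute("create table page_summary (page text, hit_day text, hits integer)")

# Raw log rows of the kind the front-end stuffs in on every hit.
rows = [("home", "2009-01-01")] * 3 + [("about", "2009-01-01")]
db.executemany("insert into page_log values (?, ?)", rows)

def summarize_and_trim(day):
    """Roll one day of raw hits into the summary table, then delete the raw rows."""
    db.execute(
        "insert into page_summary "
        "select page, hit_day, count(*) from page_log "
        "where hit_day = ? group by page, hit_day",
        (day,),
    )
    db.execute("delete from page_log where hit_day = ?", (day,))

summarize_and_trim("2009-01-01")
print(db.execute("select hits from page_summary where page='home'").fetchone()[0])  # 3
print(db.execute("select count(*) from page_log").fetchone()[0])                    # 0
```

Reports then run against the small summary table, and the raw log table stays bounded.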

Caching Layer

We do have a caching layer. The front-end code makes heavy use of memcached, and all db access is through functions in one or two files. Guess how this happened? Yeah, because of a major code-refactoring and speed-up sprint when we realized the servers were about to go kaput.

Multiple Languages, ORMs and Caching

There is another problem: memcached is good for storing objects, not tuples or records. But if you have programs in multiple languages, and they all use different ORMs, or worse, no ORM at all, then the utility of memcached becomes limited to one part of the system. Component A fetches and caches some data, but component B, which needs the same data, cannot use the cache because its representation of the object is different.

I don’t know if there is a lesson here. One might be tempted to say “use one language and ORM for everything”, but then one is also tempted to say “use the best language/library/framework for the task at hand”.

Multi-language systems are a reality. The best I can think of is to keep the number of different languages and frameworks small. And use the same ORM for all components written in a particular language.
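One partial workaround (my suggestion, not something from the article) is to cache a language-neutral representation such as JSON, and let each component’s ORM layer hydrate its own objects from it:

```python
import json

cache = {}  # a dict standing in for memcached

def cache_set(key, record):
    # Store plain records as JSON rather than language-specific serialized
    # objects, so components written in other languages can read the entry.
    cache[key] = json.dumps(record)

def cache_get(key):
    raw = cache.get(key)
    return None if raw is None else json.loads(raw)

cache_set("user:7", {"id": 7, "name": "alice"})
print(cache_get("user:7"))  # {'id': 7, 'name': 'alice'} — parseable from any language
```

You pay a serialization cost and lose rich object caching, but every component can at least share the raw data.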


The big problem with not planning for a separate reporting db from day 1 is that

  • you keep adding raw logs to the front-end db and they never get trimmed;
  • you are afraid of deleting historical information from the db; you don’t know what kind of report you might want in the future, so you just keep all the information there, indefinitely.

Better to:

  • Create a reporting db and reporting system from day 1, with a very simple report that summarizes some information;
  • Write tools to create summary data for the report and trim the front-end db
  • Have an “archive” db separate from the front end db; write tools to move historical stuff to the archive db and remove it from the main db
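The “move to archive” tool in the last bullet follows the same pattern as trimming, just with an insert into the archive db first (a sketch; sqlite3 and its ATTACH stand in for two separate MySQL databases, and the table name is made up):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("attach database ':memory:' as archive")  # stands in for the archive db
db.execute("create table orders (id integer, created text)")
db.execute("create table archive.orders (id integer, created text)")
db.executemany("insert into orders values (?, ?)",
               [(1, "2007-05-01"), (2, "2009-03-01")])

def archive_before(cutoff):
    """Copy historical rows to the archive db, then remove them from the main db."""
    db.execute("insert into archive.orders select * from orders where created < ?",
               (cutoff,))
    db.execute("delete from orders where created < ?", (cutoff,))

archive_before("2008-01-01")
print(db.execute("select count(*) from orders").fetchone()[0])          # 1
print(db.execute("select count(*) from archive.orders").fetchone()[0])  # 1
```

The main db stays lean, and the historical rows are still there for that arbitrary future report.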

Make it clear to management that:

  • Any report they request that needs new summary information (that you are not already generating) will not be able to report stuff from the past;
  • Or, if they want arbitrary reports in the future, you will have to keep ALL data around, and they better be ready to issue checks to buy bigger and bigger servers for the DB.

MySQL ERROR 1005: Can’t create table

I am experiencing the weirdest behaviour with MySQL. I can’t seem to be able to create a particular table.


mysql> create table `devices` (id bigint(20) unsigned not null auto_increment primary key) engine=InnoDb default character set=utf8 collate=utf8_bin ;
ERROR 1005 (HY000): Can't create table './couffable_global/devices.frm' (errno: 150)

However, if I change the table name to `foo`, it works:

mysql> create table foo (id bigint(20) unsigned not null auto_increment primary key) engine=InnoDb default character set=utf8 collate=utf8_bin ;
Query OK, 0 rows affected (0.00 sec)

And no, the devices table does not already exist.

I was doing this in an already created database, as part of a data migration exercise. I already had a database with a table called devices. I was uploading an SQL dump to it. The dump would first DROP TABLE IF EXISTS `devices` and then CREATE TABLE `devices` ( ....) . This would not work.

If I start from a clean database, this would work. So I am assuming the presence of some other table is causing this.
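For what it’s worth, errno 150 from InnoDB generally points at a foreign-key problem: most likely some other table has a FOREIGN KEY referencing `devices`, so after the DROP, the CREATE fails because the new column definition (type, signedness, charset/collation) no longer matches the referencing column. Two things worth checking (sketch; the second query needs a MySQL version whose information_schema exposes the referenced-table columns):

```sql
-- The "LATEST FOREIGN KEY ERROR" section explains which constraint failed.
SHOW ENGINE INNODB STATUS;

-- Find tables whose foreign keys reference `devices`.
SELECT table_name, column_name, constraint_name
FROM information_schema.key_column_usage
WHERE referenced_table_name = 'devices';
```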