Monitoring, knowledge and teamwork
September 14, 2009
Jeremy Zawodny, MySQL guru, tells a story about:
One of the frustrating things about building and running backend infrastructure services is that things break sometimes — often in unexpected ways. And when they do, you can spend untold hours chasing down various ideas, most of which are dead ends, before eventually solving the problem.
He starts with a curious MySQL login failure and ends up with IRQs and boot time kernel options.
Scary.
Python to Cython and benchmarking
July 24, 2009
Recently I have been coding with Cython for my project PyCAF. Obviously, I am doing this to make my code run faster. My approach was to first write my code in Python and make sure it works correctly. I wrote unit-tests for each function and class, using nosetests --with-coverage to look for test cases I might have missed. Then I profiled the code, finding out hot-spots, and eliminating the obvious issues.
Next up was to write some performance micro-benchmarks for important bits of code. These benchmarks have the same interface: they take an “input size” N as a parameter. What N means is specific to the function. For e.g., for the function event_one_one(N) it means “create N pairs of tasklets, and make each pair exchange one event” but for the function event_one_many(N) it means “create one sender tasklet and N recipient tasklets waiting for the sender to send an event”. You get the drift. The micro-benchmarks are tested as well, by the simple expedient of writing a unit-test case per micro-benchmark that calls it with a small input.
Digression: I spent some time looking for some tool where I could store results of my benchmarks per code commit and then later browse how a particular benchmark varied over code commits, but found nothing…. does anyone have any ideas? If not, I might write a tool for this in the future.
Anyway, the benefit of micro-benchmarks is that you can see how the performance scales when the input sizes grow. For e.g., here is the output of my benchmark tool:
| Test Name | N | Ratio | Time (s) | Ratio | K-PyStones |
|---|---|---|---|---|---|
| tasklet_new | 100 | 1 | 0.000518083572388 | 1 | 0.0364847586189 |
| tasklet_new | 1000 | 10 | 0.00419187545776 | 8.09111826967 | 0.295202497026 |
| tasklet_new | 10000 | 10 | 0.0460090637207 | 10.9757706746 | 3.24007490991 |
| tasklet_new | 100000 | 10 | 0.516650915146 | 11.2293290357 | 36.3838672638 |
| tasklet_yield | 100 | 1 | 0.000921964645386 | 1 | 0.0649270877032 |
Some things to note here: I convert the time taken by a test to kilo-pystones and record that as well as the time taken. What’s a Pystone? Well, its the output of python -c "from test import pystone; pystone.main()". For my machine:
Pystone(1.1) time for 50000 passes = 0.7
This machine benchmarks at 71428.6 pystones/second
So basically pystones is a (somewhat) machine-independent measure of how long a test took to run. While a test runs faster on a fast machine and slower on a slow machine, when you convert the time to pystones, it should be the same.
Now the interesting thing to note is how the time for a test increases when the input grows larges. To help me, my benchmark prints the ratio between successive input sizes and times taken. If I increase the input size by, say 10 times, and the time taken increases 100 times, then I might have a problem. Of course, behind every test are a bunch of functions and algorithms. I have a general expectation of what the complexity of my function should be, and look at the benchmark to confirm my expectations. I can then replace algorithms with high complexity with better ones. A good design and a well chosen algorithm or data structure gives more speedup than mindlessly twiddling around with code.
Digression: My benchmark is not reporting the space-complexity of the tests (aka, how much memory is being consumed). Tarek Ziade mentions some tools in his excellent book “Expert Python Programming”, but I have not understood the details of the tools well enough to incorporate them into my testing, just yet.
Once I find a benchmark that I want to improve, I will first profile it, to find where the function or algorithm is spending most time. This is obvious and there are enough resources on the web about it (just look for “cProfile”). What I was wondering about was how would I profile code that I had converted to Cython, since Cythonized code becomes a binary? Well, the Cython wiki gives the answer: http://wiki.cython.org/Profiling. So thats what I will be looking into very soon.
Things to get right about MySQL from day #1
July 14, 2009
There is an excellent article in Linux Magazine titled “MySQL Performance from the Start”. Its by Jeremy Jawodny, so it comes with some serious authority backing it. He talks about some mistakes people make when developing the typical database backed web app and how keeping some things correct from the start will save oodles of pain later. I confess, this touched a nerve. I have experienced all of these mistakes and the pain that comes along with them.
Here is what Jeremy says, in a nutshell:
- What’s MySQL There For? MySQL is there for stuff that you do relational queries on. Don’t store BLOBs (images, videos, excel sheets, word docs, etc.) in the db; store metadata about them, for sure, but store the binary data that you can’t really query in a distributed file system. And don’t log to the database, either. Logs aren’t usually queried often by the front-end and tend to grow unbounded, slowing down the database and adding to admin’s headaches when they become too big.
- Plan for a caching layer between the app and the DB; memcached is a good solution, but there are more. You can do this, for e.g., by putting all DB access code in a library initially when you don’t have caching; later, when it is time to add memcached, you only have one place to modify.
- OLTP vs OLAP. Front end database is transactional in nature: a user session reads/updates many different tables but using some index (the user ID, for e.g.). Then there is reporting. It typically wants to look at all data (“Count users who did XXX in last YYY days”). From day 1, have separate front-end and reporting databases with different schemas, the formed optimized for indexed queries and perhaps denormalized a little to optimize SELECTs, and the latter tuned for aggregate queries with complex joins. If you don’t have a reporting db yet, at least don’t convolute the front-end database with reporting-like schema: keep it lean. Also see the next poi
Here I want to note down what I was thinking when I made these mistakes, so I can present them as arguments (for myself and others) when someone is about to make the same mistake again.
Storing BLOBs in the DB
When I started out, I did a little research on the performance of retrieving images from the db vs the file-system. I ended up choosing the DB because:
- Storing images in the DB automatically gave me a distributed repository; at that time I was not aware of any good distributed file-systems, and not having another component in the stack meant there was one less thing to manage
- I thought that there was no significant performance impact of storing BLOBs in the db
Boy, was I wrong.
First, storing BLOBs in db meant that some tables have become VERY large in physical size. It is a pain to query those tables, and back up and trim the db, now. If I have to remove old, unused BLOBs now, the DELETE query takes very long and is threatening to bring the DB to its knees. If I had only kept the metadata in the DB and stored the files in the file-system, I could have very quickly removed just the metadata from the db, freeing it up for its normal duties, and removed the files themselves at leisure. So, all in all, its worth spending the time to find, implement and maintain the distribtued file-system solution in the long run.
Secondly, I can’t take advantage of the highly optimized file-serving capabilities of web servers. When I started out, I was serving out everything from scripts: dynamic data as well as static images. When the load was low, things were fine. When the load went up, I painfully realized that more than 50% of the hits to the app were for static images that could have been served off the file-system by the web-server itself.
Thirdly, the table storing BLOBS should have stored just the BLOB and an identifier, nothing else. You don’t want to make InnoDB crawl through kilobytes of binary data looking for that 4-byte integer you stashed away there. Thankfully, this is a mistake we did not make. Or, rather, cannot make. Because by now we know that keeping anything in that table that a SELECT mught use in a JOIN or WHERE clause will make for a query that will never finish and bring the db down to boot.
Storing Logs in the DB
We needed reports on how often certain pages were being hit and certain actions triggered. We could have logged it to the filesystem and written a separate script to grok the logs, create summaries, and bundle it off to the reporting db. But that was too much work. It seemed just so easy to simply stuff raw log data into the front-end db. We would, we reasoned, one day write a script that would grok the table, create summaries, bundle it off to the reporting db, and trim the log table. (Notice that this is slightly more work.)
Jeremy is right: you never quite get around to write that “trim table” cron job and the log table just keeps growing. The management report is generated off that table directly. One day, it will finally become too slow.
Caching Layer
We do have a caching layer. The front-end code makes heavy use of memcached, and all db access is through functions in one or two files. Guess how this happened? Yeah, because of a major code-refactoring and speed-up sprint when we realized the servers were about to go kaput.
Multiple Languages, ORMs and Caching
There is another problem: memcached is good for storing objects, not tuples or records. But if you have programs in multiple languages, and they all use different ORMs, or worse, no ORM at all, then the utility of memcached becomes limited to one part of the system. Component A uses and cached some data, but component B, which needs the same data cannot use the cache because its repesentation of the object is different.
I don’t know if there is a lesson here. One might be tempted to say “use one language and ORM for everything”, but then one is also tempted to say “use the best language/library/framework for the task at hand”.
Multi-language systems are a reality. The best I can think of is to keep the number of different languages and frameworks small. And use the same ORM for all components written in a particular language.
OLTP vs OLAP
The big problem with not planning for a separate reporting db from day 1 is that
- you keep adding raw logs to the front-end db and they never get trimmed;
- you are afraid of deleting historical information from the db; you don’t know what kind of report you might want in the future, so you just keep all the information there, indefinitetely.
Better to:
- Create a reporting db and reporting system from day 1, with a very simple report that sumarizes some information;
- Write tools to create summary data for the report and trim the front-end db
- Have an “archive” db separate from the front end db; write tools to move historical stuff to the archive db and remove it from the main db
Make it clear to management that:
- Any report they request that needs new summary information (that you are not already generating) will not be able to report stuff from the past;
- Or, if they want arbitrary reports in the future, you will have to keep ALL data around, and they better be ready to issue checks to buy bigger and bigger servers for the DB.
MySQL ERROR 1005: Can’t create table
June 1, 2009
I am experiencing the weirdest behaviour with MySQL. I can’t seem to be able to create a particular table.
E.g.:
mysql> create table `devices` (id bigint(20) unsigned not null auto_increment primary key) engine=InnoDb default character set=utf8 collate=utf8_bin ;
ERROR 1005 (HY000): Can't create table './couffable_global/devices.frm' (errno: 150)
However, if I change the table name to `foo`, it works:
mysql> create table foo (id bigint(20) unsigned not null auto_increment primary key) engine=InnoDb default character set=utf8 collate=utf8_bin ;
Query OK, 0 rows affected (0.00 sec)
And no, the devices table does not already exist.
I was doing this in an already created database, as part of a data migration exercise. I already had a database with a table called devices. I was uploading an SQL dump to it. The dump would first DROP TABLE IF EXISTS `devices` and then CREATE TABLE `devices` ( ....) . This would not work.
If I start from a clean database, this would work. So I am assuming the presence of some other table is causing this.
Dawkins says the Pope is “stupid, ignorant or wicked”
April 3, 2009
From the Telegraph.co.uk via Pharyngula
Professor Dawkins, the prominent biologist and atheist, said that Benedict XVI would have blood on his hands if his beliefs were followed by Catholics around the continent.
Speaking at a university in Spain, he said: “I wonder on what basis anyone can say condoms make Aids worse. The Pope is either stupid, ignorant or dim.
Later on, Dawkins corrected the above report, saying what he’d actually said was:
The Pope is either stupid, ignorant or wicked.
Indians still stuck on white is beautiful?
March 22, 2009
In the Lucknow round of Miss India 2009:
Beauty queens are encouraged to think of themselves as role models so it was easy to ask Manisha what she would do when she was one. What would she advise people who were short or dark? Very seriously she replied, “Not everyone can be beautiful but they should try.” Manisha clearly equated short and dark with ugliness.
When will we look on *all* Indians as equal claimants to the right to be called beautiful? I don’t think anyone in Lucknow would every allow for a person with Mongoloid features (as some tribals from the north east of India) to be called beautiful without qualifying it in some sense.
Vatican backs abortion row bishop
March 15, 2009
From BBC (http://news.bbc.co.uk/2/hi/americas/7930380.stm):
Vatican backs abortion row bishop
A senior Vatican cleric has defended the excommunication in Brazil of the mother and doctors of a young girl who had an abortion with their help. The nine-year-old had conceived twins after alleged abuse by her stepfather. Cardinal Giovanni Battista Re told Italian paper La Stampa that the twins “had the right to live” and attacks on Brazil’s Catholic Church were unfair. It comes a day after Brazil’s president criticised the Brazilian archbishop who excommunicated the people involved. Brazil only permits abortions in cases of rape or health risks to the mother. Doctors said the girl’s case met both these conditions, but the Archbishop of Olinda and Recife, Jose Cardoso Sobrinho said the law of God was above any human law. He said the excommunication would apply to the child’s mother and the doctors, but not to the girl because of her age.
So, the Roman Catholic Church in Brazil and the Vatican thinks that the foetuses have a greater right to live than the young girl. Another lunatic stance, and another proof that when you disagree with religious loons on their views, far from showing the “love” they preach, they will come after you with the biggest stick they can find.
SQLAlchemy: one classes – two tables
January 18, 2009
SQLAlchemy is a wonderful ORM for Python. While it allows the normal “class per table” semantics, one can do some more interesting stuff with it.
We often start with a table for some stuff. For e.g., we might start off our app with a User class mapping to a user table, which has columns for an id, name and an address. For illustration, the list of columns is small. In a real application, this table would probably contain a lot more columns.
Here is some boilerplate code that initiates our database connection, etc.:
from sqlalchemy import *
from sqlalchemy import sql
from sqlalchemy.orm import mapper, sessionmaker, create_session
dburl = "mysql://testuser:testpasswd@localhost/test"
engine = create_engine(dburl)
meta = MetaData(bind=engine)
Let’s define a table and a class mapping to the table.
t_users = Table(
'users', meta,
Column('id', Integer, primary_key=True),
Column('name', String(40), nullable=False),
Column('address', String(200))
)
class User(object):
def __init__(self, name, address=None):
self.name = name
self.address = address
def __repr__(self):
return "<User(%s,%s,%s)>" % (self.id, self.name, self.address)
mapper(User, t_users)
To create the database, we might do something like the below, or roll out own database schema management system:
meta.create_all()
We store and retrieve User objects from the database like this:
# create a session
session = create_session(bind=engine)
# for testing only... ensure table is empty to start with
session.execute(t_users.delete())
# create and storea record
u = User('Beowulf', 'Denmark')
session.save(u)
session.flush()
# retrieve records from the db
session.clear()
u = session.query(User).filter_by(name='Beowulf').one()
print u
The output of the above snippet would be:
<User(2,Beowulf,Denmark)>
It might make sense in the beginning to put everything related to a user in one table. After a while, as the application grows, one starts seeing some patterns. Imagine that the address field above was very large (or imagine that there was a ‘photo’ field containing the user’s photograph). However, the address field is used only in one corner of the application, and only the name is used in the vast majority of the application code. What’s more, the large binary/string field is slowing down other queries. One of the ways to solve this problem is to split the user table into two:
t_users = Table(
'users', meta,
Column('id', Integer, primary_key=True),
Column('name', String(40), nullable=False),
)
t_addresses = Table(
'addresses', meta,
Column('user_id', Integer,
ForeignKey('users.id'), primary_key=True),
Column('address', String(255)),
)
meta.create_all()
Now, whereever we are interested in only the user’s name, we’d use a User class. Where-ever we are interested in the address, we’d use an address class.
class User(object):
pass
class Address(object):
pass
mapper(User, t_users)
mapper(Address, t_addresses, properties={
'user': relation(User, backref='address',
primaryjoin=t_users.c.id==t_addresses.c.user_id)
})
session = create_session(bind=engine, transactional=False)
session.execute(t_addresses.delete())
session.execute(t_users.delete())
u1 = User(); u1.name = "Hagar"
session.save(u1)
a1 = Address() ; a1.user = u1 ; a1.address = "Denmark"
session.save(a1)
session.flush()
Now, we are presumably saving a lot of time and memory on our queries. But, what if we want the equivalent of the original User object, one which refers to both the name and address? Well, we can define a class mapped to a join and use it:
usersaddresses = sql.join(t_users, t_addresses,
t_users.c.id == t_addresses.c.user_id)
class UserAddress(object):
def __repr__(self):
return "<FullUser(%s,%s,%s)" % (self.id, self.name, self.address)
mapper(UserAddress, usersaddresses, properties={
'id': [t_users.c.id, t_addresses.c.user_id],
})
f = session.query(UserAddress).filter_by(name='Hagar').one()
print f
Note the “id” column in the “properties”: we’ve told SQLAlchemy that the “id” attribute of our UserAddress class actually is the same as the “users.id” or “addresses.user_id” columns, which are always the same. Thus SQLAlchemy will NOT produce redundant “id” and “user_id” attributes in our class.
We can even use this class to change and save attributes, and the attributes will go to their respective table!
f.name = "Hagar the horrible"
f.address = "Copenhagen"
session.flush()
for (id, name, user_id, address) in session.execute(usersaddresses.select()):
print id, name, user_id, address
We get:
11 Hagar the horrible 11 Copenhagen
Python packaging: custom scripts
October 13, 2008
In the post Python Packaging: setuptools and eggs, I described how to use setuptools to create a distributable egg. Installing the egg would provide:
- a set of python packages and modules, usable as a library
- a set of “scripts”: small programs that live in the user’s bin directory, where ever that is; these, for the most part invoke functions within the packages
While setuptools can generate these scripts for us, sometimes that does not cut it. Maybe we really do want to create scripts ourselves and get them installed in the user’s `bin` directory.
Writing scripts yourself
In the last post we got setuptools to generate three scripts for us: rundog, rungendibal, runbjarne. Let us hand-craft a script that does the work of any of these, depending on a command line argument. Let’s call our script ‘runany’: it will take one argument—”dog”, “gendibal”, or “bjarne”—and run the appropriate script for the argument. Here is our script:
# scripts/runany
#!python
import sys, os
which=sys.argv[1]
os.execlp("run" + which)
And we tell setuptools (actually, the underlying distutils) about it:
# setup.py
setup(...
scripts=['scripts/runany'],
...
)
Now, we can build our egg. When the user installs the egg, he would have the following scripts available:
- rundog, rungendibal, runbjarne: these are auto-generated by setuptools
- runay: this is a wrapper script that setup’s Pythons path and calls our own script above.
The user can run our runany script like this:
$ runany dog
Bow, wow!
$ runany bjarne
Hello, C++ World!
Wrapper code
Let’s see what setuptools did. First, it put a runany script in the user’s path. What’s inside?
$ cat `which runany`
#!/home/parijat/.virtualenvs/test/bin/python
# EASY-INSTALL-SCRIPT: 'Speaker==0.2dev','runany'
__requires__ = 'Speaker==0.2dev'
import pkg_resources
pkg_resources.run_script('Speaker==0.2dev', 'runany')
If you were expecting to see the code we wrote above and are surprised, join the club. This script uses the pkg_resources.run_script to actually call our code. The rest is just ensuring that the proper version of the egg is being referenced.
Why does it do so? So that the user can have multiple versions of our package installed and be able to use them.
Where is our script anyway? It is here:
$ ls /lib/python2.5/site-packages/Speaker-0.2dev-py2.5-linux-x86_64.egg/EGG-INFO/scripts/
runany
where install_dir refers to where ever the user wanted our egg to be installed.
In conclusion, we wrote some custom code, told setuptools where to find it (using `scripts=[...]` directive in setup.py), created an egg, installed it, and everything works.
Non-Python scripts
You might have noticed above that the custom script we wrote was in Python. This is not an accident: the distutils/setuptools directive `scripts=…` that we used to get our stuff packaged and installed expects to be given python scripts. Actually, distutils can handle non-Python scripts: it will copy them verbatim without wrapping them or adjusting them in anyway. However, setuptools always wraps scripts in a way that non-Python scripts are not supported. This can be considered a bug in setuptools. So, when using setuptools, non-Python scripts are out. This is not so bad: why would you want to write a shell or some other script for doing something that can be done in Python?
Non-package Data Files
What if you wanted an init-script to go into `/etc/init.d/’, or some config file to go into ‘/etc/’? setuptools won’t do that for you either. Setuptools will keep your files bundled inside the egg. But you can provide the user with a script that will extract and copy the files to arbitrary locations, on their discretion. See http://peak.telecommunity.com/DevCenter/setuptools#non-package-data-files.
Unkind programmers…
October 9, 2008
The other day, while going through a Twisted tutorial, I stumbled upon this:
user, status = line.split(':', 1)
user = user.strip()
status = status.strip()
Heh, we programmers don’t take the user seriously, do we?