Tuesday, November 25, 2008

Scaling MySQL for dummies

Alvinator's LOG: November 26, 2008 4:30am

I know it seems unhealthy to stay up awake at these times. I just finished my work but I can't manage to sleep anymore so I thought about writing a short blog.

MySQL has been my favorite RDBMS since I started working with databases. It's free, fast and easy to use although I know a lot of people arguing about its lack for features vs Oracle and postgresql specially on scaling. Well there's this mysql-cluster that promises High-Availabity and Performance, but I still find it pretty complex to setup and maintain. What you can do is put something on top of MySQL and you can do that with your own elbow-grease. There's a lot of techniques in doing this and these three are those that I've used so far:

1. Replication
You can replicate MySQL database (n) times, then configure your database applications to issue the SELECT commands on the replicas and INSERT/UPDATE/DELETE on the master database. This kind of setup is good for websites where most of the database operations is SELECT.

Advantages: Fast Reads
Disadvantages: Only one Master Database for writes (one point of failure)
Problems that may arise: Slave data might lag due to connectivity problems with the master rendering your SELECTS outdated at some point in time.

2. Partitioning
A table that has multi-million rows will drag your queries. One solution is to partition your table into multiple tables on a specific category like dates, year, location, etc. You can do this programatically yourself or you can make use of MySQL 5.x's built-in partitioning support. I've used it when parsing logs into database.

Advantages: Fast Reads
Disadvantages: queries into multiple tables is quite hairy
Problems that may arise: You're still using one MySQL server instance and that makes it one point of failure.

3. Sharding
I first heard about this when I was watching the youtube guys in google tech talk giving some information about their experiences scaling youtube. Sharding in a nutshell is where you partition your database across different MySQL servers and across different machines. A data can be stored in a shard (a MySQL database instance) using a specific foreign key like name, date, or a programmatic hash. I've personally implemented this kind of setup and I'm quite satisfied with it. Just one note: This is NOT the holy grail of MySQL Scalability, your specific requirements may not be met on this kind of setup.

Advantages: Fast Read/Writes, Better HA since you have multiple MySQL instances
Disadvantages: You have to implement it yourself, your database applications will need to connect to multiple databases.
Problem that may arise: Resharding is a nightmare, your system admins will hate you because manual queries will be harder, a lookup that is not using the foreign key will require you to query across all shards.

In sharding you have to decide how you can spread your data across shards and know how to easily obtain them. You use for example the first letter in username ('A' goes to shard1, 'B' goes to shard2, ...) which is human understandable although there's probably more A's than X's, Y's, and Z's right? A better solution is by programmatic hash. Using a programmatic hash ensures that your data is equally scattered among your shards. In python you can do it like this:

Python 2.5.1 (r251:54863, May 2 2007, 16:56:35)
[GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> def hash_me(username, numshards):
... return hash(username) % numshards
>>> hash_me("alvinator", 4)

That makes "alvinator" assigned to shard number 3 if you have 4 shards.

Monday, November 10, 2008

Smart Uzzap launch

It's been almost six months since I last posted in this blog. The main reason of this long absence is because of this:

Our main project Smart Uzzap has been launched last July 7, 2008 and it looks that it will have a bright future ahead. The total user base still keeps growing to the point that our public chat rooms can barely keep up to the number of people that would like to socialize.

Like all systems, we did experienced a lot of birth pains and problems. We even encountered OS level issues that was pretty alien to us (TCP Zero Window, TCP half-open scenarios, weird behavior of mobile GGSN's, fuzzy network congestions, etc.). Although amidst of these stressful situations, I'm quite greatful for being a part of it since very few people do encounter and learn how to fix these problems.

Being a python developer, I'm very proud to say that the Smart Uzzap server and its other components is written in python. While currently on production, it still keeps evolving silently in the background. Even though I get tired of attending meetings, fixing bugs, and introducing new features; I feel good knowing that people start building their own community with Uzzap. There has been some eyeballs, friendster accounts, blogs, and youtube clips like one below that was created by Uzzap users

Here's the things that I've learned since the project launch:

1. Every system has its own issues so don't panic.
2. Resource Monitoring and Reports is a must but at the same time, you should be smart enough to interpret what they say to track down bottlenecks.
3. Do not take TCP for granted!!! You should configure your TCP settings according to how you utilize the server.
4. A low load average doesn't mean that the network traffic is low.
5. Server applications should be smart enough to reconfigure itself during runtime.
6. Avoid, the one-point of failure systems. Always provide a backup or have each component scalable.
7. Do the network programmer loop:
Relax -> Look for bottlenecks -> Fix bottlenecks -> Test the fixes on development setup -> Apply Fixes -> go back to step 1.
8. Sharding databases is quite nice as long as you only have 1 foreign key to each of the database servers.
9. Sometimes, a measly one-liner of code can greatly improve the system. ^_^

As for home projects, I'm too tired to horse around in my cave anymore although I'm planning to try my hands on some streaming stuff like the RFC 2326 RTSP protocol. So there we go, if anyone needs some advise on scalability, don't be shy send me an email and I'll try my best to answer.