Cassandra/HBase or just MySQL: Potential problems doing the following
Say I have a "user" key, and I need to keep a "user count" for it. I am planning to have a record with key "user" and a value from "0" to "9999+ ;-)" (as many as I'll have).
What problems will I run into if I use Cassandra, HBase or MySQL for that? Say I have thousands of new updates to this "user" key, each of which needs to increment the value. Am I in trouble? Will the record be locked for writes? Is there any other way of doing that?
Why this is done: there will be a lot of "user"-like keys for various other cases, but the idea is the same. Why keep it this way: I'll have more reads than writes, so I can always get the "counted value" very fast.
I would just update the user count as a batch operation every N minutes rather than updating it in real time. If there's only one process updating it, you don't need to worry about contention by definition.
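A minimal sketch of that batching idea, assuming a plain dict as a stand-in for the database and a hypothetical BatchedCounter class (neither comes from any of the databases mentioned). Increments are buffered in memory and written out once per key when flush() runs:

```python
from collections import Counter


class BatchedCounter:
    """Buffers increments in memory and writes them to a store in one pass."""

    def __init__(self, store):
        self.store = store        # hypothetical stand-in for the database: key -> count
        self.pending = Counter()  # increments that have not been written yet

    def increment(self, key, amount=1):
        # Cheap in-memory update; nothing touches the store here.
        self.pending[key] += amount

    def flush(self):
        # One write per key, no matter how many increments were buffered.
        for key, delta in self.pending.items():
            self.store[key] = self.store.get(key, 0) + delta
        flushed = len(self.pending)
        self.pending.clear()
        return flushed


store = {}
counter = BatchedCounter(store)
for _ in range(1000):
    counter.increment("user")
counter.increment("group", 5)
writes = counter.flush()  # number of keys written in this batch
```

In a real setup flush() would presumably be driven by a timer or cron job every N minutes. The trade-off in this sketch is that reads lag behind by up to one interval, and buffered increments are lost if the process dies before flushing.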
Alternatively, Cassandra has a contrib/mutex for adding lock support via ZooKeeper.
MongoDB has update-in-place and a special $inc operator for counters. http://blog.mongodb.org/post/171353301/using-mongodb-for-real-time-analytics
MongoDB and HBase have this built-in (as do most other databases which guarantee consistency).
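For the SQL side of the question, a rough sketch of the same built-in increment. It uses Python's sqlite3 module purely as a stand-in because it needs no server; the counters table and the increment helper are made up for illustration. The UPDATE statement itself is plain SQL and should look the same in MySQL, though locking details differ between engines:

```python
import sqlite3

# In-memory database as a stand-in for a real SQL server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE counters (name TEXT PRIMARY KEY, value INTEGER NOT NULL)"
)
conn.execute("INSERT INTO counters (name, value) VALUES (?, ?)", ("user", 0))
conn.commit()


def increment(conn, name, amount=1):
    # The read-modify-write happens inside a single statement, so the
    # database, not the application, serializes concurrent increments.
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE counters SET value = value + ? WHERE name = ?",
            (amount, name),
        )


for _ in range(1000):
    increment(conn, "user")

(user_count,) = conn.execute(
    "SELECT value FROM counters WHERE name = ?", ("user",)
).fetchone()
```

The point is to avoid the SELECT-then-UPDATE pattern in application code, which is where lost updates usually come from. Writers to the same row still queue behind each other, so a very hot counter can become a bottleneck.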
One fairly simple trick with Cassandra is to have a particular row for usercount, and then insert a column with a unique ID (e.g. a random UUID) as its name and an empty value every single time a user is added. At regular intervals, count the number of columns and add that number to a total counter, removing the columns you've just counted.
At any time, your total user count is therefore [total counter]+[number of columns on your usercount row]. You can get these with essentially two reads, and if you have row cache enabled, it's going to be fast.
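The trick above can be modelled in a few lines. This is only an in-memory illustration with a hypothetical UserCountRow class, not Cassandra client code; the dict stands in for the columns of the usercount row:

```python
import uuid


class UserCountRow:
    """In-memory model: one wide row of marker columns plus a rolled-up total."""

    def __init__(self):
        self.total = 0     # the [total counter]
        self.columns = {}  # marker columns: unique column name -> empty value

    def add_user(self):
        # Each writer inserts its own column, so writers never update the same cell.
        self.columns[str(uuid.uuid4())] = ""

    def count(self):
        # Two reads: the total counter and the number of not-yet-counted columns.
        return self.total + len(self.columns)

    def roll_up(self):
        # Take a snapshot of the column names, add their count to the total,
        # then delete only the columns that were in the snapshot.
        counted = list(self.columns)
        self.total += len(counted)
        for name in counted:
            del self.columns[name]
        return len(counted)


row = UserCountRow()
for _ in range(250):
    row.add_user()
before = row.count()
rolled = row.roll_up()
row.add_user()  # arrives after the roll-up, stays as a column until the next one
after = row.count()
```

On a real cluster the roll-up is not a single atomic step, so a reader that lands between updating the total and deleting the columns could briefly see an inflated count. Running the roll-up from a single process at least keeps two roll-ups from counting the same columns.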
HBase has an incrementColumnValue method for a fast, atomic read/write operation.