
Monday, October 17, 2011

SQL Vs NoSQL


The advantage of a relational database is the ability to relate and index information. Most key-value systems don't provide that.
Does switching to NoSQL really make sense for the intended use case?
You have kind of missed the point. The point is that you generally don't have a queryable index. You don't have a centralized list of records or an easy way to relate them to each other. What makes NoSQL key-value stores so quick is that you store and retrieve exactly what you need by name. You need that blurb on someone's profile page? Just go fetch it. There is no need to maintain a table with everything in it. That said, NoSQL systems offer a number of novel structures that make many use cases trivially easy. For example, Redis is a data-structure-oriented database well suited to rapidly building anything with queues, and MongoDB is a free-form document database that stores documents in a JSON-like format.
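As a rough illustration of the name-based approach, here is a minimal in-memory sketch in Python. The dict and deque are only stand-ins for a real store, and the key names (`user:42:blurb`, `jobs`) and helper functions are made up for this example; this is not the Redis or MongoDB API.

```python
from collections import deque

# In-memory stand-ins for a key-value store (hypothetical, not a real client).
store = {}
queues = {}

def put(key, value):
    """Store a value under a name; no table or schema involved."""
    store[key] = value

def get(key, default=None):
    """Fetch a value directly by its name."""
    return store.get(key, default)

def enqueue(name, item):
    """Append an item to the tail of a named queue."""
    queues.setdefault(name, deque()).append(item)

def dequeue(name):
    """Remove and return the oldest item, or None if the queue is empty."""
    q = queues.get(name)
    return q.popleft() if q else None

put("user:42:blurb", "Likes databases.")
enqueue("jobs", "send-welcome-email")
enqueue("jobs", "resize-avatar")
```

Calling `get("user:42:blurb")` goes straight to the value by name, and `dequeue("jobs")` hands back items in the order they were queued. Nothing here can answer "which users mention databases?", which is the trade-off described above.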
Not everything really needs to be tabular.
There are advantages and disadvantages. Sometimes a mix of both makes sense: SQL for most things, and something along the lines of CouchDB for miscellaneous data that has no need to be clogging up an SQL table.
You can liken a key-value system to an SQL table with two columns: a unique key and a value. This is quite fast. There is no need to do any relations, correlations, or collation of data. Just find the value and return it. This is an oversimplification, though; NoSQL databases have a lot of nifty functions beyond simple K,V stores.
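The two-column analogy can be sketched with Python's built-in `sqlite3` module. The table name `kv`, the column names `k` and `v`, and the helper functions are assumptions chosen for the example, not anything prescribed by a particular database.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT NOT NULL)")

def kv_set(key, value):
    """Insert or overwrite the value for a key."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
        )

def kv_get(key):
    """Return the value for a key, or None if it is absent."""
    row = conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
    return row[0] if row else None

kv_set("user:42:blurb", "Likes databases.")
kv_set("user:42:blurb", "Likes key-value stores.")  # overwrites the first value
```

The primary key on `k` is what keeps the lookup in `kv_get` fast: it is a single indexed read with no joins involved.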
You'll find that a simple K,V store is also fast in SQL databases. I used one in place of an actual key-value system before NoSQL databases had matured a bit.
I do not think scientific data is generally well suited to a NoSQL implementation, but HBase is worth a look and may well suit a scientist's needs.
The efficiency comes from the following areas:
1. The database has far fewer functions: there is no concept of a join, and transactional integrity requirements are reduced or absent. Fewer functions mean less work, which means faster responses, on the server side at least.
2. Another design principle is that the data store lives across a cluster of servers, so any of several servers may answer your request. These systems also claim that the multi-server design improves fault tolerance through replication.
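The second point can be pictured with a toy model. This is only a sketch under simplifying assumptions: each dict stands in for one server, every write is copied to every live replica, and the `ReplicatedStore` class and its methods are hypothetical. Real systems also have to deal with consistency, network partitions, and recovery, none of which is modeled here.

```python
import random

class ReplicatedStore:
    """Toy key-value store that copies every write to all live replicas."""

    def __init__(self, replica_count=3):
        # Each dict stands in for one server in the cluster.
        self.replicas = [{} for _ in range(replica_count)]
        self.down = set()  # indexes of replicas simulated as failed

    def put(self, key, value):
        # Write to every replica that is currently up.
        for i, replica in enumerate(self.replicas):
            if i not in self.down:
                replica[key] = value

    def get(self, key):
        # Any live replica may answer; pick one at random.
        live = [i for i in range(len(self.replicas)) if i not in self.down]
        if not live:
            raise RuntimeError("no replicas available")
        return self.replicas[random.choice(live)].get(key)

    def fail(self, index):
        """Simulate one server going offline."""
        self.down.add(index)

cluster = ReplicatedStore()
cluster.put("user:42:blurb", "Likes databases.")
cluster.fail(0)  # one server dies; the others can still answer
```

After `cluster.fail(0)`, a call to `cluster.get("user:42:blurb")` is still served by one of the two remaining replicas, which is the fault-tolerance claim in miniature.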
