Wednesday, 26 March 2014

Benefits of a schema-less database

A typical argument for choosing a NoSql database (such as a document store or graph database) over a relational database (such as MySql) is that NoSql databases are schemaless and that makes you more agile.

So, what does this actually mean? What are the concrete benefits of being schema-less and how does it actually make you more agile as a developer or as an organization?

In this context we use the term agile not strictly as in Agile software development but more in term of just being faster, more flexible, nimble and adaptable to change.

Here are some of the real world reasons why schemaless makes you more agile:
  • When starting a new development project you don't need to spend the same amount of time on up-front design of the schema (which is likely to change several times anyway). Less need for planning -> more agile.  
  • Once the NoSql is up and running the developer does not need to know about the database. No need to learn SQL or database specific stuff and tools. Less to know about. Less to learn for new developers -> faster development -> more agile.
  • The rigid schema of a relational database (RDBMS) means you have to absolutely follow the schema. It can be harder to push data into the DB as it has to perfectly fit the schema. Being able to add data directly without having to tweak it to match the schema can save you time-> more agile.
  • Minor changes to the model and you will have to change both your code and the schema in the DBMS. No schema -> don't have to make changes in two places -> Less time consuming -> more agile.
  • With a NoSql DB you have fewer ways to pull the data out -> fewer ways to shoot yourself in the foot -> less chance of being bogged down in problems later (e.g. as unforeseen intricate future queries starts to manifest themselves as performance issues) -> more agile.
  • Schema-less -> Less overhead for DB engine -> better performance and scalability -> Less overhead for developers related to scalability -> Fewer devopment obstacles -> more agile. 
  • Changes to the DB schema will often break the solution for other developers and they have to keep updating the DB schema locally even if they are working on something completely unrelated. Syncing DB and code as you pull from your source code repository requires extra work and sometime even extra time trying to figure out why your solution don’t work after Git pull. No schema -> Fewer problems with code/schema not being in sync -> More agile development.
  • ALTER TABLE operations during deployment can be very heavy on the system and will usually result in downtime. Schema changes in general will break the system when code and schema is not in sync during deployment. No schema -> less downtime -> deploy whenever you like-> more agile.
  • Save time when the day comes when you have to denormalize tables because of performance issues. Less overhead related to schema -> more agile.
  • You don't have to deal with artificial link-tables (for many to many relations) because it is built into graph databases. -> fewer things to deal with -> easier to modify code/"schema" -> faster development -> more agile.
  • Eliminates the need for Database administrators or database experts -> fewer people involved and less waiting on experts -> more agile.
  • Less time needed to pick the right datatypes for new database columns -> save time -> more agile.
  • Schema-less -> better DB engine performance -> less chance of having to create separate systems to offload the DB because of heavy queries from big reports etc. -> less complicated system -> more agile. 
  • Save time writing complex SQL joins -> more rapid development -> more agile.
  • Rolling back code doesn't necessarily break the App, you might just not get the data you expect. Easier to deploy/rollback without serious consequences -> more agile.
  • As more and more applications/systems use your RDBMS it is harder to change the schema because of dependencies. Change the schema and you might f* up some other system. With NoSql schema changes other systems might not get the expected data but the chance of runtime errors is less. No schema -> less worry about messing up for other systems -> more agile.
In this blog post I have covered only the benefits of schema-free. There are also drawbacks but for the sake of brevity it is not included here. Being super agile might not be the top prioritization for all organizations.