Relational databases (Oracle, MySQL) are SQL and table oriented. The alternative databases, named NoSQL (initially read as "no SQL", later as "not only SQL"), provide a mechanism for storage and retrieval of data that is modelled by means other than the tabular relations used in relational databases. NoSQL broadly covers all non-relational databases.
NoSQL databases follow BASE (Basically Available, Soft state, Eventual consistency) properties instead of ACID (Atomicity, Consistency, Isolation, Durability) properties. Preferring BASE over ACID is a trade-off that pays off when we are dealing with large volumes of data and shifting gear from transaction processing to real-time applications or analytic operations. NoSQL database systems offer a flexible data model, higher scalability and superior performance; however, most of them also discard the very foundation that has made relational databases so useful for generations of applications: an expressive query language, secondary indexes and strong consistency. Refer to this article for a detailed explanation of BASE vs ACID.
The motivation behind the inclination towards NoSQL databases can be summarized as:
1. Need to handle multi-structured data types (structured, semi-structured, unstructured and polymorphic data) or to scale beyond the capacity constraints of existing systems. NoSQL databases support "horizontal" scaling to clusters of machines and are used extensively in big data and real-time web applications.
2. Desire to identify viable alternatives to expensive proprietary database software and hardware. Commodity hardware can be used to build the database ecosystem instead of a monolithic server.
3. Agility or speed of software development.
There is a long list of databases that come under the NoSQL ecosystem. Based on the data structure these databases use internally, they can be broadly classified into the following categories, each of which is called a data model in NoSQL terminology.
- Document model :- Document databases store data in documents made up of key-value pairs.
- Graph model - Graph databases use a graph data structure to store data. This model is used when traversing relationships is core to the application, as with social network connections.
Neo4j is a commonly known graph database, and Apache Giraph is a related graph processing framework. The diagram below shows how a graph model is created.
Image courtesy: Wikipedia
- Key-value model :- Key-value stores are the most basic type of non-relational database. The database is schemaless, and every item in it is stored as an attribute name (the key) together with its value. This model is commonly used for representing unstructured data. The data is considered inherently opaque to the database: it is accessed by its key, whatever the value happens to be. The following diagram represents a typical key-value store database. Examples of key-value model databases: Riak and Redis.
- Wide column model (columnar NoSQL database) :- The wide column model is similar to a traditional relational database (consisting of tables and columns), with one significant difference: the number of columns is not fixed for each record.
i.e. columns are created for each row rather than being predefined by the table structure.
Data retrieval is supported only by primary key, per column family.
Examples of wide column model databases: HBase and Cassandra.
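The data models described above can be sketched with plain Python types. This is only an illustration and assumes nothing about how Riak, Redis, HBase, Cassandra or Neo4j store data internally; the records, keys and helper functions (`kv_get`, `columns_of`, `find`, `friends_of_friends`) are all hypothetical names chosen for the example.

```python
# Toy illustration of four NoSQL data models using built-in Python types.
# No real database is involved; all names and records are hypothetical.

# Key-value model: the store sees only a key and an opaque value.
kv_store = {"user:1": b'{"name": "Asha", "city": "Pune"}'}

def kv_get(key):
    """Look up a value by key; the store never inspects the value."""
    return kv_store.get(key)

# Wide column model: each row key maps to its own set of columns,
# so rows in the same column family can have different columns.
column_family = {
    "user:1": {"name": "Asha", "city": "Pune"},
    "user:2": {"name": "Ravi", "email": "ravi@example.com", "age": 31},
}

def columns_of(row_key):
    """Return the column names present for one row, sorted for stability."""
    return sorted(column_family[row_key])

# Document model: each document is a set of key-value pairs,
# and a query may filter on any field.
documents = [
    {"_id": 1, "name": "Asha", "city": "Pune"},
    {"_id": 2, "name": "Ravi", "email": "ravi@example.com"},
]

def find(field, value):
    """Return every document whose given field equals the given value."""
    return [doc for doc in documents if doc.get(field) == value]

# Graph model: nodes plus directed edges; queries traverse relationships.
follows = {
    "Asha": {"Ravi"},
    "Ravi": {"Meera"},
    "Meera": set(),
}

def friends_of_friends(person):
    """People exactly two hops away, excluding direct links and the person."""
    direct = follows[person]
    two_hops = set()
    for friend in direct:
        two_hops |= follows[friend]
    return two_hops - direct - {person}
```

In this sketch the key-value lookup and the wide column lookup both start from a key, while `find` filters on an arbitrary field and `friends_of_friends` walks edges, which mirrors the access patterns each model is suited to.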
The diagram below summarises the various data models and the corresponding databases, with their strengths and weaknesses. (Diagram courtesy: Edureka)
- Document databases provide the ability to query on any field within a document.
- The document data model is often considered the most natural and productive because it maps directly to objects in modern object-oriented languages.
- Key-value stores and wide column stores provide a single means of accessing data: by primary key.
- Document databases and graph databases can be consistent or eventually consistent. MongoDB provides tunable consistency. By default, data is consistent: all writes and reads access the primary copy of the data. However, read operations can also be issued against secondary copies, where the data may be eventually consistent.
- Eventually consistent systems provide some advantages for inserts at the cost of making reads, updates and deletes more complex. In eventually consistent systems, there is a period of time in which not all copies of the data are synchronized.
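The difference between reading the primary copy and reading a secondary copy can be sketched as a toy simulation. This is a minimal model under stated assumptions, not MongoDB's API or replication protocol: the class `ReplicatedStore`, its methods and the `from_secondary` flag are hypothetical, and replication is triggered by hand to make the unsynchronized window visible.

```python
class ReplicatedStore:
    """Toy primary/secondary pair with explicit, delayed replication."""

    def __init__(self):
        self.primary = {}
        self.secondary = {}
        self.pending = []  # writes not yet applied to the secondary

    def write(self, key, value):
        # Writes always go to the primary copy first.
        self.primary[key] = value
        self.pending.append((key, value))

    def read(self, key, from_secondary=False):
        # Reads from the primary see every write; the secondary may be stale.
        source = self.secondary if from_secondary else self.primary
        return source.get(key)

    def replicate(self):
        # Apply queued writes in order, bringing the copies back in sync.
        for key, value in self.pending:
            self.secondary[key] = value
        self.pending.clear()


store = ReplicatedStore()
store.write("stock", 10)

stale = store.read("stock", from_secondary=True)   # secondary not yet updated
fresh = store.read("stock")                        # primary has the write

store.replicate()
synced = store.read("stock", from_secondary=True)  # copies now agree
```

Between `write` and `replicate` the two copies disagree, which is the window the last bullet describes. Choosing where to read from is the kind of trade-off a tunable consistency setting exposes.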