A database is an organized collection of data. Defining, querying, updating and administering a database requires a specially designed software application called a database management system (DBMS). A DBMS helps the user capture and analyze data. DBMSs are classified by the database model they support; the best known is the relational model, and a DBMS built on it is called a relational DBMS. The database model determines the logical structure of a database and the manner in which data can be stored, analyzed and manipulated.
The relational database model is based on first-order predicate logic: data is represented as tuples, and tuples are grouped into relations.
However, a relational DBMS is poorly suited to data that is not structured and relational, and the relational model does not adapt well to change. Data comes in many formats, such as hierarchies, cubes, linked lists and unstructured data, that cannot easily be organized into tables.
NoSQL (“not only SQL”) emerged as a solution. NoSQL database management systems allow data to be stored in a variety of formats, such as key-value stores, column stores, graph stores and document stores. The name “not only SQL” emphasizes that SQL-like query languages may also be supported. However, NoSQL systems generally do not guarantee the full ACID (atomicity, consistency, isolation and durability) properties. They remove hard constraints, such as tabular row storage and strict data definition, and use distributed architectures to support high throughput. NoSQL databases are widely used in big data and real-time web applications.
Relational databases are the most popular and widely used databases. The data model organizes data as tables, also called relations. Each table consists of rows and columns, as illustrated in Figure 1. Each row has a unique key.
Figure 1. An example of relational database model
Unlike NoSQL databases, relational databases have a fixed data model and structured data. They support transaction management and guarantee the ACID properties. Relational database management systems (RDBMSs), which are based on the relational model, have been developed for several decades and still dominate the database market. Because of these properties, they are widely deployed in banks, schools, hospitals, governments and so on.
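The two properties described above, a unique key per row and transactional (all-or-nothing) updates, can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table name student, its columns and the helper insert_batch are hypothetical, and the sample rows borrow the names used in the row-oriented example later in this text.

```python
import sqlite3

# Hypothetical table for illustration: each row is identified by a unique key (id).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER NOT NULL)"
)
with conn:  # commits on success, rolls back if an exception escapes the block
    conn.executemany(
        "INSERT INTO student VALUES (?, ?, ?)",
        [(1, "John Smith", 19), (2, "Jim Green", 18)],
    )


def insert_batch(connection, rows):
    """Insert all rows as one transaction; return False if it was rolled back."""
    try:
        with connection:
            connection.executemany("INSERT INTO student VALUES (?, ?, ?)", rows)
        return True
    except sqlite3.IntegrityError:
        return False


# The second row reuses key 1, so the whole batch is rejected,
# including the otherwise valid first row (atomicity).
ok = insert_batch(conn, [(3, "Lucy King", 16), (1, "Duplicate Key", 20)])
count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

After the failed batch the table still holds only the two original rows, which is the behaviour a bank or hospital system relies on when a multi-step update cannot be completed.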
Key-value store databases
A key-value store is one of the simplest database management systems. Data is stored as key-value pairs, as illustrated in Figure 2, and is retrieved by its key, so the complex querying and management functionality of an RDBMS is not needed.
The key is typically a string, and the value holds the actual data. The value can be of any data type available in a programming language, such as a string, an integer or an array, or an abstract object bound to the key. The data model is flexible, so the requirements on the format of the data are less strict.
Figure 2. An example of Key-Value database model
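The key-value model can be sketched in a few lines of Python. The class KeyValueStore and its methods are hypothetical names chosen for illustration, not the API of any particular product, and the sketch keeps everything in memory.

```python
class KeyValueStore:
    """A toy in-memory key-value store: string keys, values of any type."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Keys are strings in this sketch; values are not constrained by any schema.
        if not isinstance(key, str):
            raise TypeError("keys must be strings")
        self._data[key] = value

    def get(self, key, default=None):
        # Retrieval needs only the key; there is no query language.
        return self._data.get(key, default)

    def delete(self, key):
        # Return True if the key existed and was removed.
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.put("user:1:name", "John Smith")            # a string value
store.put("user:1:age", 19)                       # an integer value
store.put("user:1:tags", ["student", "admin"])    # an array value
```

The three put calls store values of three different types under the same scheme, which is the flexibility described above.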
Compared with common SQL databases, a key-value store has the advantage of storing and retrieving data quickly. This advantage holds when relations, correlations or collations of data are not needed. The same idea can be expressed as an SQL table with two columns, a key and a value: a query simply looks up the key and returns the value, which is very fast.
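The two-column arrangement mentioned above might look as follows, again using Python's sqlite3 module. The table name kv, the column names k and v, and the helpers kv_put and kv_get are assumptions made for this sketch.

```python
import sqlite3

kv_conn = sqlite3.connect(":memory:")
# Two columns only: the key (indexed as primary key) and the value.
kv_conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")


def kv_put(key, value):
    """Store or overwrite the value for a key."""
    with kv_conn:
        kv_conn.execute("INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value))


def kv_get(key):
    """Look up a single key; no joins or other tables are involved."""
    row = kv_conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
    return row[0] if row else None


kv_put("greeting", "hello")
kv_put("greeting", "hello again")  # overwrites the previous value
```

Every read is a single lookup on the primary key, which is why this access pattern stays fast when no relations between records are required.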
SQL has great advantages in dealing with structured data and allows highly dynamic queries. Current web applications, however, are a different case. They follow an object-oriented way of thinking, as in the back-end database of an MVC (Model-View-Controller) application, rather than relying on a highly dynamic range of queries full of outer and inner joins, unions and complex calculations over large tables. Moreover, translating between relational data models and object-oriented models results in complex hierarchies of tables because of the large amount of normalization involved. In key-value store databases, the data model is schemaless, and an object can be represented simply as a value with a key that identifies it. Arbitrary data can therefore be stored and retrieved through a single key. That is why a key-value store is also called a simple store.
Code that uses a key-value store tends to look clean and simple compared with code that embeds SQL strings in the programming language. Object-relational mapping frameworks, by contrast, add a lot of complex code between an SQL database and an object-oriented programming language.
Document store databases
Document store databases store data as documents in formats such as XML, PDF and JSON. A database holds a collection of documents, and each document contains a unique key, “ID”, that identifies it explicitly. An example of the data model is illustrated in Figure 3. Documents in a document store are similar to records in a relational database. The difference is that the document-oriented data model is more flexible because it is schemaless. New documents can be stored regardless of which attributes they contain, and new attributes can be added to existing documents at runtime.
Figure 3. The left figure is an example of document format of JSON and the right figure is an example of document format of XML.
Unlike relational databases, in which all records in the same table have the same data fields, document databases allow documents to hold similar as well as dissimilar data. The data model of document databases is slightly more complex than that of key-value databases: instead of key-value pairs, it can be represented as key-document pairs. A document database is not appropriate when the data has many relations and is heavily normalized. Document stores are instead used for content management systems, blog software and similar applications.
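The key-document idea, including documents with dissimilar fields and attributes added at runtime, can be sketched as follows. DocumentStore and its methods are hypothetical names for this illustration, and the blog-post documents are made-up sample data; real document databases offer far richer querying.

```python
import json


class DocumentStore:
    """A toy document store: each document is a JSON-like dict under a unique ID."""

    def __init__(self):
        self._docs = {}

    def insert(self, doc_id, document):
        # Round-trip through JSON so only JSON-compatible data is accepted
        # and the store keeps its own copy of the document.
        self._docs[doc_id] = json.loads(json.dumps(document))

    def get(self, doc_id):
        return self._docs.get(doc_id)

    def update(self, doc_id, **fields):
        # New attributes can be added to an existing document at runtime.
        self._docs[doc_id].update(fields)

    def find(self, **criteria):
        # Return the IDs of documents whose fields match all criteria.
        return [
            doc_id
            for doc_id, doc in self._docs.items()
            if all(doc.get(name) == value for name, value in criteria.items())
        ]


docs = DocumentStore()
# The two documents share some fields (title, author) and differ in others.
docs.insert("post-1", {"title": "Hello", "author": "John Smith", "tags": ["intro"]})
docs.insert("post-2", {"title": "Notes", "author": "Jim Green",
                       "comments": [{"by": "Lucy King", "text": "Nice"}]})
docs.update("post-1", published=True)  # attribute that did not exist before
```

No schema had to be changed to give post-1 a published attribute or to give post-2 nested comments, which suits content such as blog posts.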
Column store databases
Column store databases support the standard relational logical data model. A database consists of a collection of tables, and each table has a named collection of attributes; these attributes are stored as columns, instead of as rows as in the relational data model. Attributes can form a unique primary key, or a foreign key that refers to a primary key in another table.
The main difference between the two kinds of databases is that the data model of relational databases is row-oriented, whereas that of column store databases is column-oriented. Figure 4 illustrates the difference.
Figure 4. The upper figure illustrates the row-oriented data model and the lower figure illustrates the column-oriented data model.
According to Figure 4, a row-oriented database management system would store the data as “1, John Smith, 19; 2, Jim Green, 18; 3, Lucy King, 16; 4, Freda Ford, 15”, whereas a column-oriented database management system would store it as “1, 2, 3, 4; John Smith, Jim Green, Lucy King, Freda Ford; 19, 18, 16, 15”.
Column store databases store data in a way that allows it to be aggregated rapidly with less I/O activity, and they offer high scalability in data storage. They are efficient in applications such as customer relationship management (CRM) systems, electronic library card catalogs, data warehousing and other ad hoc query systems.
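The two layouts, and the reason aggregation touches less data in the column-oriented one, can be sketched with the four records from Figure 4. The column names id, name and age and the helper functions are assumptions made for this sketch; the figure's own column headings are not reproduced in the text.

```python
# Row-oriented layout: one tuple per record.
rows = [
    (1, "John Smith", 19),
    (2, "Jim Green", 18),
    (3, "Lucy King", 16),
    (4, "Freda Ford", 15),
]
COLUMN_NAMES = ("id", "name", "age")  # assumed names for the three attributes


def to_columns(records, names):
    """Regroup row tuples into one list per attribute (column-oriented layout)."""
    return {name: [record[i] for record in records] for i, name in enumerate(names)}


columns = to_columns(rows, COLUMN_NAMES)


def average_age_row_store(records):
    # Every full record has to be visited just to reach its age field.
    return sum(record[2] for record in records) / len(records)


def average_age_column_store(cols):
    # Only the age column is read; ids and names are never touched.
    ages = cols["age"]
    return sum(ages) / len(ages)
```

Both functions return the same average, but the column-oriented version reads a single contiguous list, which is the source of the reduced I/O for aggregate queries.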
Graph store databases
Graph store databases are commonly used to handle relationships because they manage heavily linked data efficiently (Neo4j). The graph data model contains nodes that represent entities; each node holds properties, of whatever types and number are appropriate, as key-value pairs. Figure 5 is an example of the graph data model. The connection between two nodes is expressed by a directed, named, semantic relationship, such as know, own or like, and relationships can also have properties. Two nodes can be connected by any number and any type of relationships. Because relationships are stored efficiently, this does not sacrifice performance.
Figure 5. An example of graph data model. The ellipses represent nodes. Each node is a data entity with types and values. The arrows represent connections with relationships and their properties.
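A property graph of this kind can be sketched in plain Python. The classes Node, Relationship and Graph, and the sample entities, are hypothetical and chosen for illustration; this is not the Neo4j API, only a minimal model of nodes with properties and directed, named relationships.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    properties: dict = field(default_factory=dict)  # key-value pairs


@dataclass
class Relationship:
    start: str   # ID of the source node
    name: str    # relationship name, e.g. "know", "own", "like"
    end: str     # ID of the target node
    properties: dict = field(default_factory=dict)


class Graph:
    def __init__(self):
        self.nodes = {}
        self.outgoing = {}  # node ID -> list of relationships leaving that node

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = Node(node_id, properties)
        self.outgoing.setdefault(node_id, [])

    def relate(self, start, name, end, **properties):
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("both endpoints must be existing nodes")
        relationship = Relationship(start, name, end, properties)
        self.outgoing[start].append(relationship)
        return relationship

    def neighbours(self, start, name=None):
        # Follow outgoing relationships, optionally filtered by relationship name.
        return [r.end for r in self.outgoing[start] if name is None or r.name == name]


graph = Graph()
graph.add_node("alice", type="person", age=19)
graph.add_node("bob", type="person", age=18)
graph.add_node("car", type="vehicle", brand="Volvo")
graph.relate("alice", "know", "bob", since=2010)  # relationship with a property
graph.relate("alice", "like", "bob")              # second relationship, same pair
graph.relate("alice", "own", "car")
```

Because each node keeps a list of its own outgoing relationships, following a link does not require scanning or joining other records, and two nodes can be linked by several relationships at once.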