There are many types of databases, and beginners are often confused about which one to use in which case. Columnar databases add to that confusion. In this article, we will look at the difference between row-based and columnar databases and where columnar databases are a good fit.
Difference between columnar and row-based databases
The main difference between the two is the way they store data on disk. In row-based databases, each row is stored in consecutive locations on disk.
In a columnar database, each column is stored in consecutive locations on disk.
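To make the two layouts concrete, here is a minimal sketch in Python. The sample table and the helper names row_layout and column_layout are made up for illustration, and a flat list stands in for consecutive disk locations. Real databases add pages, compression and indexes on top of this idea.

```python
# Hypothetical sample table: each tuple is (name, age, address).
rows = [
    ("Alice", 30, "Pune"),
    ("Bob", 25, "Delhi"),
    ("Carol", 35, "Mumbai"),
]


def row_layout(table):
    """Flatten row by row: all fields of row 1, then all fields of row 2, ..."""
    return [value for row in table for value in row]


def column_layout(table):
    """Flatten column by column: all names, then all ages, then all addresses."""
    return [value for column in zip(*table) for value in column]


row_store = row_layout(rows)
column_store = column_layout(rows)

print(row_store)
print(column_store)
```

In the row layout the three fields of one user sit next to each other, while in the column layout all the ages sit next to each other.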
Now, let’s take a look at the advantages and disadvantages of each.
In terms of writes
Writes are easier in row-based databases. Why? Because a new row is written to the next consecutive locations on disk, so there are fewer seek operations and the work is mostly sequential writing.
Writes have higher latency in a columnar database, since the values of a new row belong to different columns: for each column, the database has to find where that column is stored and the offset at which the next value can be written.
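The write cost can be sketched with a toy model in Python. The functions below are hypothetical, and they only count how many separate storage regions one insert touches. They ignore buffering, batching and the other tricks real databases use to soften this cost.

```python
def append_to_row_store(store, row):
    """Append a whole row at the end of one region; returns regions touched."""
    store.extend(row)
    return 1  # one sequential write at the end of the store


def append_to_column_store(columns, row):
    """Append each value to the end of its own column; returns regions touched."""
    for column, value in zip(columns, row):
        column.append(value)
    return len(columns)  # one write per column region


row_store = ["Alice", 30, "Pune"]
column_store = [["Alice"], [30], ["Pune"]]  # names, ages, addresses

new_row = ("Bob", 25, "Delhi")
row_regions = append_to_row_store(row_store, new_row)
column_regions = append_to_column_store(column_store, new_row)

print(row_regions, column_regions)
```

With three columns, the columnar insert touches three regions instead of one, and the gap grows with the number of columns.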
In terms of reads
Reads depend on what you want to read. If your use case is row-wise reads, then a row-based database is certainly the choice, but if you want to read all the data of any one column, columnar databases are a great fit.
Now you must be wondering in what scenario we would read a whole column. These use cases come up in data analytics with big data.
Look at the following problem.
Given user data with name, age and address, find the average age and draw the five-number summary (box plot) of the ages. Both tasks need only the age column, which is exactly the case described above. So wherever you see this kind of use case, you can use a columnar database.
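The problem above can be sketched in Python as follows. The user data is invented for illustration, and a dictionary of lists stands in for a columnar store. The point is that only the age column is read, while the name and address columns are never touched.

```python
import statistics

# Hypothetical columnar store: one list per column.
users = {
    "name": ["Alice", "Bob", "Carol", "Dave", "Eve"],
    "age": [30, 25, 35, 22, 41],
    "address": ["Pune", "Delhi", "Mumbai", "Chennai", "Kolkata"],
}

# Only the "age" column is scanned.
ages = users["age"]

average_age = statistics.mean(ages)

# Five-number summary: minimum, first quartile, median, third quartile, maximum.
q1, median, q3 = statistics.quantiles(ages, n=4, method="inclusive")
five_number_summary = (min(ages), q1, median, q3, max(ages))

print(average_age)
print(five_number_summary)
```

In a row-based layout, the same query would have to step over every name and address to reach each age.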
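A few columnar databases: Amazon Redshift, ClickHouse. Cassandra is often listed here too, but it is a wide-column store, which keeps the data of a row together within a partition, so it is not columnar in the sense described in this article.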