The basic data storing can be done through spreadsheet but when one needs to store and maintain a large amount of data, the database is needed. And when we talk about database, what immediately follows is the term ‘DBMS’ and the ‘RDBMS’. So what is DBMS and what is RDBMS? How are they related and what is the difference between DBMS and RDBMS? The DBMS, Data Base Management System, is nothing but a stand alone software application which stores your data in data files, for efficient management of the data. The data is properly structured and well indexed for easy manipulation in DBMS. The software applications like Microsoft Access, Oracle, MySQL, Microsoft SQL Server, Filemaker, falls under DBMS category. These software applications helps user to search, modify, save and retrieve any data from its database. The DBMS is further structured into three major categories; hierarchical, network, and relational.
The hierarchical databases are more like a tree structure which is designed like a pyramid and has branches spreading wide. The example can be of an organization structure which has many departments, sub-departments under its wing. The network database approach is very similar to hierarchical databases but looks more like cobweb in its structure, for instance, a sales department can have address, telephone no. of sales employees which can be accessed by the HR department and by the top management to get the record of each employee whereabouts. It has many owners/parents unlike hierarchical databases. Now dealing with these kind of structure, requires a lot of work in advance and there is limit to access the records. The scalability options to operate a huge database were found to be missing in the above categories of database, which lead to the introduction of RDMS i.e. Relational Database Management System.
This category of database was invented by Edgar Frank “Ted” Codd who wanted a relation to be established between the tables of the database. He also made thirteen rules to follow while dealing with the RDBMS. The RDMS is the most widely used in World Wide Web as there can be infinite numbers of tables’ and each table holding its own distinct information and at the same time talking to one another (Relation). The data in a RDBMS connects the tables, by using a common data element that is followed between the tables.
What is Normalization?
The relational databases comprises of indexes, key elements, and tables. The rows of the tables are called ‘Tuples’. The normalization technique is adopted when it comes to designing of data tables. The term ‘Normalization’ refers to certain methods to be followed for organization of data in the database and when these standards are met through your designed data tables; they are called ‘Data normalization’. This technique helps in removal of unnecessary data’s and a cleaner organization of data and its tables. The first normal form emphasizes on the need to organize the larger data into smaller logical units in the form of ‘tables’, provide one unique identity to a column in each row (Primary key) and to avoid any multiplicity in values (no repetitive values). For example, you have customer purchase record, which would have columns such as customer id, customer name, items bought and the cost of the item. Now the ‘products bought’ and the ‘price of the products’ could have multiple entries in the column field in case the customer has purchased more than one item and which would actually clutter the data. In order to avoid this, the data should be divided into two tables namely ‘Customer details’ and ‘Customer purchase record’. The first table could have customer id, customer name and the second table could have items bought, customer id and the cost of the item.
The second normal form reiterates on meeting of the requirements of the first normal form and also states that each non-key column should be related to primary key.
Item Purchase Table
|Item ID (Primary key)||Customer Name||Items purchased||Price paid||Customer address|
|001||ABC||Furniture||5,000||121, south city, GGN|
Now, in the above table, the item id is the primary key and the item purchased and price paid are part of the item id. But the customer address is associated more with the customer than the item id, so we can break this data into two tables.
|Customer Name (Primary key)||Customer address|
|ABC||121, south city, GGN|
Item Purchase Table
|Item ID (Primary key)||Customer Name||Items purchased||Price paid|
This approach would ease out the modification of address if required without disturbing the item purchase table. However in case one has to obtain the information of the address on which the item has been delivered, then one has to do the ‘JOIN’ operation.
The third normal form comes in picture only when the first and the second normal form requirements are fulfilled and the third normal form focuses on elimination of the columns that are not supportive to the primary key.
Item Purchase Table
|Item ID (Primary key)||Customer ID||Customer Name||Items purchased||Price paid||Customer address|
|001||8888||ABC||Furniture||5,000||121, south city, GGN|
The above table has customer ID, customer name and the customer address which can be moved as a separate table.
|Customer ID (Primary key)||Customer Name||Customer address|
|8888||ABC||121, south city, GGN|
Referential integrity between the tables
The principle that relational database holds, is that each table would consist of a primary key. Take for instance, ‘customer id ‘ which would be unique & can be set as primary key for ‘Customer Details’ table. Now a link can be established when the same customer id data is used in Customer purchase record table and is set as foreign key. This connection leads to referential integrity between the tables which enforce the relationship between the tables to remain constant i.e. No record should be added/deleted in the purchase table unless there is a corresponding addition or deletion is made in the customer details table as the ‘email id’ acts as a common element in the connected tables. This kind of relationship logic saves a lot of time and the data tables can be efficiently managed.
The SQL (Sequential Query Language) is the standard language followed for the management of relational databases. The Firebirds, PostgreSQL, My SQL, are the open source RDBMS. The vendors to these databases follow the ANSI ’95 (American National Standards Institute) SQL standards.
The difference between DBMS and RDBMS in tabular form
|Database Management System||DBMS with referential integrity between the tables|
|No normalization technique||Normalization technique adopted|
|Inclusion of flat file data in its system||Non-acceptance of flat file design|
|Used for simple business applications||Huge database applications|
|Foreign key support||Relationship established through foreign key only|
|Supported Languages: Programming Language, Data Manipulation Language, Data Definition Language, Schema Description Language, Sub-Schema Description Language||Only SQL (Sequential Query Language) supported|
|Structured into three major categories: network, hierarchical and relational||Follows relational model only|
|No relation between tables||Relation between the tables can be established|