Python Data Persistence – Relational Database
The term ‘database’ refers to an organized collection of data so as to remove redundancy and inconsistency, and to ensure data integrity. Over the years, different database models have been in use. The early days of computing observed the use of hierarchical and network database models. Soon, they were replaced by the relational database model, which is still used very predominantly. The last 10-15 years have seen the emergence of NoSQL databases like MongoDB and Cassandra.
The relational database model, proposed by Edgar Codd in 1970, aims to arrange data according to the entities. Each entity is represented by a table (called a relation). You can think of the entity as a class. Just as a class, an entity is characterized by attributes (also called fields, in the database terminology) that form columns of the table. Each instance of the entity is described in subsequent rows, below the heading row. The entity table structure provides one attribute whose value is unique for each row. Such an attribute is called the ‘primary key’.
If we analyze the pricelist example above, it involves three entities, Customers, Products, and Invoices. We have prepared three tables representing them, as fol\o\\;s:(figure7.1)
- Data Persistence – Python – PyMongo
- Python Data Persistence – Python – Cassandra
- Python Data Persistence – RDBMS Concepts
The important aspect of relational database design is to establish a relationship between tables. In the three tables above, the attributes ‘prodlD’, ‘CustID’, and ‘InvNo’ are primary keys in products, customers, and invoices tables respectively.
Further, the structure of the ‘invoices’ table uses ‘CustID’ and ‘ProductID’ attributes which are the primary keys of the other two tables. When the primary key of one table appears in the structure of other tables, it is called ‘Foreign key’ and this forms the basis of the relationship between the two.
This approach of database design has two distinct advantages. Firstly, using the relationship between primary and foreign keys, details of the corresponding row can be fetched without repetition. For example, the ‘invoices’ table has the ‘ProdlD’ foreign key which is the primary key in the ‘Products’ table, hence the ‘name’ and ‘price’ attributes can be fetched using this relationship. The same is true about ‘CustID’ which appears as the foreign key in ‘invoices’ and is the primary key in the ‘customers’ table. We can thus reconstruct the original price list table by using relationships.
Inv No | CustID | Customers.CustName | ProdID | Products.Proname | Product.Price | Qty | Total |
1 | 1 | Ravikumar | 1 | Laptop | 25000 | 2 | 50000 |
2 | 2 | John | 2 | TV | 40000 | 1 | 40000 |
3 | 3 | Divya | 1 | Laptop | 25000 | 1 | 25000 |
4 | 3 | Divya | 3 | Mobile | 15000 | 3 | 45000 |
5 | 2 | John | 3 | Mobile | 15000 | 2 | 30000 |
6 | 1 | Ravikumar | 2 | TV | 40000 | 1 | 40000 |
Secondly, you need not make any changes in the ‘invoices’ table, if either name of product or price changes, change in the ‘Products’ table will automatically reflect in all rows of invoices table because of the primary- foreign key relationship. Also, the database engine won’t allow deleting a certain row in the customers or products table, if its primary key is being used as foreign keys in the invoices table. This ensures data integrity.
Software products based on this relational model are popularly called Relational Database Systems (RDBMS). Some of the renowned RDBMS brands are Oracle, MySQL, MS SQL Server, Postgre SQL, DB2. SQLite, etc.