Python Data Persistence – Overriding

Python Data Persistence – Overriding

With two similar radii for the circle, the equation of the area of the ellipse (π*r1 *r2) turns out to be 7rr2, and the equation of perimeter of the ellipse (2π effectively becomes 27πr. However, we would like to redefine area( ) and perimeter( ) methods, which are inherited by circle class, to implement these specific formulae. Such redefinition of inherited methods is called method overriding.
Modify circle class as per following code:

Example

class circle(ellipse):
        def___init___(self, r1, r2=None):
                super().__init___(r1,r2)
                self.radius2=self.radiusl
        def area(self) :
                 area=math.pi*pow(self.radius1,2)
                  return area
       def perimeter (self) :
           perimeter=2*math.pi*self.radius1
           return perimeter

The result will be unaffected though. Python class can be inherited from multiple classes by putting names of more than one class in parentheses of the class definition.

Protected?

As mentioned earlier, Python doesn’t believe in restricting access to attributes. To emulate the behavior of Java-like protected access specifier, a single underscore character is prefixed to the instance attribute. However, for all practical purposes, it behaves like a public attribute, not even requiring name mangling syntax as in the case of the ‘private’ attribute (prefixed by double underscores). The _ character serves merely as a deterrent expecting responsible programmer to refrain from using it outside inherited class.

Python Data Persistence – Inheritance

Python Data Persistence – Inheritance

Of all features of object-oriented programming methodology, inheritance is arguably the most important, and hence the most used feature in software development. Leaving aside its literary meaning, inheritance in the context of programming refers to the mechanism by which features of one class (attributes and methods) are made available to another class that has the option to define additional resources or modify the functionality of inherited methods.

Python makes extensive use of inheritance. As we have seen, each class is inherited from the built-in object class. Inheritance enables a template definition of properties in a parent class to be implemented by its subclass for a more specific purpose.

One such case in point is the relationship between built-in classes ‘int’ and ‘bool’. Inheritance comes into the picture whenever there is ‘IS A’ kind of relationship between classes. Here a ‘bool’ object IS A(n) ‘int’ right? Hence ‘int’ class is a superclass of the ‘bool’ class. As you know a bool object has only two possible values True and False. They are equivalent to 1 and 0 respectively which are integers by the way.

Let us establish the inheritance of two customized classes with the following example. Ellipse an elongated circle with two radii and one radius is larger than the other. The circle can be seen as a more specific case of ellipse where two radii are equal. So circle IS an A(n) ellipse! We are going to define an ellipse class with two instance attributes and two methods area( ) and perimeter( ).

Example

#inheritEllipse.py
import math
class ellipse:
      def__init__(self, r1, r2):
            self.radius1=r1
            self.radius2=r2
     def area(self):
           area=math.pi*self.radiusl*self.radius2
             return area
def perimeter(self):
            perimeter=2 *math.pi*math. 
sqrt((pow(self.radius1,2)+pow(self.radius2,2))/2)
           return perimeter

 

Note that formula for area of ellipse = π*rl *r2 and perimeter of ellipse =2π

Let us import this class in the interpreter and declare objects as follows:

Example

>>> from inheritEllipse import ellipse
>>> e1=ellipse (20,30)
>>> e1.area()
1884.9555921538758
>>> e1.perimeter()
160.19042244414092
>>> e1=ellipse(20,20)
>>> e1.area()
1256.6370614359173
>>> e1.perimeter()
125.66370614359172

Note that in the second case both radii are equal, hence the ellipse happens to be a circle. We now design a circle class using the ellipse class as its parent. Add following lines in inheritEllipse .py code.

Example

class circle(ellipse) :
           def__init___(self, r1, r2=None):
                    super ( ) .__init__(r1 , r2)
                             self.radius2=self.radius1

Just import this inherited circle class, declare an object with a radius of 20. The result shows that the perimeter and area match that of the ellipse object with identical radii.

Example

>>> from inheritEllipse import ellipse, circle
>>> c1=circle(20)
>>> c1.area()
1256.6370614359173
>>> c1.perimeter()
125.66370614359172
> > >

 

Python Data Persistence – Class Level Attributes and Methods

Python Data Persistence – Class Level Attributes and Methods

In the above example, MyClass defines two data variables __myname and __myage that are instance attributes. They are invariably initialized through __init__( ) constructor. Their values are different for each object. A class however may have an attribute such that its value is the same for all existing objects. In other words, such attribute is a shared or common resource and defined outside the __init__( ) method, for that matter outside any instance method.

In the following script, the total is a class variable of a player class. It is defined with the purpose of maintaining a running sum of runs scored by each player. The player class defines a name and runs as instance attributes initialized through __init__( ) method as usual which also keeps on adding runs of each object.

Example

#classattr.py
class player:
      __total=0
      def__init__(self, name, runs):
            self.__name=name
            self.___runs=runs
            player.__total+=self.__runs
            print ('Total runs so far:',player.___total)

Let us import the player class and set up a few objects. Following interpreter, activity shows that the ___total variable is being cumulatively updated by ___runs of each object.

Example

>>> from classattr import player
>>> p1=player('Virat', 60)
Total runs so far: 60
>>> p2=player('Rahul', 45)
Total runs so far: 105

Two things are to be noted here. First, the use of the += operator. It is an in¬place addition operator effectively assigning the addition of two operands back to the left operand. Hence player. total==+=self. runs actually becomes player. total=player. total+self.runs . In-place variations of other operators defined in Python are +=, -=, *=, /=, and so on.

Secondly, the value of __total is retrieved with the help of the name of the class (player.__total) rather than self. This is obvious because the total is a class variable and not an instance variable specific to any particular object.
In view of this feature, Python has a provision to define methods that can access such class attributes. A class method needs the name of the class to be passed to it as an argument (conventionally using ‘cls’ identifier). The class can also have a static method that doesn’t need an explicit reference to either class or object which means there’s no argument to it in the form of self or els.

Class method and static method is decorated by built-in @classmethod and @statiemethod directives.

Example

#classattr.py
class player:
       ___total = 0
       def__init__(self, name, runs):
               self.___name=name
               self.___runs=runs
               player.___total+=self.___runs
               print ('Total runs so far:',player.___total)
      @classmethod
      def printtotal(cls):
          print ('Total runs so far:',els.___total)
      @staticmethod
      def displaytotal( ):
                print ('Total runs so far:',player.__total)

Output

>>> from classattr import player 
>>> p1=player('Virat',60)
Total runs so far: 60 
>>> p2=player('Rahul',45)
Total runs so far: 105 
>>> player.printtotal( )
Total runs so far: 105 
>>> player.displaytotal( )
Total runs so far: 105

Python Data Persistence – Relational Database

Python Data Persistence – Relational Database

The term ‘database’ refers to an organized collection of data so as to remove redundancy and inconsistency, and to ensure data integrity. Over the years, different database models have been in use. The early days of computing observed the use of hierarchical and network database models. Soon, they were replaced by the relational database model, which is still used very predominantly. The last 10-15 years have seen the emergence of NoSQL databases like MongoDB and Cassandra.

The relational database model, proposed by Edgar Codd in 1970, aims to arrange data according to the entities. Each entity is represented by a table (called a relation). You can think of the entity as a class. Just as a class, an entity is characterized by attributes (also called fields, in the database terminology) that form columns of the table. Each instance of the entity is described in subsequent rows, below the heading row. The entity table structure provides one attribute whose value is unique for each row. Such an attribute is called the ‘primary key’.

If we analyze the pricelist example above, it involves three entities, Customers, Products, and Invoices. We have prepared three tables representing them, as fol\o\\;s:(figure7.1)

Python Data Presistence - Relational Database chapter 7 img 1

The important aspect of relational database design is to establish a relationship between tables. In the three tables above, the attributes ‘prodlD’, ‘CustID’, and ‘InvNo’ are primary keys in products, customers, and invoices tables respectively.
Further, the structure of the ‘invoices’ table uses ‘CustID’ and ‘ProductID’ attributes which are the primary keys of the other two tables. When the primary key of one table appears in the structure of other tables, it is called ‘Foreign key’ and this forms the basis of the relationship between the two.

This approach of database design has two distinct advantages. Firstly, using the relationship between primary and foreign keys, details of the corresponding row can be fetched without repetition. For example, the ‘invoices’ table has the ‘ProdlD’ foreign key which is the primary key in the ‘Products’ table, hence the ‘name’ and ‘price’ attributes can be fetched using this relationship. The same is true about ‘CustID’ which appears as the foreign key in ‘invoices’ and is the primary key in the ‘customers’ table. We can thus reconstruct the original price list table by using relationships.

Inv No CustID Customers.CustName ProdID Products.Proname Product.Price Qty Total
1 1 Ravikumar 1 Laptop 25000 2 50000
2 2 John 2 TV 40000 1 40000
3 3 Divya 1 Laptop 25000 1 25000
4 3 Divya 3 Mobile 15000 3 45000
5 2 John 3 Mobile 15000 2 30000
6 1 Ravikumar 2 TV 40000 1 40000

Secondly, you need not make any changes in the ‘invoices’ table, if either name of product or price changes, change in the ‘Products’ table will automatically reflect in all rows of invoices table because of the primary- foreign key relationship. Also, the database engine won’t allow deleting a certain row in the customers or products table, if its primary key is being used as foreign keys in the invoices table. This ensures data integrity.

Software products based on this relational model are popularly called Relational Database Systems (RDBMS). Some of the renowned RDBMS brands are Oracle, MySQL, MS SQL Server, Postgre SQL, DB2. SQLite, etc.

Python Data Persistence – INSERT Statement

Python Data Persistence – INSERT Statement

Now that we have created tables in our database, let us add few records to them. SQL provides an INSERT statement for the purpose. Its standard syntax is as follows:

Example

INSERT INTO table name (coll, col2, ...) VALUES (vail, val2 , val3 , ...) ;

Name of the table in which a new record (row) is to be added, follows mandatory keywords INSERT INTO. The column list is given after the name in parentheses, which is followed by the VALUES clause. The data corresponding to each column is given in another set of parentheses. The following statement adds one record in the Products table:

sqlite> INSERT INTO Products (Name, Price) VALUES ('Laptop1, 40000);

We insert a row in the ‘Customers’ table by executing the following statement in the SQLite console:

sqlite> INSERT INTO Customers (Name, GSTIN) VALUES ('Ravikumar', 127AAJPL7103N1ZF');

Similarly, the following statement adds a record in the ‘Invoices’ table:

sqlite> INSERT INTO Invoices (CUSTID, PRODUCTID, Quantity) VALUES (1, 1, 2);

Note that, in the above INSERT statements, we have not included ProductID, CustID, and InvID columns in respective column list parentheses because they have been defined as autoincrement fields. The column list may be omitted altogether if you intend to provide values for all columns in the table (excluding autoincrement fields). They must be given in the VALUES list exactly in the same order in which their fields have been defined. You may add a few more records to these three tables. Sample data for these tables is given below: (table 7.3, table 7.4, and table 7.5)

Product ID Name Price
1 Laptop 25000
2 TV 40000
3 Router 2000
4 Scanner 5000
5 Printer 9000
6 Mobile 15000
CustID Name GSTIN
1 Ravikumar 27AAJPL7103N1ZF
2 Patel 24ASDFG1234N1ZN
3 Nitin 27AABBC7895N1ZT
4 Nair 32MMAF8963N1ZK
5 Shah 24BADEF2002N1ZB
6 Khurana 07KABCS1002N1Z V
7 Irfan 05IIAAV 5103N1ZA
8 Kiran 12PPSDF22431ZC
9 Divya 15ABCDE1101N1ZA
10 John 29AAEEC4258E1ZK
Inv ID CustID Product ID Quantity
1 1 1 2
2 10 2 1
3 9 6 3
4 4 1 6
5 10 5 3
6 2 2 5
7 2 1 4
8 5 3 10
9 7 5 2
10 3 4 3

 

 

Python Data Persistence – RDBMS Products

Python Data Persistence – RDBMS Products

Relational Software Inc. (now Oracle Corp) developed its first SQL-based RDBMS software called Oracle V2. IBM introduced System-R as its RDBMS product in 1974 and followed it by a very successful DB2 product.
Microsoft released SQL Server for Windows NT in 1994. Newer versions of MS SQL server are integrated with Microsoft’s .NET Framework.
SAP is an enterprise-level RDBMS product targeted towards UNIX-based systems being marketed as Enterprise Resource Planning (ERP) products.

An open-source RDBMS product named MySQL, developed by a Swedish company MySQL AB was later acquired by Sun Microsystems, which in turn has, now, been acquired by Oracle Corporation. Being an open-source product, MySQL is a highly popular choice, after Oracle.
MS Access, shipped with the Microsoft Office suite, is widely used in small-scale projects. The entire database is stored in a single file and, hence, is easily portable. It provides excellent GUI tools to design tables, queries, forms, and reports.

PostgreSQL is also an open-source object-oriented RDBMS, which has evolved from the Ingres project of the University of California, Berkley. It is available for use on diverse operating system platforms and SQL implementation is supposed to be closest to SQL standard.
SQLite is a very popular relational database used in a wide variety of applications. Unlike other databases like Oracle, MySQL, etc., SQLite is a transactional SQL database engine that is self-contained and serverless. As its official documentation describes, it is a self-contained, serverless, zero-configuration, transactional SQL database engine. The entire database is a single file that can be placed anywhere in the file system.

SQLite was developed by D. Richard Hipp in 2000. Its current version is 3.27.2. It is fully ACID compliant which ensures that transactions are atomic, consistent, isolated, and durable.
Because of its open-source nature, very small footprint, and zero-configuration, SQLite databases are popularly used in embedded devices, IoT, and mobile apps. Many web browsers and operating systems also use SQLite database for internal use. It is also used as prototyping and demo of larger enterprise RDBMS.

Despite being very lightweight, it is a full-featured SQL implementation with all the advanced capabilities. SQLite database can be interfaced with most of the mainstream languages like C/C++, Java, PHP, etc. Python’s standard library contains the sqlite3 module. It provides all the functionality for interfacing the Python program with the SQLite database.

7.4 SQLite installation

Installation of SQLite is simple and straightforward. It doesn’t need any elaborate installation. The entire application is a self-contained executable ‘sqlite3.exe’. Official website of SQLite, (https://sqIite.org/download.html) provides pre-compiled binaries for various operating system platforms containing the command line shell bundled with other utilities. All you have to do is download a zip’archive of SQLite command-line tools, unzip to a suitable location and invoke sqlite3.exe from DOS prompt by putting name of the database you want to open.
If already existing, the SqLite3 database engine will connect to it; otherwise, a new database will be created. If the name is omitted, an in memory transient database will open. Let us ask SQLite to open a new
mydatabase.sqlite3.

E:\SQLite>sqlite3 mydatabase.sqlite3
SQLite version 3.25.1 2018-09-18 20:20:44
Enter “.help” for usage hints.
sqlite>

In the command window a sqlite prompt appears before which any SQL query can be executed. In addition, there “dot commands” (beginning with a dot “.”) typically used to change the output format of queries, or to execute certain prepackaged query statements.
An existing database can also be opened using .open command.

E:\SQLite>sqlite3
SQLite version 3.25.1 2018-09-18 20:20:44
Enter “.help” for usage hints.
Connected to a transient in-memory database.
Use “.open FILENAME” to reopen on a persistent database.
sqlite> .open test.db

The first step is to create a table in the database. As mentioned above, we need to define its structure specifying name of the column and its data type.

SQUte Data Types

ANSI SQL defines generic data types, which are implemented by various RDBMS products with a few variations on their own. Most of the SQL database engines (Oracle, MySQL, SQL Server, etc.) use static typing. SQLite, on the other hand, uses a more general dynamic type system. Each value stored inSQLite database (or manipulated by the database engine) has one of the following storage classes:
• NULL
• INTEGER
• REAL ‘
• TEXT
• BLOB
A storage class is more general than a datatype. These storage classes are mapped to standard SQL data types. For example, INTEGER in SQLite has a type affinity with all integer types such as int, smallint, bigint, tinyint, etc. Similarly REAL in SQLite has a type affinity with float and double data type. Standard SQL data types such as varchar, char,nchar, etc. are equivalent to TEXT in SQLite.
SQL as a language consists of many declarative statements that perform various operations on databases and tables. These statements are popularly called queries. CREATE TABLE query defines table structure using the above data types.

CREATE TABLE

This statement is used to create a new table, specifying following details:
• Name of new table
• Names of columns (fields) in the desired table
• Type, width, and the default value of each column.
• Optional constraints on columns (PRIMARY KEY, NOT NULL, FOREIGN KEY)

Example

CREATE TABLE table_name (
column 1 datatype [width] [default] [constraint],
column2 ,
column3 …,
) ;

Python Data Persistence – Constraints

Python Data Persistence – Constraints

Constraints enforce restrictions on data that a column can contain. They help in maintaining the integrity and reliability of data in the table. Following clauses are used in the definition of one or more columns of a table to enforce constraints:

PRIMARY KEY: Only one column in a table can be defined to be a primary key. The value of this table will uniquely identify each row (a record) in the table. The primary key can be set to AUTOINCREMENT if its type is INTEGER. In that case, its value need not be manually filled.

NOT NULL: By default value for any column in a row can be left null. NOT NULL constraint ensures that while filling a new row in the table or updating an existing row, the contents of specified columns are not allowed to be null. In the above definition, to ensure that the ‘name’ column must have a certain value, NOT NULL constraint is applied to it.

FOREIGN KEY: This constraint is used to enforce the ‘exists’ relationship between two tables.
Let us create a Products table in ‘mydatabase’ that we created above. As shown in Figure 7.1, diagram, the ‘products’ table consists of ProductID, Name, and Price columns, with ProductlD as its primary key.

sqlite> CREATE TABLE Products (
. . . > ProductID INTEGER PRIMARY KEY
AUTOINCREMENT, 
...> Name TEXT (20),
...> Price INTEGER
. . . > ) ;

(Ensure that the SQL statement ends with a semi-colon. You may span one statement over multiple lines in the console)
We also create another ‘Customers’ table in the same database with CustlD and Name fields. The CustlD field should be defined as the primary key.

sqlite> CREATE TABLE Customers (
. . . > CustlD INTEGER PRIMARY KEY
AUTOINCREMENT, 
. . . > Name TEXT (20) ,
. . . > GSTIN TEXT (15)
. . . . > ) ;

Finally, we create another ‘Invoices’ table. As shown in the figure 7.1 diagram, this table has InvID as the primary key and two foreign key columns referring to ProductID in ‘Products’ table and CustlD in the ‘Customers’ table. The ‘Invoices’ table also contains the ‘price’ column.

sqlite> CREATE TABLE Invoices ( 
. . . > InvID INTEGER PRIMARY KEY
AUTOINCREMENT, 
. . . > CustlD INTEGER REFERENCES
Customers (CustlD), 
. . . > ProductID INTEGER REFERENCES
Products (ProductID), 
. . . > Quantity INTEGER (5)
. . . > ) ;

To confirm that our tables have been successfully created, use the .tables command:

sqlite> .tables
Customers Invoices Products

SQLite stores the schema of all databases in the SQL1TEJVIASTER table. We can fetch names of our databases and tables with the following command:

sqlite> SELECT * FROM s'qlite_master WHERE 
type='table';

To terminate the current session of SQLiteS activity use .quit command.

Python Data Persistence – Python – Cassandra

Python Data Persistence – Python – Cassandra

In this last chapter, we are going to deal with another important NOSQL database – Cassandra. Today some of the biggest IT giants (including FaceBook, Twitter, Cisco, and so on) use Cassandra because of its high scalability, consistency, and fault-tolerance. Cassandra is a distributed database from Apache Software Foundation. It is a wide column store database. A large amount of data is stored across many commodity servers which makes data highly available.

Cassandra Architecture

The fundamental unit of data storage is a node. A node is a single server in which data is stored in the form of the keyspace. For understanding, you can think of keyspace as a single database. Just as any server running a SQL engine can host multiple databases, a node can have many key spaces. Again, like in a SQL database, keyspace may have multiple column families which are similar to tables.

However, the architecture of Cassandra is logically as well as physically different from any SQL-oriented server (Oracle, MySQL, PostgreSQL, and so on). Cassandra is designed to be a foolproof database without a single point of failure. Hence, data in one node is replicated across a peer-to-peer network of nodes. The network is called a data center, and if required, multiple data centers are interconnected to form a cluster. Replication strategy and replication factor can be defined at the time of the creation of a keyspace. (figure 12.1) ‘

Python Data Presistence - Python - Cassandra chapter 12 img 1
Each ‘write’ operation over a keyspace is stored in Commit Log, which acts as a crash-recovery system. After recording here, data is stored in a Mem-table. Mem-table is just a cache or buffer in the memory. Data from the mem-table is periodically flushed in SSTables, which are physical disk files on the node.

Cassandra’s data model too is entirely different from a typical relational database. It is often, described as a column store or column-oriented NOSQL database. A keyspace holds one or more column families, similar to the table in RDBMS. Each table (column family) is a collection of rows, each of which stores columns in an ordered manner. Column, therefore, is the basic unit of data in Cassandra. Each column is characterized by its name, value, and timestamp.

The difference between a SQL table and Cassandra’s table is that the latter is schema-free. You don’t need to define your column structure ahead of time. As a result, each row in a Cassandra table may have columns with different names and variable numbers.

Python Data Presistence - Python - Cassandra chapter 12 img 2

Installation

The latest version of Cassandra is available for download athttp://cassandra. apache.org/download/. Community distributions of Cassandra (DDC) can be found at https://academy.datastax.com/planet-cassandra/cassandra. Code examples in this chapter are tested on DataStax distribution installed on Windows OS.

Just as any relational database uses SQL for performing operations on data in tables, Cassandra has its own query language CQL which stands for Cassandra Query Language. The DataStax distribution comes with a useful front-end IDE for CQL. All operations such as creating keyspace and table, running different queries, and so on can be done both visually as well as using text queries. The following diagram shows a view of DataStax DevCenter IDE.(figure 12.3)

Python Data Presistence - Python - Cassandra chapter 12 img 3

CQL Shell

Cassandra installation also provides a shell inside which you can execute CQL queries. It is similar to> MySQL console, SQLite console, or Oracle’s SQL Plus terminal. (figure 12.4)

Python Data Presistence - Python - Cassandra chapter 12 img 4

We shall first learn to perform basic CRUD operations with Cassandra from inside CQLSH and then use Python API for the purpose.

Inserting Rows

INSERT statement in CQL is exactly similar to one in SQL. However, the column list before the ‘VALUES’ clause is not optional as is the case in SQL. That is because, in Cassandra, the table may have a variable number of columns.

cq1sh:mykeyspace> insert into products (productID, name, price) 
values (1, 1 Laptop 1,25000);

Issue INSERT statement multiple numbers of times to populate ‘products’ table with sample data given in chapter 9. You can also import data from a CSV file using the copy command, as follows:

cq1sh:mykeyspace> copy products (productID, name, price)
                             . . . from ’pricelist.csv’ with 
the delimiter,’ and header=true;

 

Python Data Persistence – PyMongo – Relationships

Python Data Persistence – PyMongo – Relationships

MongoDB is a non-relational database. However, you can still establish relationships between documents in a database. MongoDB uses two different approaches for this purpose. One is an embedded approach and the other is a referencing approach.

Embedded Relationship

In this case, the documents appear in a nested manner where another document is used as the value of a certain key. The following code represents a ‘customer’ document showing a customer (with ‘_ id ’= 1) buying two products. A list of two product documents is the value of the ‘prods’ key.

Example

>>> cust.insert_one({'_id':1,'name':'Ravi',
                                          'prods':[
                                                 { 'Name':'TV',
'price':40000},


{'Name':'Scanner','price':5000}
                                ]
              })

Querying such an embedded document is straightforward as all data is available in the parent document itself.

Example

>>> doc=cust .find_one ({ '_id' : 1}, { 'prods ' : 1})
>>> doc
{' id': 1, 'prods': [{'Name': 'TV', 'price': 40000},
{'Name': 'Scanner', 'price': 5000}]}

The embedded approach has a major drawback. The database is not normalized and, hence, data redundancy arises. Assize grows, it may affect the performance of reading/write operations.

Reference Relationship

This approach is somewhat similar to the relations in a SQL-based database. The collections (equivalent to the RDBMS table) are normalized for optimum performance. One document refers to the other with its ‘_id’ key.

Recollecting that instead of automatically generated random values for ‘_id’, they can be explicitly specified while inserting a document in a collection, following is the constitution of ‘products’ collection.

Example

>>> list (prod.find ( ) '
[{ '_4d' : 1, 'Name' 'Laptop', 'price': 25000}, {'_
id': 2, Name': 'TV', 'price': 40000}, {'_id': 3,
'Name': Router', price': 2000}, {'_id': 4, 'Name':
'Scanner , 'price' 5000}, {' id': 5, 'Name':
'Printer , 'price' 9000}]

We now create a ‘customers’ collection.

Example

>>> db.create_collection('customers 1) 
>>> cust=db['customers']

The following document is inserted with one key ‘prods’ being a list of ‘ _id’ s from products collection.

Example

>>> cust .insert_one({'_id':1, 'Name' 'Ravi',
'prods': [2,4] })

However, in such a case, you may have to run two queries: one on the parent collection, and another on related collection. First, fetch the _ids of the related table.

Example

>>> doc=cust .find_one ({'_id' : 1}, {'prods ' : 1})
>>> prods
[2, 4]

Then, iterate over the list and access the required field from the related document.

Example

>>> for each in prods:
doc=prod .find_one ({ ' id' :each})
print (doc['Name'])
TV
Scanner

The reference approach can be used to build one-to-one or one-to-many types of relationships. The choice of approach (embedded or reference) largely depends on data usage, the projected growth of the size of the document, and the atomicity of the transaction.

In this chapter, we had an overview of the MongoDB database and its Python interface in the form of the PyMongo module. In the next chapter, another NoSQL database – Cassandra – is going to be explained along with its association with Python.

Python Data Presistence – PyMongo – Add Collection

Python Data Presistence – PyMongo – Add Collection

Create a new database object by using any name currently not in the list.

Example

>>> db=client.newdb

The Database is actually created when the first document is inserted. The following statement will implicitly create a ‘products’ collection and multiple documents from the given list of dictionary objects.

Example

>>> pricelist=[{ 'ProductID : 1 / 'Name' :'Laptop',
'price' 25000},{ 'ProductID :2, 'Name' :'TV',
'price' 40000},{ 'ProductID : 3, 'Name' :'Router',
'price' 2000},{' ProductID' 4, 'Name': 'Scanner',
'price' 'price' 5000},{' 9000}] ProductID' 5, Name': 'Printer',
>>> db.products.insert_many(pricelist)

You can confirm the insertion operation by find ( ) method in the mongo shell, as we have done earlier.

We create a collection object explicitly by using the create_collection( ) method of the database object.

Example

>>> db.create_collection(1 customers')

Now, we can add one or more documents in it. The following script adds documents in ‘customers’ collection.

Example

from pymongo import MongoClient client=MongoClient() 
db=client.newdb db.create_collection("customers") 
cust=db['customers' ]
 custlist=[{'CustID':1,1 Name' :'Ravikumar', 'GS- TIN':'27AAJPL7103N1ZF'}, 
{'CustID' :2, 'Name': 'Patel', 'GSTIN': '24ASDFG1234N- 1ZN'}, 
{'CustID' : 3, ' Name' : 'Nitin', 'GSTIN': '27AABBC7895N- 1ZT'}, 
{'CustID' : 4, ' Name' : 'Nair',' GSTIN':' 32MMAF8963N1ZK'} 
{'CustID' :5, ' Name' : 'Shah',' GSTIN':' 24BADEF2002N- 1ZB'}, 
{'CustID' :6, ' Name' : 'Khurana .', 'GSTIN ':'07KABCS1002N- lZV'}, 
{'CustID' :7, ' Name' : 'Irfan', 'GSTIN': '05IIAAV5103N- 1ZA'}, 
{'CustID' :8, ' Name' : 'Kiran', 'GSTIN': '12PPSD- F22431ZC' 
{'CustID' }. :9, ' Name' : 'Divya', 'GSTIN': '15ABCDE1101N- 1ZA'}, 
{'CustID' : 10 ,'Name' :'John', 'GS- TIN':'29AAEEC4258E1ZK'}] 
cust.insert_many(custlist) / client.close()