Python Data Persistence – PyMongo – Relationships

MongoDB is a non-relational database. However, you can still establish relationships between documents in a database. MongoDB uses two different approaches for this purpose. One is an embedded approach and the other is a referencing approach.

Embedded Relationship

In this case, the documents appear in a nested manner where another document is used as the value of a certain key. The following code inserts a ‘customer’ document showing a customer (with ‘_id’ = 1) buying two products. A list of two product documents is the value of the ‘prods’ key.

Example

>>> cust.insert_one({'_id': 1, 'name': 'Ravi',
...                  'prods': [
...                      {'Name': 'TV', 'price': 40000},
...                      {'Name': 'Scanner', 'price': 5000}
...                  ]})

Querying such an embedded document is straightforward as all data is available in the parent document itself.

Example

>>> doc = cust.find_one({'_id': 1}, {'prods': 1})
>>> doc
{'_id': 1, 'prods': [{'Name': 'TV', 'price': 40000},
{'Name': 'Scanner', 'price': 5000}]}
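
As a small illustration of why no second query is needed, the sketch below works directly on the document returned above. The dictionary is copied from that output rather than fetched from a live server, and the variable names `names` and `total` are chosen here only for illustration.

```python
# The document as returned by cust.find_one({'_id': 1}, {'prods': 1}) above,
# written out as a literal so the sketch runs without a MongoDB server.
doc = {'_id': 1, 'prods': [{'Name': 'TV', 'price': 40000},
                           {'Name': 'Scanner', 'price': 5000}]}

# Every embedded product is already inside the parent document.
names = [p['Name'] for p in doc['prods']]
total = sum(p['price'] for p in doc['prods'])

print(names, total)
```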

The embedded approach has a major drawback. The database is not normalized and, hence, data redundancy arises. As the size of the documents grows, the performance of read/write operations may suffer.
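
The redundancy can be pictured with plain Python dictionaries standing in for stored documents. This is only a sketch: the second customer and both helper functions are hypothetical and not part of the original data. It shows that a product embedded in several customers exists as several copies, each of which has to be changed when the product changes.

```python
# Two customer documents, each embedding its own copy of the 'TV' product.
customers = [
    {'_id': 1, 'name': 'Ravi',
     'prods': [{'Name': 'TV', 'price': 40000},
               {'Name': 'Scanner', 'price': 5000}]},
    # Hypothetical second customer, added only to show the duplication.
    {'_id': 2, 'name': 'Asha',
     'prods': [{'Name': 'TV', 'price': 40000}]},
]


def count_embedded_copies(docs, product_name):
    """Count how many embedded copies of a product exist across documents."""
    return sum(1 for d in docs for p in d['prods']
               if p['Name'] == product_name)


def update_price_everywhere(docs, product_name, new_price):
    """Change the price in every embedded copy; return how many were touched."""
    changed = 0
    for d in docs:
        for p in d['prods']:
            if p['Name'] == product_name:
                p['price'] = new_price
                changed += 1
    return changed


copies = count_embedded_copies(customers, 'TV')
changed = update_price_everywhere(customers, 'TV', 38000)
print(copies, changed)
```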

Reference Relationship

This approach is somewhat similar to the relations in a SQL-based database. The collections (equivalent to the RDBMS table) are normalized for optimum performance. One document refers to the other with its ‘_id’ key.

Recall that, instead of relying on automatically generated values for ‘_id’, you can specify them explicitly while inserting a document in a collection. The ‘products’ collection is constituted as follows.

Example

>>> list(prod.find())
[{'_id': 1, 'Name': 'Laptop', 'price': 25000},
{'_id': 2, 'Name': 'TV', 'price': 40000},
{'_id': 3, 'Name': 'Router', 'price': 2000},
{'_id': 4, 'Name': 'Scanner', 'price': 5000},
{'_id': 5, 'Name': 'Printer', 'price': 9000}]

We now create a ‘customers’ collection.

Example

>>> db.create_collection('customers')
>>> cust = db['customers']

The following document is inserted with one key, ‘prods’, holding a list of ‘_id’ values from the products collection.

Example

>>> cust.insert_one({'_id': 1, 'Name': 'Ravi',
...                  'prods': [2, 4]})
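
MongoDB does not enforce foreign-key constraints between collections, so nothing stops ‘prods’ from holding an ‘_id’ that has no matching product. One possible precaution, sketched below with plain Python lists, is to compare the references against the known product ids before inserting. The helper `missing_references` is hypothetical, and the id 7 is made up to show a failing case.

```python
def missing_references(ref_ids, known_ids):
    """Return the ids in ref_ids that have no match in known_ids, in order."""
    known = set(known_ids)
    return [i for i in ref_ids if i not in known]


# _id values of the 'products' collection listed earlier. With a live
# connection they could be read with prod.distinct('_id') instead.
product_ids = [1, 2, 3, 4, 5]

ok = missing_references([2, 4], product_ids)
bad = missing_references([2, 7], product_ids)  # 7 is a made-up id
print(ok, bad)
```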

However, in such a case, you may have to run two queries: one on the parent collection, and another on the related collection. First, fetch the ‘_id’ values of the related documents.

Example

>>> doc = cust.find_one({'_id': 1}, {'prods': 1})
>>> prods = doc['prods']
>>> prods
[2, 4]

Then, iterate over the list and access the required field from the related document.

Example

>>> for each in prods:
...     doc = prod.find_one({'_id': each})
...     print(doc['Name'])
...
TV
Scanner
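
The loop above issues one query per referenced ‘_id’. A possible alternative, sketched here, is to fetch all related documents in a single `find()` call with the `$in` query operator. So that the sketch runs without a server, `InMemoryCollection` is a hypothetical stand-in that understands only this one filter shape, and `product_names` is likewise a name chosen for illustration. With PyMongo, the real `prod` collection could be passed to `product_names` in its place.

```python
class InMemoryCollection:
    """Hypothetical stand-in for a PyMongo collection, for illustration only.

    It understands just one filter shape: {'_id': {'$in': [...]}}.
    """

    def __init__(self, docs):
        self._docs = list(docs)

    def find(self, query):
        wanted = set(query['_id']['$in'])
        return [d for d in self._docs if d['_id'] in wanted]


def product_names(prod, ids):
    """Resolve referenced _ids to product names with a single find() call."""
    # $in matches documents whose _id equals any value in the list.
    found = {d['_id']: d for d in prod.find({'_id': {'$in': list(ids)}})}
    # Results are not guaranteed to follow the order of ids, so restore it.
    return [found[i]['Name'] for i in ids if i in found]


prod = InMemoryCollection([
    {'_id': 1, 'Name': 'Laptop', 'price': 25000},
    {'_id': 2, 'Name': 'TV', 'price': 40000},
    {'_id': 3, 'Name': 'Router', 'price': 2000},
    {'_id': 4, 'Name': 'Scanner', 'price': 5000},
    {'_id': 5, 'Name': 'Printer', 'price': 9000},
])

names = product_names(prod, [2, 4])
print(names)
```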

The reference approach can be used to build one-to-one or one-to-many types of relationships. The choice of approach (embedded or reference) largely depends on data usage, the projected growth of the size of the document, and the atomicity of the transaction.

In this chapter, we had an overview of the MongoDB database and its Python interface in the form of the PyMongo module. In the next chapter, another NoSQL database – Cassandra – is going to be explained along with its association with Python.