Python Data Persistence – xml Package

Python Data Persistence – xml Package

XML is another well-known data interchange format, used by a large number of applications. One of the main features of extensible Markup Language (XML) is that its format is both human-readable and human-readable. XML is widely used by applications of web services, office tools, and Service-Oriented Architectures (SOA).

Standard Python library’s xml package consists of modules for XML processing, as per different models. In this section, we discuss the ElementTree module that provides a simple and lightweight API for XML processing.
The XML document is arranged in a tree-like hierarchical format. The document tree comprises of elements. Each element is a single node in the tree and has an attribute enclosed in <> and </> tags. Each element may have one or more sub-elements following the same structure.

A typical XML document appears, as follows:

Example

<?xml version="l.0" encoding="iso-8859-l"?> 
<pricelist>
     <product>
              <name >TV</name >
              <brand>Samsung</brand>
              <price>25000</price>
   </product>
   <product>
             <name >Computer</name >
             <brand>Dell</brand>
             <price>40000</price>
   </product>
   <product>
            <name >Mobile </name >
            <brand>Redmi</brand>
            <price>15000</price>
   </product>
</pricelist>

The elementary module’s class structure also has Element and SubElement objects. Each Element has a tag and attrib which is a diet object. For the root element, attrib is an empty dictionary.

Example

>>> import xml.etree.ElementTree as xmlobj
>>> root=xmlobj.Element('PriceList')
>>> root.tag
'PriceList'
>>> root.attrib
{ }

Now, we can add one or more nodes, i.e., elements under root. Each Element object may have SubElements, each having an attribute and text property.

Let us set up the ‘product’ element and ‘name’, ‘brand’, and ‘price’ as its sub-elements.

Example

>>> product=xmlobj.Element('Product')
>>> nm=xmlobj.SubElement(product, 'name')
>>> nm.text='name'
>>> brand=xmlobj.SubElement(product, 'brand')
>>> nm.text='TV'
>>> brand.text='Samsung'
>>> price=xmlobj.SubElement(product, 'price')
>>> price.text='25000'

The root node has an append ( ) method to add this node to it.

Example

>>> root.append(product)

Construct a tree from this root object and write its contents to the XML file.

Example

>> > tree=xmlobj.ElementTree (root)
> > > file=open ( ' pricelist. xml ,'wb')
>>> tree. write (file)
>>> file . close ( )

The ‘pricelist.xmP should be visible in the current working directory. The following script writes a list of dictionary objects to the XML file:

Example

import xml.etree.ElementTree as xmlobj
root=xmlobj.Element(1PriceList')
pricelist=[{'name' :'TV', 'brand1 :'Sam¬sung ' , 'price' : '250001 } ,
{'name' :'Computer', 'brand' :1 Dell1, 'pri ce' : '400001 } ,
{'name' :1 Mobile', 'brand':'Red- mi 1 price':'150001}]
i = 0
for row in pricelist:
i = i + 1
print (i)
element=xmlobj.Element('Product'+str(i))
for k,v in row.items():
sub=xmlobj.SubElement(element, k)
sub.text=v
root.append(element)
tree=xmlobj.ElementTree(root)
file=open ( ' pricelist. xml' , 1 wb ' )
tree . write (file)
file . close ()

To parse the XML file, construct document tree giving its name as file parameter in ElementTree constructor.

Example

import xml.etree.ElementTree as xmlobj
tree = xmlobj .ElementTree (file='pricelist .xml' )

The getroot() method of tree object fetches root element and getchildren() returns a list of elements below it.

Example

root = tree.getroot( ) 
children = root.getchildren()

We can now construct a dictionary object corresponding to each subelement by iterating over the sub-element collection of each child node.

Example

for child in children:
product={ }
pairs = child.getchildren()
for pair in pairs:
product[pair.tag]=pair.text

Each dictionary is then appended to a list returning original list of dictionary objects. Complete code parsing XML file into a list of dictionaries is as follows:

Example

import xml.etree.ElementTree as xmlobj
tree = xmlobj .ElementTree (file='pricelist .xml' )
root = tree.getroot()
products= [ ]
children = root.getchildren() for child in children:
product={ }
pairs = child.getchildren()
for pair in pairs:
product[pair.tag]=pair.text products.append(product)
print (products)

Save the above script from ‘xmlreader.py’ and run it from the command line:

E:\python37 >python xmlreader.py [{'name': 'TV', 'brand': 'Samsung', 'price':
'250001 }, { 'name' : 'Computer' , 'brand' : 'Dell' ,
'price': '40000'}, {'name': 'Mobile', 'brand':
'Redmi', 'price': '15000'}]

Of other modules in xml package, xml. dom implements document object model of XML format and xm‘1. sax defines functionality to implement SAX model.