Python Interview Questions on HTML/XML in Python

Question 1:
What module is used to handle URLs?
The urlparse module.

Question 2:
Illustrate parsing a URL into a tuple.
import urlparse
myTuple = urlparse.urlparse("myfile.html")
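For readers on Python 3, where urlparse lives in urllib.parse, the same idea might look like the sketch below. The example.com URL is a made-up value chosen only to show each field of the result.

```python
from urllib.parse import urlparse

# Hypothetical URL, used only for illustration.
parts = urlparse("http://www.example.com/docs/myfile.html?lang=en#top")

# The result behaves like a 6-item tuple and also exposes named fields.
scheme, netloc, path, params, query, fragment = parts

print(parts.scheme)   # the protocol part
print(parts.netloc)   # the host part
print(parts.path)     # the path part
```

Because the result is a named tuple, it can be unpacked positionally or read by attribute, whichever is clearer at the call site.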

Question 3:
What modules does Python provide for opening and fetching data from URLs?
urllib and urllib2.

Question 4:
What is the main difference between the urllib and urllib2 modules?
The urllib2 module can accept a Request object, which allows the specification of headers.
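As a rough sketch of what a Request object with headers looks like, here is a Python 3 version, where urllib2's Request lives in urllib.request. The URL and the header values are assumptions made up for the example, and nothing is sent over the network.

```python
from urllib.request import Request

# Hypothetical URL and header values, for illustration only.
my_request = Request(
    "http://www.example.com/",
    headers={"User-Agent": "interview-demo/1.0"},
)

# Headers can also be added after the object is created.
my_request.add_header("Accept", "text/html")

# Request stores header names capitalized, e.g. "User-agent".
print(my_request.header_items())
```

The request object would then be passed to urlopen in place of a plain URL string.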

Question 5:
Illustrate opening and printing a web-based URL.
import urllib
myURL = urllib.urlopen("http://")
myBuffer = myURL.read()
print myBuffer
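A version that runs without network access might look like the following Python 3 sketch. It assumes a temporary local file served through a file:// URL stands in for a web page; the file name and contents are invented for the example, and read_url is a hypothetical helper.

```python
import tempfile
from pathlib import Path
from urllib.request import urlopen


def read_url(url):
    """Open a URL and return its raw bytes."""
    with urlopen(url) as response:
        return response.read()


# Assumption: a local file stands in for a real web page.
with tempfile.TemporaryDirectory() as tmp:
    page = Path(tmp) / "page.html"
    page.write_text("<html><body>hello</body></html>", encoding="utf-8")
    my_buffer = read_url(page.as_uri())

print(my_buffer.decode("utf-8"))
```

The same read_url helper should work unchanged with an http:// URL when a network connection is available.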

Question 6:
How is the requested information retrieved?
By calling the .info() method on an open URL, which returns the headers sent with the response.
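A small Python 3 sketch of reading that header information is shown below. To stay self-contained it assumes a data: URL in place of a real web address, so the only header fields present are the ones Python synthesizes for it.

```python
from urllib.request import urlopen

# Assumption: a data: URL stands in for a real web address.
with urlopen("data:text/plain;charset=utf-8,hello") as response:
    headers = response.info()          # message object holding the headers
    content_type = headers.get_content_type()
    body = response.read()

print(content_type)
print(body)
```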

Question 7:
To process the contents of an HTML file, what module is used?
The HTMLParser module.

Question 8:
How is the HTMLParser used?
By subclassing the HTMLParser.HTMLParser class, overriding the handler methods for the tags of interest, instantiating the subclass, and then calling the .feed method of the resulting object.
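The steps above can be sketched as follows in Python 3, where the class is imported from html.parser. LinkCollector and the sample markup are hypothetical names and data invented for the example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical subclass that records the href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)


parser = LinkCollector()
parser.feed('<p><a href="a.html">A</a> and <a href="b.html">B</a></p>')
parser.close()

print(parser.links)
```

Other handlers such as handle_endtag and handle_data can be overridden in the same way when closing tags or text content matter.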

Question 9:
What modules are used to manage cookies?
The urllib2 and cookielib modules.

Question 10:
Illustrate retrieving cookies from a URL.
import urllib2
import cookielib
myjar = cookielib.LWPCookieJar()
myOpener = urllib2.build_opener(urllib2.HTTPCookieProcessor(myjar))
urllib2.install_opener(myOpener)
myRequest = urllib2.Request("http://")
myHTML = urllib2.urlopen(myRequest)
for index, cookie in enumerate(myjar):
    print index, cookie
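The following Python 3 sketch shows how a cookie jar is filled, using http.cookiejar and urllib.request (the Python 3 homes of cookielib and urllib2). To avoid any network traffic it assumes a FakeResponse class, a hypothetical stand-in that carries a made-up Set-Cookie header, and hands it to the jar directly. On a real fetch, the opener would do this step automatically.

```python
from email.message import Message
from http.cookiejar import LWPCookieJar
from urllib.request import HTTPCookieProcessor, Request, build_opener


class FakeResponse:
    """Hypothetical stand-in for a server response (assumption: no network)."""

    def __init__(self, set_cookie_value):
        self._headers = Message()
        self._headers["Set-Cookie"] = set_cookie_value

    def info(self):
        return self._headers


my_jar = LWPCookieJar()
# On a real fetch, my_opener.open(url) would store cookies in my_jar.
my_opener = build_opener(HTTPCookieProcessor(my_jar))

# Hypothetical URL and cookie value, for illustration only.
my_request = Request("http://www.example.com/")
my_jar.extract_cookies(FakeResponse("session=abc123; Path=/"), my_request)

cookies = [(cookie.name, cookie.value) for cookie in my_jar]
print(cookies)
```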

Question 11:
In managing XML documents, what is the module most often used?
The xml.dom package, and in particular the minidom module within xml.dom.

Question 12:
Illustrate opening an xml document.
from xml.dom import minidom
myTree = minidom.parse('myXML.xml')
print myTree.toxml()
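A self-contained Python 3 variant is sketched below. It assumes the XML arrives as a string rather than a file, so it uses minidom.parseString; the library/book markup is invented for the example.

```python
from xml.dom import minidom

# Hypothetical document, supplied as a string instead of a file.
xml_text = "<library><book title='Dune'/></library>"

my_tree = minidom.parseString(xml_text)

# Serialize just the root element, without the XML declaration.
root_xml = my_tree.documentElement.toxml()
print(root_xml)
```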

Question 13:
What tools are available to determine if an xml file is well formed?
By attempting to parse the XML file with the generic parser available in xml.sax inside a try/except block. If the parser raises an error, the file is not well formed.
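One possible shape for such a check is sketched below in Python 3. The helper name is_well_formed and the sample inputs are assumptions for the example; the parser is given a do-nothing ContentHandler because only the success or failure of the parse matters.

```python
import xml.sax
from xml.sax.handler import ContentHandler


def is_well_formed(xml_bytes):
    """Return True if xml_bytes parses without a SAX parse error."""
    try:
        # The default ContentHandler ignores every event.
        xml.sax.parseString(xml_bytes, ContentHandler())
    except xml.sax.SAXParseException:
        return False
    return True


good = is_well_formed(b"<a><b/></a>")
bad = is_well_formed(b"<a><b></a>")   # <b> is never closed
print(good, bad)
```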

Question 14:
How are xml element attributes accessed?
By parsing the document with minidom, then calling the .getAttribute method on the element nodes it returns.

Question 15:
What method is used to determine if a child node includes a particular attribute?
The .hasAttribute method, which returns True if the attribute is defined.
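A short Python 3 sketch of both attribute methods follows. The catalog/item markup and its attribute names are made up for the example.

```python
from xml.dom import minidom

# Hypothetical document, for illustration only.
doc = minidom.parseString('<catalog><item id="42" name="widget"/></catalog>')
item = doc.getElementsByTagName("item")[0]

item_id = item.getAttribute("id")        # value of an existing attribute
has_name = item.hasAttribute("name")     # attribute is present
has_price = item.hasAttribute("price")   # attribute is absent
missing = item.getAttribute("price")     # empty string when absent

print(item_id, has_name, has_price, repr(missing))
```

Since getAttribute returns an empty string for a missing attribute, hasAttribute is the safer way to tell "absent" apart from "present but empty".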

Question 16:
What is the difference between the .toxml and .toprettyxml methods?
The .toxml method returns the XML as one unformatted string, while the .toprettyxml method adds newlines and indents the node contents appropriately.
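The difference is easiest to see side by side. This Python 3 sketch uses an invented two-item document and a two-space indent (an arbitrary choice; the default indent is a tab).

```python
from xml.dom import minidom

# Hypothetical document, for illustration only.
doc = minidom.parseString("<root><item>a</item><item>b</item></root>")
root = doc.documentElement

compact = root.toxml()                    # everything on one line
pretty = root.toprettyxml(indent="  ")    # one element per line, indented

print(compact)
print(pretty)
```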

Question 17:
What is the expat module and what is it used for?
The expat module (xml.parsers.expat) is a non-validating XML parser. It is used to process XML very quickly: it reports XML that is not well formed, but it does not check the document against a DTD.
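A minimal Python 3 sketch of driving expat directly is given below. The helper element_names and the sample inputs are assumptions for the example. It also shows that, although expat does not validate, it still raises ExpatError on malformed input.

```python
import xml.parsers.expat


def element_names(xml_text):
    """Return the names of all start tags, in document order."""
    names = []
    parser = xml.parsers.expat.ParserCreate()
    # expat calls this handler once for every opening tag.
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(xml_text, True)   # True marks this as the final chunk
    return names


names = element_names("<a><b/><c>text</c></a>")

try:
    element_names("<a><b></a>")    # <b> is never closed
    rejected = False
except xml.parsers.expat.ExpatError:
    rejected = True

print(names, rejected)
```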

Question 18:
What does the .dom stand for in the xml.dom module?
Document Object Model.

Question 19:
What parsers are available for XML?
SAX (xml.sax), expat (xml.parsers.expat), and minidom (xml.dom.minidom).

Question 20:
Illustrate direct node access in an XML file.
from xml.dom import minidom
myXML = minidom.parse("myXML.xml")
myNodes = myXML.childNodes
print myNodes[0].toprettyxml()
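For a version that runs on its own, the Python 3 sketch below parses an invented document from a string and walks it through childNodes. It assumes the document has no comments or doctype ahead of the root element, so childNodes[0] is the root.

```python
from xml.dom import minidom

# Hypothetical document, for illustration only.
my_xml = minidom.parseString("<root><first/><second>text</second></root>")

root = my_xml.childNodes[0]        # the root element
children = root.childNodes         # its direct children, in order

names = [node.nodeName for node in children]
second_text = children[1].firstChild.data   # text inside <second>

print(names, second_text)
```

Note that whitespace between elements in a real file shows up as extra text nodes in childNodes, so index positions may differ from what the element order alone suggests.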