50+ Android Interview Questions and Answers [Updated 2021-22]

Android Interview Questions & Answers for Freshers, 2 Years Experience, 4 Years Experience and more

1) What is Android?

Android is an open-source, Linux-based operating system used in mobiles, tablets, televisions, etc.

2) Who is the founder of Android?

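Android Inc. was founded in 2003 by Andy Rubin, Rich Miner, Nick Sears, and Chris White. Andy Rubin is the name most often cited.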

3) Explain the Android application Architecture.

Following is a list of components of Android application architecture:

  • Services: Used to perform background functionalities.
  • Intent: Used to perform the interconnection between activities and the data passing mechanism.
  • Resource Externalization: strings and graphics.
  • Notification: light, sound, icon, notification, dialog box and toast.
  • Content Providers: It will share the data between applications.

4) What are the code names of android?

  1. Aestro
  2. Blender
  3. Cupcake
  4. Donut
  5. Eclair
  6. Froyo
  7. Gingerbread
  8. Honeycomb
  9. Ice Cream Sandwich
  10. Jelly Bean
  11. KitKat
  12. Lollipop
  13. Marshmallow

5) What are the advantages of Android?

Open-source: It means no license, distribution and development fee.

Platform-independent: Android apps can be developed on Windows, Mac, and Linux platforms.

Supports various technologies: It supports camera, Bluetooth, wifi, speech, EDGE etc. technologies.

Highly optimized Virtual Machine: Android originally used a highly optimized virtual machine for mobile devices, called DVM (Dalvik Virtual Machine). Since Android 5.0 it has been replaced by the Android Runtime (ART).

6) Does Android support languages other than Java?

Yes. Kotlin is officially supported, and parts of an Android app can also be developed in C/C++ using the Android NDK (Native Development Kit), which can improve performance for CPU-intensive code. The NDK is used together with the Android SDK.

7) What are the core building blocks of android?

The core building blocks of Android are:

  • Activity
  • View
  • Intent
  • Service
  • Content Provider
  • Fragment etc.

8) What is activity in Android?

Activity is like a frame or window in java that represents GUI. It represents one screen of android.

9) What are the life cycle methods of android activity?

There are 7 life-cycle methods of activity. They are as follows:

  1. onCreate()
  2. onStart()
  3. onResume()
  4. onPause()
  5. onStop()
  6. onRestart()
  7. onDestroy()



10) What is intent?

It is a kind of message or information that is passed to the components. It is used to launch an activity, display a web page, send SMS, send email, etc. There are two types of intents in android:

  1. Implicit Intent
  2. Explicit Intent

11) How are view elements identified in the android program?

View elements can be identified using the findViewById() method.

12) Define Android toast.

An android toast provides feedback to the users about the operation being performed by them. It displays the message regarding the status of operation initiated by the user.

13) Give a list of important files and folders in Android

The following files and folders are important in an Android project:

  • AndroidManifest.xml
  • build.xml
  • bin/
  • src/
  • res/
  • assets/

14) Explain the use of ‘bundle’ in android?

We use bundles to pass the required data between components, for example from one activity to another.

15) What is an application resource file?

Application resource files are the files, such as layouts, strings, and images, that are bundled into the application when it is built.

16) What is the use of LINUX ID in android?

A unique Linux ID is assigned to each application in android. It is used for the tracking of a process.

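17) Can bytecode written in Java be run on Android?

No. Android does not execute Java bytecode directly; the compiled classes are first converted to the Dalvik Executable (DEX) format.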

18) List the various storages that are provided by Android.

The various storage provided by android are:

  • Shared Preferences
  • Internal Storage
  • External Storage
  • SQLite Databases
  • Network Connection

19) How are layouts placed in Android?

Layouts in Android are placed as XML files.

20) Where are layouts placed in Android?

Layouts in Android are placed in the layout folder.

21) What is the implicit intent in android?

The Implicit intent is used to invoke the system components.

22) What is explicit intent in android?

An explicit intent is used to invoke the activity class.

23) How to call another activity in android?

  Intent i = new Intent(getApplicationContext(), ActivityTwo.class);
  startActivity(i);

24) What is service in android?

A service is a component that runs in the background. It is used to play music, handle network transaction, etc.

25) What is the name of the database used in android?

SQLite: An opensource and lightweight relational database for mobile devices.

26) What is AAPT?

AAPT is an acronym for android asset packaging tool. It handles the packaging process.

27) What is a content provider?

A content provider is used to share information between Android applications.

28) What is fragment?

The fragment is a part of Activity by which we can display multiple screens on one activity.

29) What is ADB?

ADB stands for Android Debug Bridge. It is a command-line tool that is used to communicate with an emulator instance or a connected device.

30) What is NDK?

NDK stands for Native Development Kit. By using NDK, you can develop a part of an app using native language such as C/C++ to boost the performance.

31) What is ANR?

ANR stands for Application Not Responding. It is a dialog box that appears if the application is no longer responding.

32) What is the Google Android SDK?

The Google Android SDK is a toolset which is used by developers to write apps on Android-enabled devices. It contains a graphical interface that emulates an Android-driven handheld environment and allows them to test and debug their codes.

33) What is an APK format?

APK stands for Android Package (often expanded as Android Package Kit). It is a compressed archive containing the classes, UI resources, supporting assets, and manifest. All of these files compressed into a single file are called an APK.

34) Which language does Android support to develop an application?

Android applications are written in Java or Kotlin (Android SDK) and C/C++ (Android NDK).

35) What is ADT in Android?

ADT stands for Android Development Tools. It is an Eclipse plugin used to develop and test applications.

36) What is View Group in Android?

View Group is a collection of views and other child views. It is an invisible part and the base class for layouts.

37) What is the Adapter in Android?

An adapter is used to create a child view to present the parent view items.

38) What is nine-patch images tool in Android?

The nine-patch tool divides a bitmap image into nine sections: four corners, four edges, and a center, so that the image can be stretched without distorting its corners.

39) Which kernel is used in Android?

Android uses a customized Linux kernel. The kernel version depends on the Android release and the device.

40) What is application Widgets in Android?

Application widgets are miniature application views that can be embedded in other applications and receive periodic updates.

41) Which types of flags are used to run an application on Android?

Following are two types of flags to run an application in Android:


42) What is a singleton class in Android?

A singleton class is a class that can create only one object, which is shared by all other classes.

43) What is sleep mode in Android?

In sleep mode, the CPU is put to sleep and does not accept any commands from the Android device except those from the Radio Interface Layer and alarms.

44) What do you mean by a drawable folder in Android?

In Android, the drawable folder holds compiled visual resources that can be used as backgrounds, banners, icons, splash screens, etc.

45) What is DDMS?

DDMS stands for Dalvik Debug Monitor Server. It provides a wide array of debugging features:

  1. Port forwarding services
  2. Screen capture
  3. Thread and heap information
  4. Network traffic tracking
  5. Location data spoofing

46) Define Android Architecture?

The Android architecture consists of 4 components:

  1. Linux Kernel
  2. Libraries
  3. Android Framework
  4. Android Applications

47) What is a portable wi-fi hotspot?

The portable wi-fi hotspot is used to share internet connection to other wireless devices.

48) Name the dialog box which is supported by Android?

  • Alert Dialog
  • Progress Dialog
  • Date Picker Dialog
  • Time picker Dialog

49) Name some exceptions in Android?

  • InflateException
  • Surface.OutOfResourcesException
  • SurfaceHolder.BadSurfaceTypeException
  • WindowManager.BadTokenException

50) What are the basic tools used to develop an Android app?

  • JDK
  • Eclipse+ADT plugin
  • SDK Tools

Python Handwritten Notes and Study Material PDF Free Download


Python Handwritten Notes: Python is a popular programming language with dynamic semantics. It is well suited to rapid application development because of its high-level data structures. It is also used as a glue language to connect various components. Python has a simple and easy syntax that keeps program maintenance cost-effective. Programmers prefer Python because it provides increased productivity, and it is easier to debug than most programming languages. If you want to learn a programming language, Python is one of the best options to start with. Experienced programmers can learn Python very fast as the language is simple and easy. In this article, you will find complete details about the study materials and lecture notes for the Python language.

Introduction to Python

Basics of Python

Simple and Compound Statements

Data Structures

Modules and Packages

Object Oriented Programming


File Handling

Introduction to Python Notes

Python is a high-level, easily understandable, and simple programming language. It was created by Guido van Rossum and released in 1991. Python focuses on code readability. It supports different paradigms, such as structured, functional, and object-oriented programming. Python has a comprehensive standard library. Python is a popular language because of its easy-to-learn approach. Its syntax uses English keywords, which makes it easy to understand. Many employers all over the world look for Python developers, which makes it popular among learners.
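To make the point about paradigms concrete, here is a small sketch that computes the same sum of squares in a structured, a functional, and an object-oriented style. The names numbers and SquareSummer are made up for illustration.

```python
from functools import reduce

numbers = [1, 2, 3, 4]

# Structured style: an explicit loop with an accumulator variable.
structured_total = 0
for n in numbers:
    structured_total += n * n

# Functional style: fold the list into a single value with reduce().
functional_total = reduce(lambda acc, n: acc + n * n, numbers, 0)


# Object-oriented style: the data and the behaviour live in one class.
class SquareSummer:
    def __init__(self, values):
        self.values = list(values)

    def total(self):
        return sum(v * v for v in self.values)


oop_total = SquareSummer(numbers).total()

print(structured_total, functional_total, oop_total)  # prints: 30 30 30
```

All three approaches give the same result; which one reads best usually depends on the size and shape of the program.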

Python Handwritten Notes and Study Material Free PDF Download

Python is an easy language, but you must read the right study materials to understand the language. We have made a list of important study materials for Python which you can download by clicking on the links given below. One tutorial material for Python is also included here. We have also included important questions and answers in one of the study materials.

Python tutorial pdf with exercises Download
python programming book pdf Download
learn python free Pdf Download
python programming Question Paper Download

Python Reference Books

Python is a popular language used in back end development. Many popular organizations like Google, Netflix, Spotify, and Instagram use Python. This language is so widely popular because it is an interpreted language. Python is used universally in a wide range of applications. Programmers prefer this language because it provides them with the option to explore more instead of focusing on solving complexities. We have made a list of books that will help you to learn Python.

  • Python Programming: A Modern Approach, Vamsi Kurama, Pearson
  • Learning Python, Mark Lutz, O'Reilly
  • Think Python, Allen Downey, Green Tea Press
  • Core Python Programming, W.Chun, Pearson.
  • Introduction to Python, Kenneth A. Lambert, Cengage

Python Curriculum

There are various types of courses available for the Python programming language. So, the syllabus of Python varies depending on the type of course. It is suggested that you should know the syllabus before starting the course. It will be easier for you to understand this language well after knowing the syllabus. We have discussed the basics of the syllabus for Python below. You should check out the syllabus of your institution for the python course.


Introduction: History of Python, Need of Python Programming, Applications, Basics of Python Programming Using the REPL (Shell), Running Python Scripts, Variables, Assignment, Keywords, Input-Output, Indentation.

Types, Operators, and Expressions: Types (Integers, Strings, Booleans); Operators (Arithmetic, Comparison (Relational), Assignment, Logical, Bitwise, Membership, and Identity Operators); Expressions and order of evaluation. Control Flow: if, if-elif-else, for, while, break, continue, pass.

Data Structures: Lists (Operations, Slicing, Methods), Tuples, Sets, Dictionaries, Sequences, Comprehensions.

Functions: Defining Functions, Calling Functions, Passing Arguments, Keyword Arguments, Default Arguments, Variable-length Arguments, Anonymous Functions, Fruitful Functions (Functions Returning Values), Scope of Variables in a Function (Global and Local Variables). Modules: Creating modules, the import statement, the from...import statement, namespacing. Python Packages: Introduction to PIP, Installing Packages via PIP, Using Python Packages.

Object-Oriented Programming (OOP) in Python: Classes, the 'self' variable, Methods, Constructor Method, Inheritance, Overriding Methods, Data hiding. Errors and Exceptions: Difference between an error and an exception, Handling Exceptions, try-except block, Raising Exceptions, User-Defined Exceptions.

Brief Tour of the Standard Library: Operating System Interface, String Pattern Matching, Mathematics, Internet Access, Dates and Times, Data Compression, Multithreading, GUI Programming, Turtle Graphics. Testing: Why testing is required, Basic concepts of testing, Unit testing in Python, Writing Test cases, Running Tests.

List of Python Important Questions

Some common questions related to Python are discussed below:

  • Explain the features of IDLE usability.
  • Which type of keywords are used in Python?
  • What are four built-in numeric data types in Python?
  • Explain Python’s jump statements with references.
  • Give a brief explanation of dictionaries in Python.
  • What are tuples in Python?
  • Explain anonymous functions examples.
  • What is the usage of modules? How to structure a program?
  • How are classes created in Python?
  • Explain error and exception. Write the differences between these two features.
  • Why is testing required?
  • Explain the following terms: i) Calendar module ii) Synchronizing threads.

FAQs on Python Handwritten Notes

Question 1.
Why is Python so popular?

Python has a simple and easy syntax that offers cost-effective programming maintenance. Python is popular among programmers because this language provides increased productivity.

Question 2.
Is C++ better than Python?

Python uses English keywords, while C++ uses complicated syntaxes. Python is popular in machine learning and data analysis. C++ is preferred for game development and large systems.

Question 3.
Is Python good for the future?

Within a few years, Python has become one of the popular programming languages worldwide. The demand for python language will be there in future as well.

Question 4.
How long does it take to learn the python language?

If you’re from a programming background, it will take very little time to learn Python. However, students from a non-programming background can also learn Python within 2 to 3 months because of its easy readability.


Python is an interesting and productive language used by many big tech firms. It is also easy to learn this programming language because of its simple syntax. Hopefully, the above information about this language will help you to understand this language better.

Java MySQL Database Connectivity with Example

In this tutorial, you will learn how to connect to a MySQL database in Java, with a step-by-step guide to connecting a Java application to a MySQL database. To connect to MySQL from Java, you need the JDBC API. With JDBC, you can connect to many types of database, such as MySQL and Oracle.

Steps for java database connectivity:

1. Import the package:

You will need to import the package that includes all the JDBC classes and interfaces needed for database programming, using import java.sql.*;

2. Load and Register Driver:

To load and register the driver, you need the Driver class that ships in the MySQL Connector/J jar file. Download this jar file and add it to your project.

The forName() method is used to load and register the driver. It is a static method of the class Class and loads the driver class dynamically.
Syntax of forName() method:

public static Class<?> forName(String className) throws ClassNotFoundException

3. Create the Connection object:

The getConnection() method connects the Java application to the database. It is a static method of the DriverManager class and takes 3 parameters: "url", "user" and "password".
Syntax of getConnection() method:
public static Connection getConnection(String url, String user, String password)
throws SQLException

4. Create the Statement object:

By using the Statement object, we can send our SQL query to the database. The createStatement() method is used to create the statement. It is a method of the Connection interface.
Syntax of createStatement() method:

public Statement createStatement() throws SQLException

5. Execute the Query:

The executeQuery() method is used to execute queries against the database. It is a method of the Statement interface. The database engine executes the SQL query and places the result in a container called a ResultSet. The ResultSet holds the result of the SQL query, and the Java application reads all the rows of the result from it.
Syntax of executeQuery() method:

public ResultSet executeQuery(String sql) throws SQLException

6. Close the Connection:

The final step is to close the connection by using the close() method. It is a method of the Connection interface. Closing the Connection object also closes the ResultSet automatically.
Syntax of close() method:

public void close() throws SQLException

This is my database named employee and it contains a table called training. I want to connect to a MySQL database using java.

Java MySQL Database Connectivity with Example

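import java.sql.*;

class Example {
    public static void main(String[] args) throws Exception {
        // Load and register the MySQL driver. This class name applies to
        // Connector/J 8 and later; older versions use com.mysql.jdbc.Driver.
        Class.forName("com.mysql.cj.jdbc.Driver");
        System.out.println("Driver loaded");
        Connection con = DriverManager.getConnection("jdbc:mysql://localhost:3306/emp_record", "root", " ");
        System.out.println("Connection Established");

        Statement st = con.createStatement();
        ResultSet rs = st.executeQuery("select * from training");

        System.out.println("Rollno  Student_Name  Stream   Percentage");
        // The cursor starts before the first row, so call next() before reading.
        while (rs.next()) {
            System.out.println(rs.getInt(1) + " " + rs.getString(2) + " " + rs.getString(3) + " " + rs.getFloat(4));
        }
        con.close();
    }
}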

Java JDBC Driver | Types of Java Database Connectivity Drivers and Comparison

A JDBC driver is a software component that enables a Java application to interact with a database. The purpose of a JDBC driver is to convert Java calls into database-specific calls and database-specific calls into Java calls.

Types of JDBC Drivers:

  1. JDBC-ODBC bridge driver.
  2. Native API driver
  3. Network protocol driver.
  4. Pure java driver.

1) JDBC-ODBC bridge Driver :

  • This driver is also known as the Type-1 driver. Internally it relies on an ODBC driver to communicate with the database. The Type-1 driver converts JDBC calls into ODBC calls, and the ODBC driver converts ODBC calls into database-specific calls.
  • Use the Type-1 driver for prototyping only, not for production purposes.
  • This driver was provided by Sun Microsystems as a part of the JDK; it was removed in Java 8.


Advantages :

  • The Type-1 driver is a database-independent driver.
  • It is very easy to use.
  • It does not need to be installed separately (it is available by default on Windows).

Disadvantages :

  • It is the slowest driver.
  • The Type-1 driver depends on the ODBC driver, and the ODBC concept applies only to Windows machines, so it is a platform-dependent driver.

2) Native API Driver :

  • Native API driver converts JDBC calls into database-specific native libraries calls and these calls are directly understood by the database engine.
  • Large database vendors, such as Oracle and IBM, use the Type-2 driver for their enterprise databases.
  • Type-2 Drivers aren’t architecturally compatible.
  • Type-2 Drivers force developers to write platform-specific code.


Advantages :

  • Good performance as compared to the type-1 driver.
  • No ODBC driver is required.
  • Type-2 Drivers are operating system-specific and compiled.

Disadvantages :

  • It is a database-dependent driver.
  • It is a platform-dependent driver.
  • Only Oracle provides type-2 Driver

3) Network protocol driver:

  • Type-3 JDBC drivers are pure Java drivers for database middleware.
  • The Java application communicates with the Network Protocol driver. The driver converts JDBC calls into middleware-specific calls, and the middleware server, which communicates with the database, converts those calls into database-specific calls.


Advantages :

  • This driver does not communicate with the database directly, so it is a database-independent driver. The same driver works for any database.
  • It is written fully in Java, so it is a platform-independent driver.
  • No client-side libraries are required.

Disadvantages :

  • Network support is required on the client machine.
  • When we change the database, we need to change the middleware code.
  • Maintenance of the Network Protocol driver becomes costly.

4) Pure Java Driver:

  • It is also known as the Thin driver. The thin driver converts JDBC calls into database-specific calls directly.
  • Thin driver directly communicates with the database by using database-specific native protocol provided by the database vendor.
  • It is a platform-independent driver.

Advantages :

  • Better performance than all other drivers.
  • Platform-independent driver.
  • No software is required on the client side or the server side.

Disadvantages :

  • Database-dependent driver, because it communicates with the database directly.

Which JDBC Drivers should be used :

If your application uses only one type of database, such as Oracle or MySQL, you should go for the type-4 driver.

Example: Standalone application, small-scale web application.

If your application uses multiple databases, the type-3 driver is highly recommended, because it is a database-independent driver.

Example: Large-scale web application, enterprise application.

If a type-3 or type-4 driver is not available for your database, developers can use the type-2 driver.

If no other driver is available, fall back to the type-1 driver, which is suited to prototyping only.

Differentiate between Thick and Thin Driver:


Thick Driver:

If the database driver requires some extra component to communicate with the database, it is called a Thick driver.

Example: Type-1,Type-2,Type-3 Driver.

Thin Driver:

If the database driver does not require any extra component to communicate with the database, it is called a Thin driver.

Example: Type-4 Driver.

Comparison of all types of JDBC Drivers:


How to Create a Timeline with Pure CSS

Designing a vertical timeline structure using pure CSS3 and HTML.



  1. To demonstrate flexibility using CSS
  2. Making a static timeline component using HTML and CSS.


Step 1: Creating the skeleton HTML file

The first step is to design the basic HTML skeleton file and name the classes to which the CSS will be applied.

I have divided it into the main wrapper, timeline, and then content and time inside it, all within the <body> tag.

It is demonstrated below.

The CSS styling in the next steps is written against these classes and tags.

<div class="timeline">
<ul>
<li>
<div class="content">
<h3>What is Lorem Ipsum?</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut
enim ad minim veniam, quis nostrud exercitation ullamco laboris
nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in
reprehenderit in voluptate velit esse cillum dolore eu fugiat
nulla pariatur. Excepteur sint occaecat cupidatat non proident,
sunt in culpa qui officia deserunt mollit anim id est laborum.</p>
</div>
<div class="time">
<h4>April 2021</h4>
</div>
</li>
</ul>
</div>

Step 2: Vertical design using the selector :before

I had to decide between creating a vertical timeline or a horizontal one. I opted for a vertical design structure as shown below.

.timeline::before {
content: "";
position: absolute;
left: 50%;
width: 2px;
height: 100%;
background: black;
}

Step 3: Designing the <ul> and <li> and creating variations.

.timeline ul {
margin: 0;
padding: 0;
}
.timeline ul li {
line-height: normal;
position: relative;
width: 50%;
padding: 20px 40px;
box-sizing: border-box;
}

The <ul> and the <li> tags are pretty straightforward. To make the items alternate between the left and the right side, I have used the :nth-child(even) and :nth-child(odd) pseudo-classes in CSS.

.timeline ul li:nth-child(odd):before {
content: "";
position: absolute;
top: 25px;
right: -6px;
width: 10px;
height: 10px;
background: rgba(206, 122, 26, 1);
border-radius: 50%;
box-shadow: 0 0 0 3px rgba(206, 122, 26, 0.2);
}
.timeline ul li:nth-child(even):before {
content: "";
position: absolute;
top: 25px;
left: -4px;
width: 10px;
height: 10px;
background: rgba(206, 122, 26, 1);
border-radius: 50%;
box-shadow: 0 0 0 3px rgba(206, 122, 26, 0.2);
}



Improve Page Rendering Speed using CSS

Consumers love fast websites because they take less time to load and show the desired content as quickly as possible. Even with a large amount of media and animations, the page should be snappy and fast.

Optimizing CSS for faster page loads

Using CSS properties to render web pages as fast as possible.


  1. Use different CSS properties to make the page as snappy as possible with less rendering time.

Work Process

Step 1: Take a Single Web Page or Make one

Designing a web page for demonstration is so much simpler with basic HTML files and CSS files linked with it for styling purposes. For this particular demonstration, I used a template website which is linked in the references below. I chose this template as it has many animations and sections.


Step 2: Looking for the parts that are below the viewport of our screen

Looking through the template, I found that everything below the carousel is outside the viewport, so it does not need to be rendered ahead of time; it can be rendered as soon as the user scrolls to that section.

Step 3: Add content-visibility property to the parts that are outside of the viewport

For everything outside the viewport (about, testimonials, etc.), I added the extra property content-visibility: auto.

content-visibility skips the rendering of elements that are outside the viewport and treats their height as 0px until the user scrolls to them, at which point they render at their original height. Without the property, the web page took approximately 100ms to render; after adding it, the rendering time dropped to about 82ms. It is small, but it is an improvement. The figures show before (left) and after (right) adding the property.

[Figure: rendering time before (left) and after (right) adding content-visibility]

As it is a fairly new property, it can be a bit janky in some browsers because it sets the height to 0px. To reserve a size before full rendering, contain-intrinsic-size: <size> is used, which gives the elements a specific size until they are rendered. Using this property reduces the rendering time a little more (below).

[Figure: rendering time after adding contain-intrinsic-size]

Step 4: Add will-change property to the parent element of animation/ transitions

By looking in the template of the website I see multiple blocks where there are transitions. I added the will-change property to the parent element of those blocks for less rendering time.


Step 5: Change the CSS import module by using link CSS tags inside the HTML document

If I use @import inside the .css files to import multiple other CSS files, the files are loaded sequentially, that is, one after another.

To further improve the loading time, we can reference those files with <link> tags in the HTML document so that they load in parallel rather than one after another.


There are numerous ways to reduce the render time, but the three mentioned gave us the best results, especially content-visibility with contain-intrinsic-size. One thing to keep in mind is that all of these reductions in render time were achieved without using any JavaScript.


How to Scrape Wikipedia Articles with Python

We are going to make a scraper that scrapes a Wikipedia page.

The scraper will go to a Wikipedia page and then follow a random link from it.

It should be fun to look at the pages the scraper ends up on.

Setting up the scraper:

Here, I will be using Google Colaboratory, but you can use PyCharm or any other platform you prefer for your Python coding.

I will be making a Colaboratory notebook named Wikipedia. If you use any other Python platform, you need to create a .py file with any name you like.

To make the HTTP requests, we will install the requests module available in Python.

pip install requests

 We will be using a wiki page for the starting point.

import requests

response = requests.get(url="https://en.wikipedia.org/wiki/Web_scraping")
print(response.status_code)


 When we run the above command, it will show 200 as a status code.


Okay!!! Now we are ready to step on the next thing!!

Extracting the data from the page:

We will be using Beautiful Soup to make our task easier. The first step is to install it.

pip install beautifulsoup4

Beautiful Soup allows you to find an element by its id attribute. Here, soup is the BeautifulSoup object built from the response.

title = soup.find(id="firstHeading")

Bringing everything together, our code will look like this:

[Screenshot: the complete script, which prints the article title]

When the program is run, the output is the title of the wiki article, i.e. Web scraping.
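The complete script appears only as a screenshot in the original, so here is a minimal sketch of what it could look like rather than the author's exact code. The function names get_article_title and fetch_article_title are hypothetical; the calls themselves (requests.get, BeautifulSoup, soup.find(id=...)) are the ones named in the text.

```python
def get_article_title(html):
    """Return the text of the element with id="firstHeading", or None if absent."""
    # Third-party import kept local so this file loads even where bs4 is missing.
    from bs4 import BeautifulSoup  # pip install beautifulsoup4

    soup = BeautifulSoup(html, "html.parser")
    heading = soup.find(id="firstHeading")
    return heading.get_text(strip=True) if heading else None


def fetch_article_title(url):
    """Download a page and return its article title."""
    import requests  # pip install requests

    response = requests.get(url, timeout=10)
    response.raise_for_status()  # raise on 4xx/5xx responses
    return get_article_title(response.content)
```

Calling print(fetch_article_title("https://en.wikipedia.org/wiki/Web_scraping")) should print the heading of that article. Splitting the parsing from the download makes the parsing step easy to try on a small HTML string without touching the network.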

Scraping other links:

Besides the title of the article, we now want to collect the links on the page.

We will grab an <a> tag that points to another Wikipedia article and scrape that page.

To do this, we collect all the <a> tags within the article and shuffle them.

Do not forget to import the random module.

[Screenshot: the code that collects and shuffles the links, with its output]

As you can see, the chosen link points to another Wikipedia article, IP address.
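The link-picking step can be sketched as a small function that works on a plain list of href values. This is an illustration under assumptions, not the code from the screenshot: the name pick_random_article_link is hypothetical, and treating every href that starts with "/wiki/" as an article link is a simplification.

```python
import random


def pick_random_article_link(hrefs, rng=random):
    """Return one random href that points inside Wikipedia, or None if there is none."""
    # Keep only internal links; <a> tags without an href show up as None.
    candidates = [href for href in hrefs if href and href.startswith("/wiki/")]
    if not candidates:
        return None
    rng.shuffle(candidates)  # shuffle as the text describes, then take the first
    return candidates[0]
```

With Beautiful Soup, the input list could be built as [a.get("href") for a in soup.find_all("a")], and the full URL of the next page is "https://en.wikipedia.org" plus the returned link. Passing a seeded random.Random instance as rng makes a run repeatable.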

Creating an endless scraper:

Now we have to make the scraper scrape the new links.

To do this, we move everything into a scrapeWikiArticle function.

[Screenshot: the scrapeWikiArticle function]

The scrapeWikiArticle function extracts the title and the links. It then calls itself on the chosen link, creating an endless cycle of scrapes that bounces around Wikipedia.

After running the program, we got:

[Screenshot: program output listing the visited article titles]

Wonderful! In only a few steps, we went from “Web scraping” to “Wikipedia articles with NLK identifiers”.
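A function that keeps calling itself will eventually hit Python's recursion limit and stop with a RecursionError. One possible alternative, sketched here with hypothetical names (random_walk, get_links, max_hops), is a loop that visits a bounded number of pages. The get_links argument stands for whatever code downloads a page and returns the article URLs found on it.

```python
import random


def random_walk(start_url, get_links, max_hops=10, rng=random):
    """Follow random links from start_url, visiting at most max_hops further pages.

    get_links(url) must return the list of article URLs found on that page.
    """
    visited = [start_url]
    url = start_url
    for _ in range(max_hops):
        # Skip pages we have already seen so the walk cannot circle forever.
        candidates = [link for link in get_links(url) if link not in visited]
        if not candidates:
            break  # dead end: nothing new to follow
        url = rng.choice(candidates)
        visited.append(url)
    return visited
```

Because the page-fetching code is passed in as a function, the walk itself can be tried on a small dictionary of fake pages before pointing it at the real site.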


We hope this article was useful and that you learned how to build a scraper that wanders through Wikipedia by following random links.

How to Code a Scraping Bot with Selenium and Python

Selenium is a powerful tool for controlling web browsers from code and automating browser tasks. It is also used from Python for scraping data, and it is especially useful when you need to interact with a page before collecting the data, which is the case we discuss in this article.

In this article, we will scrape investing.com to extract historical exchange rates of the dollar against one or more currencies.

There are other Python tools for extracting financial information. Here, however, we want to explore how Selenium helps with data extraction.

The Website we are going to Scrape:

Understanding the website is the first step before moving on.

The page contains historical data for the exchange rate of the dollar against the euro.

On this page, we find a table of historical data and a control for setting the date range we want.

That control is what we will be using.

We only want the exchange rates of other currencies against the dollar. If that is not your case, replace the “usd” in the URL.

The Scraper’s Code:

The first step is the imports: the Selenium classes, the sleep function to pause the code for some time, and pandas to manipulate the data when necessary.

[Screenshot: the import statements]

Now we will write the scraping function. The function takes:

  • A list of currency codes.
  • A start date.
  • An end date.
  • A boolean flag that controls whether the data is exported to a .csv file. We will use False as the default.

We want a scraper that collects data for multiple currencies, so we also initialise an empty list to store the scraped data.
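The shape of such a function can be sketched as follows. This is a skeleton under assumptions, not the author's code: the names get_currencies, build_url, and export_csv are hypothetical, and the URL pattern is the one that appears in the listing later in this article.

```python
BASE_URL = "https://br.investing.com/currencies/usd-{code}-historical-data"


def build_url(currency):
    """Return the historical-data page URL for one currency code, e.g. "EUR"."""
    return BASE_URL.format(code=currency.lower())


def get_currencies(currencies, start, end, export_csv=False):
    """Skeleton of the scraping function: loop over the codes, collect results."""
    frames = []  # one DataFrame per currency will be appended here
    for currency in currencies:
        url = build_url(currency)
        print(f"Would scrape {url} from {start} to {end} (export_csv={export_csv})")
        # The Selenium steps described in the rest of the article go here.
    return frames
```

Keeping the URL construction in its own function makes it easy to check that a currency code such as "EUR" produces the address you expect before any browser is started.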

[Screenshot: the function signature and the empty list]

The function receives the list of currencies, and our plan is to iterate over this list and get the data for each one.

For each currency, we create a URL, instantiate the driver object, and use it to get the page.

Then the browser window is maximized, but this is only visible when option.headless is set to False.

Otherwise, Selenium does all the work in the background without showing a window.

[Screenshot: the loop that builds the URL and opens the page]

Now we want to get the data for any time period.

Selenium provides convenient functionality for interacting with the website.

We will click on the date button, fill in the start and end dates we want, and then hit apply.

We will use WebDriverWait, expected_conditions, and By to make sure the driver waits for the elements we want to interact with.

The waiting time is 20 seconds, but you can set it to whatever suits you.

We select the date button by its XPath.

The same process is followed for start_bar, end_bar, and apply_button.

The start_bar field takes the date from which we want the data.

The end_bar field takes the date up to which we want the data.

Once both are filled in, we click apply_button.

[Screenshot: the code that sets the date range]

Now we use the pandas.read_html function to get all the tables on the page. We pass it the page source from the driver, and then we quit the driver.

[Screenshot: the code that reads the tables and quits the driver]

How to handle Exceptions In Selenium:

The data collection process is done. But Selenium is sometimes a little unstable and fails to perform the actions we ask for.

To handle this, we put the code in a try/except block, so that whenever it runs into a problem the except block is executed and the scrape is retried.

So the code will look like this:

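for currency in currencies:
    while True:
        try:
            # Opening the connection and grabbing the page
            my_url = f'https://br.investing.com/currencies/usd-{currency.lower()}-historical-data'
            option = Options()
            option.headless = False
            driver = webdriver.Chrome(options=option)
            driver.get(my_url)
            driver.maximize_window()

            # Clicking on the date button. The XPath strings are missing from this copy:
            # the *_XPATH names are placeholders, and EC (expected_conditions) is assumed.
            date_button = WebDriverWait(driver, 20).until(
                EC.element_to_be_clickable((By.XPATH, DATE_BUTTON_XPATH)))
            date_button.click()

            # Sending the start date (start and end are the function's date arguments)
            start_bar = WebDriverWait(driver, 20).until(
                EC.element_to_be_clickable((By.XPATH, START_BAR_XPATH)))
            start_bar.clear()
            start_bar.send_keys(start)

            # Sending the end date
            end_bar = WebDriverWait(driver, 20).until(
                EC.element_to_be_clickable((By.XPATH, END_BAR_XPATH)))
            end_bar.clear()
            end_bar.send_keys(end)

            # Clicking on the apply button
            apply_button = WebDriverWait(driver, 20).until(
                EC.element_to_be_clickable((By.XPATH, APPLY_BUTTON_XPATH)))
            apply_button.click()

            # Getting the tables on the page and quitting
            dataframes = pd.read_html(driver.page_source)
            driver.quit()
            print(f'{currency} scraped.')
            break
        except Exception:
            driver.quit()
            print(f'Failed to scrape {currency}. Trying again in 30 seconds.')
            sleep(30)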
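The loop above retries forever if a page never loads. One way to bound it, sketched here as a general helper rather than as part of the original scraper, is to wrap the fragile step in a function and retry it a fixed number of times. The names retry, action, attempts, and delay are hypothetical.

```python
import time


def retry(action, attempts=3, delay=30, sleep=time.sleep):
    """Call action() until it succeeds, at most `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as error:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the last error
            print(f"Attempt {attempt} failed ({error}). Trying again in {delay} seconds.")
            sleep(delay)
```

The scraping steps for one currency could then live in their own function and be called as retry(lambda: scrape_one(currency)), where scrape_one is again a hypothetical name. The sleep parameter exists only so the waiting can be switched off when trying the helper out.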



For each DataFrame in the dataframes list, we check whether the names match the table we are looking for. When they match, we append that DataFrame to the list we created at the beginning.

Then we export a .csv file if requested. This is the last step, and then the extraction is complete.

[Screenshot: the code that selects the table and exports it]
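The selection step can be sketched as below. It assumes that "the name matches" refers to the column names of the table, which the text does not state outright; the function name pick_table and the example column names are hypothetical.

```python
def pick_table(dataframes, expected_columns):
    """Return the first table whose column names equal expected_columns, else None."""
    wanted = list(expected_columns)
    for frame in dataframes:
        # Works with any object that has a .columns attribute, such as a DataFrame.
        if list(frame.columns) == wanted:
            return frame
    return None
```

With the list returned by pd.read_html, the call would look like pick_table(dataframes, ["Date", "Price"]), using whatever column names the page really shows. A matching table can then be written out with its to_csv method, for example table.to_csv("usd_eur.csv", index=False).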

Wrapping up:

That is all there is to extracting the data from the website. The code now gets the historical exchange rates of a list of currencies against the dollar and returns a list of DataFrames, along with several .csv files.


How To Scrape LinkedIn Public Company Data – Beginners Guide

LinkedIn is one of the largest professional social networking sites in the world, and it holds a wealth of information: industry insights, data on professionals, and job data.

Web scraping is a practical way to collect this data in bulk.

Why Scrape LinkedIn public data?

There are many reasons to scrape data from LinkedIn. The scraped data can be useful for a project you are working on, or for hiring: you can review many profiles and shortlist the people who best fit the company.

Scraping makes this less time-consuming, because it automates the search and gathers a large amount of data into a single file.

Another benefit of scraping is automating a job search. Every job site has thousands of openings for different kinds of jobs, which makes it tedious to find the ones in your own field. Scraping can help by applying filters and collecting all the relevant information on one page.

In this tutorial, we will scrape data from LinkedIn using basic Python and two Python packages, lxml and requests.

But first, you need to install the following:

  1. Python, available here (https://www.python.org/downloads/)
  2. Python requests, available here (http://docs.python-requests.org/en/master/user/install/)
  3. Python lxml (see how to install it here: http://lxml.de/installation.html)

Once you are done installing these, we will write the Python code to extract LinkedIn public data from company pages.

According to the original author, this code was written for Python 2, because a sys function it relied on is not supported in later versions.
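One step in the listing that follows is worth seeing in isolation: the company details are read from JSON that sits inside an HTML comment within a code element. The sketch below shows that idea with the standard library only. The function name extract_embedded_json is hypothetical, and the sample markup in the usage note is invented for illustration.

```python
import json
import re

# Non-greedy match so that only the first comment is captured.
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)


def extract_embedded_json(markup):
    """Parse the JSON stored in the first HTML comment of markup; {} if there is none."""
    match = COMMENT_RE.search(markup)
    if match is None:
        return {}
    return json.loads(match.group(1).strip())
```

For example, extract_embedded_json('<code id="x"><!-- {"companyName": "Acme"} --></code>') returns a dictionary whose "companyName" entry is "Acme". The re.DOTALL flag lets the pattern span line breaks, which the simpler pattern in the listing does not.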

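import json
import re
import sys
from importlib import reload

import lxml.html
import requests

HEADERS = {'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
           'accept-encoding': 'gzip, deflate, sdch',
           'accept-language': 'en-US,en;q=0.8',
           'upgrade-insecure-requests': '1',
           'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.85 Safari/537.36'}

# Create (or empty) the output file; it is reopened in append mode later
file = open('company_data.json', 'w')
file.close()

# Counter of scraped companies. Its initial value is missing from this copy; 0 is assumed.
COUNT = 0


def increment():
    global COUNT
    # The increment itself is missing from this copy; reconstructed from the function name
    COUNT += 1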


def fetch_request(url):
    # The try/except lines are missing from this copy. The three requests.get
    # calls that survive suggest three attempts, with '' returned if all fail.
    for _ in range(3):
        try:
            return requests.get(url, headers=HEADERS)
        except requests.RequestException:
            continue
    return ''

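def parse_company_urls(company_url):
    if company_url:
        if '/company/' in company_url:
            # Body missing from this copy; handing a company page to
            # parse_company_data is an assumption
            parse_company_data(company_url)
        else:
            parent_url = company_url
            fetch_company_url = fetch_request(company_url)
            if fetch_company_url:
                sel = lxml.html.fromstring(fetch_company_url.content)
                COMPANIES_XPATH = '//div[@class="section last"]/div/ul/li/a/@href'
                companies_urls = sel.xpath(COMPANIES_XPATH)
                if companies_urls:
                    if '/company/' in companies_urls[0]:
                        print('Parsing From Category ', parent_url)
                    for company_url in companies_urls:
                        # Loop body missing from this copy; recursing into
                        # each listed URL is an assumption
                        parse_company_urls(company_url)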




def parse_company_data(company_data_url):
    if company_data_url:
        fetch_company_data = fetch_request(company_data_url)
        # fetch_request returns '' on failure, so check it before reading status_code
        if fetch_company_data and fetch_company_data.status_code == 200:
            try:
                source = fetch_company_data.content.decode('utf-8')
                sel = lxml.html.fromstring(source)
                # CODE_XPATH = '//code[@id="stream-promo-top-bar-embed-id-content"]'
                # code_text = sel.xpath(CODE_XPATH).re(r'<!--(.*)-->')
                # The id argument is missing from this copy; it is taken from
                # the commented-out XPath above
                code_text = sel.get_element_by_id(
                    'stream-promo-top-bar-embed-id-content')
                if len(code_text) > 0:
                    code_text = str(code_text[0])
                    code_text = re.findall(r'<!--(.*)-->', str(code_text))
                    code_text = code_text[0].strip() if code_text else '{}'
                    json_data = json.loads(code_text)
                    if json_data.get('squareLogo', ''):
                        company_pic = 'https://media.licdn.com/mpr/mpr/shrink_200_200' + \
                                      json_data.get('squareLogo', '')
                    elif json_data.get('legacyLogo', ''):
                        company_pic = 'https://media.licdn.com/media' + \
                                      json_data.get('legacyLogo', '')
                    else:
                        company_pic = ''
                    company_name = json_data.get('companyName', '')
                    followers = str(json_data.get('followerCount', ''))
                    # CODE_XPATH = '//code[@id="stream-about-section-embed-id-content"]'
                    # code_text = sel.xpath(CODE_XPATH).re(r'<!--(.*)-->')
                    code_text = sel.get_element_by_id(
                        'stream-about-section-embed-id-content')
                if len(code_text) > 0:
                    code_text = str(code_text[0])
                    code_text = re.findall(r'<!--(.*)-->', str(code_text))
                    code_text = code_text[0].strip() if code_text else '{}'
                    json_data = json.loads(code_text)
                    company_industry = json_data.get('industry', '')
                    # The .encode('utf-8') wrappers are dropped here: on Python 3
                    # they would turn each value into a b'...' literal
                    item = {'company_name': str(company_name),
                            'followers': str(followers),
                            'company_industry': str(company_industry),
                            'logo_url': str(company_pic),
                            'url': str(company_data_url), }
                    # The lines that write the item are missing from this copy;
                    # one JSON object per line is an assumption
                    file = open('company_data.json', 'a')
                    file.write(json.dumps(item) + '\n')
                    file.close()
                    increment()
            except Exception as error:
                # The except clause is missing from this copy; reporting the error is an assumption
                print('Failed to parse', company_data_url, error)






fetch_company_dir = fetch_request('https://www.linkedin.com/directory/companies/')
if fetch_company_dir:
   print('Starting Company Url Scraping')
   sel = lxml.html.fromstring(fetch_company_dir.content)
   SUB_PAGES_XPATH = '//div[@class="bucket-list-container"]/ol/li/a/@href'
   sub_pages = sel.xpath(SUB_PAGES_XPATH)
   print('Company Category URL list')
   if sub_pages:
       for sub_page in sub_pages: