Python random choice no repeat – Random Choice of Random Module in Python with no Repeat

Random Choice of Random Module in Python with no Repeat

Python random choice no repeat: Given a lower limit and an upper limit, the task is to generate n random natural numbers within that range without repetition in Python.

Examples:

Example1:

Input:

Given N=13
Given lower limit range =19
Given upper limit range =45

Output:

The random numbers present in the range from 19 to 45 are :
28 40 24 25 20 44 38 29 21 31 43

Example2:

Input:

Given N=19
Given lower limit range =23
Given upper limit range =41

Output:

The random numbers present in the range from 23 to 41 are : 26 27 40 38 37 41 30 35 36 23 25

Random choice of Random module in Python with no Repeat

Below are the ways to generate n natural numbers which are not repeating in Python.

Method #1: Using For Loop and randint function (Static Input)

Approach:

  • Import the random module using the import keyword.
  • Give the number n as static input and store it in a variable.
  • Give the lower limit range and upper limit range as static input and store them in two separate variables.
  • Take an empty list (say rndmnumbs) and initialize it with an empty list using [] or list().
  • Loop till n times using For loop.
  • Generate a random number using randint(lowerlimitrange,upperlimitrange) and store it in a variable.
  • Check whether the above random number is present in the list or not using the not in operator.
  • If it is not in the list, append it to the rndmnumbs list using the append() function. A repeated draw is skipped rather than retried, so the final list may contain fewer than n numbers.
  • Print the rndmnumbs.
  • Exit the program.

Below is the implementation:

# Import the random module using the import keyword.
import random
# Give the number n as static input and store it in a variable.
numbe = 13
# Give the lower limit range and upper limit range as static input
# and store them in two separate variables.
lowerlimitrange = 19
upperlimitrange = 45
# Take an empty list (say rndmnumbs) and initialize it with an empty list
# using [] or list().
rndmnumbs = []
# Loop till n times using For loop.
for m in range(numbe):
    # Generate a random number using randint(lowerlimitrange,upperlimitrange)
    # and store it in a variable.
    randomnumbe = random.randint(lowerlimitrange, upperlimitrange)
    # Check whether the above random number is present in the list or not
    # using not in operator.
    if randomnumbe not in rndmnumbs:
        # If it is not in the list then append the element
        # to the rndmnumbs list using the append() function.
        rndmnumbs.append(randomnumbe)

# Print the rndmnumbs
print('The random numbers present in the range from',
      lowerlimitrange, 'to', upperlimitrange, 'are :')
for q in rndmnumbs:
    print(q, end=' ')

Output:

The random numbers present in the range from 19 to 45 are :
28 40 24 25 20 44 38 29 21 31 43

Method #2: Using For Loop and randint function (User Input)

Approach:

  • Import the random module using the import keyword.
  • Give the number n as user input using int(input()) and store it in a variable.
  • Give the lower limit range and upper limit range as user input using int(input()).
  • Store them in two separate variables.
  • Take an empty list (say rndmnumbs) and initialize it with an empty list using [] or list().
  • Loop till n times using For loop.
  • Generate a random number using randint(lowerlimitrange,upperlimitrange) and store it in a variable.
  • Check whether the above random number is present in the list or not using the not in operator.
  • If it is not in the list, append it to the rndmnumbs list using the append() function. A repeated draw is skipped rather than retried, so the final list may contain fewer than n numbers.
  • Print the rndmnumbs.
  • Exit the program.

Below is the implementation:

# Import the random module using the import keyword.
import random
# Give the number n as user input using int(input()) and store it in a variable.
numbe = int(input('Enter some random number = '))
# Give the lower limit range and upper limit range as user input
# using int(input()).
# Store them in two separate variables.
lowerlimitrange = int(input('Enter random lower limit range = '))
upperlimitrange = int(input('Enter random upper limit range = '))
# Take an empty list (say rndmnumbs) and initialize it with an empty list
# using [] or list().
rndmnumbs = []
# Loop till n times using For loop.
for m in range(numbe):
    # Generate a random number using randint(lowerlimitrange,upperlimitrange)
    # and store it in a variable.
    randomnumbe = random.randint(lowerlimitrange, upperlimitrange)
    # Check whether the above random number is present in the list or not
    # using not in operator.
    if randomnumbe not in rndmnumbs:
        # If it is not in the list then append the element
        # to the rndmnumbs list using the append() function.
        rndmnumbs.append(randomnumbe)

# Print the rndmnumbs
print('The random numbers present in the range from',
      lowerlimitrange, 'to', upperlimitrange, 'are :')
for q in rndmnumbs:
    print(q, end=' ')

Output:

Enter some random number = 19
Enter random lower limit range = 23
Enter random upper limit range = 41
The random numbers present in the range from 23 to 41 are :
26 27 40 38 37 41 30 35 36 23 25
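
Because the loop above skips a repeated draw instead of retrying, it can return fewer than n numbers. As a hedged alternative, the standard library's random.sample() draws n distinct values in one call. The function name unique_random_numbers below is only illustrative and is not part of the original programs.

```python
import random


def unique_random_numbers(count, lower, upper):
    """Return `count` distinct random integers from lower..upper (inclusive)."""
    population = range(lower, upper + 1)
    if count > len(population):
        # There are not enough distinct values in the range to satisfy the request.
        raise ValueError("count is larger than the number of values in the range")
    # random.sample() picks without replacement, so no value repeats.
    return random.sample(population, count)


numbers = unique_random_numbers(13, 19, 45)
print(*numbers)
```

The values differ on every run, but the list always holds exactly the requested number of distinct values.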

Get first key in dictionary python – Get first key-value pair from a Python Dictionary

Finding first key-value pair from a Dictionary in Python

Get first key in dictionary python: In this article, we will look at different methods for fetching the first key-value pair of a dictionary. We will also discuss how to get the first N pairs, or any range of key-value pairs, from a dictionary.

Getting first key-value pair of dictionary in python using iter() & next() :

In Python, the iter() function creates an iterator over the sequence of key-value pairs returned by the dictionary's items() method, and calling next() on that iterator returns the first key-value pair.

# Program :

dict_eg = {
    'Sachin' : 10,
    "Gayle"  : 333,
    'Kohil'  : 18,
    'Murali' : 800,
    'Dhoni'  : 7,
    'AB'     : 17
}
# Get the first key-value pair in dictionary
first_pair = next(iter(dict_eg.items()))
print('The first Key Value Pair in the Dictionary is:')
print(first_pair)
print('First Key: ', first_pair[0])
print('First Value: ', first_pair[1])
Output :
The first Key Value Pair in the Dictionary is:
('Sachin', 10)
First Key:  Sachin
First Value:  10
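
When only the first key is needed, or the dictionary might be empty, a small variation can help. This is a minimal sketch: the helper name first_key is hypothetical, and it relies on dictionaries preserving insertion order (Python 3.7 and later).

```python
def first_key(mapping, default=None):
    """Return the first key of a dictionary, or `default` if it is empty."""
    # Iterating a dict yields its keys; next() takes a fallback value
    # that is returned instead of raising StopIteration.
    return next(iter(mapping), default)


dict_eg = {'Sachin': 10, 'Gayle': 333, 'Kohil': 18}
print(first_key(dict_eg))
print(first_key({}, default='no keys'))
```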

Get first key-value pair of dictionary using list :

In Python, a dictionary's items() method returns an iterable view of all its key-value pairs. By creating a list from that view and selecting its first item, we get the first key-value pair of the dictionary.

# Program :

dict_eg = {
    'Sachin' : 10,
    "Gayle"  : 333,
    'Kohil'  : 18,
    'Murali' : 800,
    'Dhoni'  : 7,
    'AB'     : 17
}
# Get the first key-value pair in dictionary
first_pair = list(dict_eg.items())[0]
print('First Key Value Pair of Dictionary:')
print(first_pair)
print('Key: ', first_pair[0])
print('Value: ', first_pair[1])
Output :
First Key Value Pair of Dictionary:
('Sachin', 10)
Key:  Sachin
Value:  10

Getting the first N key-value pair of dictionary in python using list & slicing :

Using a similar process, we create a list of key-value pairs from the dictionary. We can then get the first ‘N’ items with list[:N], or any range of items with list[start:end].

# Program :

dict_eg = {
    'Sachin' : 10,
    "Gayle"  : 333,
    'AB'     : 17,
    'Murali' : 800,
    'Dhoni'  : 7,
    'Kohil'  : 18
}
n = 5
# Get first 5 pairs of key-value pairs
firstN_pairs = list(dict_eg.items())[:n]
print('The first 5 Key Value Pairs of Dictionary are:')
for key,value in firstN_pairs:
    print(key, '::', value)
Output :
The first 5 Key Value Pairs of Dictionary are:
Sachin :: 10
Gayle :: 333
AB :: 17
Murali :: 800
Dhoni :: 7

Getting the first N key-value pair of dictionary in python using itertools :

We can take the first ‘N’ entries from an iterable with itertools.islice(iterable, stop), applied to the key-value pairs returned by the items() method.

# Program :

import itertools
dict_eg = {
    'Sachin' : 10,
    "Gayle"  : 333,
    'AB'     : 17,
    'Murali' : 800,
    'Dhoni'  : 7,
    'Kohil'  : 18
}
n = 5
# Get first 5 pairs of key-value pairs
firstN_pairs = itertools.islice(dict_eg.items(), n)
print('The first 5 Key Value Pairs of Dictionary are:')
for key,value in firstN_pairs:
    print(key, '::', value)
Output :
The first 5 Key Value Pairs of Dictionary are:
Sachin :: 10
Gayle :: 333
AB :: 17
Murali :: 800
Dhoni :: 7
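
If the first N pairs are needed as a dictionary again rather than as a sequence, the islice() result can be passed straight to dict(). A minimal sketch, assuming Python 3.7 or later so that insertion order is preserved; the name first_n_items is illustrative only.

```python
import itertools


def first_n_items(mapping, n):
    """Return a new dict holding the first n key-value pairs of `mapping`."""
    # islice() stops after n pairs without copying the whole dictionary.
    return dict(itertools.islice(mapping.items(), n))


dict_eg = {'Sachin': 10, 'Gayle': 333, 'AB': 17, 'Murali': 800}
print(first_n_items(dict_eg, 2))
```

Asking for more pairs than the dictionary holds simply returns all of them.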

Python decimal module – Python decimal Module with Examples

decimal Module with Examples

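Python decimal module: Binary floating-point numbers cannot represent many decimal fractions exactly, which leads to small rounding errors in calculations. The Python decimal module provides decimal arithmetic with user-controlled precision, along with a set of mathematical functions that work on decimal numbers.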

decimal Module in Python:

The Python decimal module includes a number of functions for dealing with numeric data and performing various mathematical operations on it. Using the decimal module, we can efficiently handle decimal numbers throughout our program.

The decimal module contains utilities for controlling and overcoming the precision problem in decimal values.

Before using the module, we should import it first.

import decimal

Functions Present in decimal Module

Arithmetic operations can be carried out on decimal data with a chosen precision.

Decimal numbers are defined using the Decimal() constructor, as shown below:

Syntax:

import decimal
variable = decimal.Decimal(decimal_number)

Set the prec attribute of the current context, returned by decimal.getcontext(), to control the precision (number of significant digits) of the results of arithmetic operations.

Syntax:

decimal.getcontext().prec = precision_value
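
Assigning to getcontext().prec changes the precision for all later calculations that use the current context. When only one calculation needs a different precision, decimal.localcontext() can be used instead. The sketch below is one way to do that; the variable names are illustrative.

```python
import decimal

# localcontext() gives a temporary copy of the current context;
# changes made inside the with-block are undone when it ends.
with decimal.localcontext() as ctx:
    ctx.prec = 4
    short_result = decimal.Decimal(1) / decimal.Decimal(3)

# Outside the block the previous precision applies again.
full_result = decimal.Decimal(1) / decimal.Decimal(3)

print(short_result)
print(full_result)
```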

1)exp() Function:

This method returns the exponential value (e^x) of the given decimal number.

Syntax:

decimal.Decimal(decimal_num).exp()

Example

# Import decimal module using the import keyword
import decimal
# Set the precision (number of significant digits) by assigning
# to the prec attribute of the current context.
decimal.getcontext().prec = 4
# Give the first decimal number as static input using the Decimal() function
# and store it in a variable.
fst_num = decimal.Decimal(15.53)
# Give the second decimal number as static input and store it in another variable.
scnd_num = decimal.Decimal(10.4)
# Calculate the sum of the given two decimal numbers and
# store it in another variable.
rslt_sum = fst_num + scnd_num
# Get the exponential value of the result sum obtained using the exp() function.
# (e^rslt_sum)
# Store it in a variable
exp_val = rslt_sum.exp()
# Pass a float to the Decimal() constructor without doing any arithmetic on it.
# The float's exact binary value is converted and the precision is not applied.
nomath_samplnum = decimal.Decimal(5.132)
# Print the Sum of the given two decimal numbers
print("The Sum of the given two decimal numbers = ", rslt_sum)
# Print the Exponential value of the result sum obtained
print("The Exponential value of the result sum obtained = ", exp_val)
# Print the above given no math sample number.
print("The above given no math sample number = ")
print(nomath_samplnum)

Output:

The Sum of the given two decimal numbers =  25.93
The Exponential value of the result sum obtained =  1.825E+11
The above given no math sample number = 
5.1319999999999996731503415503539144992828369140625

Have you noticed that the sum and its exponential value have only 4 significant digits? This is due to the precision value that we specified.

One thing to keep in mind is that the precision applies only to the results of arithmetic operations, not to the creation of a Decimal. This is why the “nomath_samplnum” variable, which was created directly from the float 5.132, prints the float's full binary expansion.
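
The long tail of digits appears because 5.132 is first stored as a binary float and Decimal() then converts that float exactly. Passing the number as a string avoids this. A short sketch of the difference:

```python
import decimal

from_float = decimal.Decimal(5.132)     # converts the binary float exactly
from_string = decimal.Decimal('5.132')  # keeps the digits as written

print(from_float)
print(from_string)
# The two values are close but not equal.
print(from_float == from_string)
```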

2)sqrt() Function:

The sqrt() function computes and returns the square root value of the given decimal number.

Syntax:

decimal.Decimal(decimal_num).sqrt()

Example

# Import decimal module using the import keyword
import decimal
# Set the precision (number of significant digits) by assigning
# to the prec attribute of the current context.
decimal.getcontext().prec = 4
# Give the decimal number as static input using the Decimal() function
# and store it in a variable.
gvn_num = decimal.Decimal(25.54)
# Get the square root value of the given decimal number using the sqrt() function.
# Store it in another variable
rslt_sqrtval = gvn_num.sqrt()
# Print the given decimal number.
print("The given decimal number = ", gvn_num)
# Print the square root value of the given decimal number.
print("The given decimal number's square root value = ", rslt_sqrtval)

Output:

The given decimal number =  25.53999999999999914734871708787977695465087890625
The given decimal number's square root value =  5.054

3)Logarithmic functions( ln(), log10() )

Decimal objects provide the following methods for calculating logarithmic values–

  • ln()
  • log10()

The ln() method returns the natural logarithm (base e) of the given decimal number.

Syntax:

decimal.Decimal(decimal_num).ln()

The log10() method returns the base 10 logarithm of the given decimal number.

Syntax:

decimal.Decimal(decimal_num).log10()

Example

# Import decimal module using the import keyword
import decimal
# Set the precision (number of significant digits) by assigning
# to the prec attribute of the current context.
decimal.getcontext().prec = 4
# Give the decimal number as static input using the Decimal() function
# and store it in a variable.
gvn_num = decimal.Decimal(25.54)
# Calculate the natural Logarithmic value of the given decimal number
# using the ln() function.
# Store it in another variable
log_val = gvn_num.ln()
# Calculate the base 10 Logarithmic value of the given decimal number
# using the log10() function.
# Store it in another variable
base10_logval = gvn_num.log10()
# Print the given decimal number natural log value
print("The given decimal number natural log value = ", log_val)
# Print the given decimal number's base 10 log value
print("The given decimal number's base 10 log value = ", base10_logval)

Output:

The given decimal number natural log value =  3.240
The given decimal number's base 10 log value =  1.407

4)compare() Function in Python

The compare() method compares two decimal numbers and returns a Decimal value based on the following conditions–

  • If the first decimal number is less than the second decimal number, it returns -1.
  • If the first decimal number is greater than the second decimal number, it returns 1.
  • If both decimal numbers are equal, it returns 0.

Example

# Import decimal module using the import keyword
import decimal
# Set the precision (number of significant digits) by assigning
# to the prec attribute of the current context.
decimal.getcontext().prec = 4
# Give the first decimal number as static input using the Decimal() function
# and store it in a variable.
fst_num = decimal.Decimal(15.53)
# Give the second decimal number as static input and store it in another variable.
scnd_num = decimal.Decimal(10.4)
# Compare both the given numbers using compare() function and
# store it in another variable.
comparisn_rslt = fst_num.compare(scnd_num)
# Print the comparison result.
print("The comparison result = ", comparisn_rslt)

Output:

The comparison result =  1
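
The three cases listed above can be checked directly. Note that compare() returns its result as a Decimal rather than an int. The values below are arbitrary and are written as strings so that they are represented exactly.

```python
import decimal

small = decimal.Decimal('10.4')
large = decimal.Decimal('15.53')

print(small.compare(large))   # first is less than second
print(large.compare(small))   # first is greater than second
print(small.compare(small))   # both are equal

# The result is itself a Decimal object.
print(type(small.compare(large)))
```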

5)copy_abs() function:

The copy_abs() method returns the absolute value of the signed decimal number it is called on.

Syntax:

decimal.Decimal(signed_decimal_num).copy_abs()

Example

# Import decimal module using the import keyword
import decimal
# Give the decimal number as static input using the Decimal() function
# and store it in a variable.
gvn_num = decimal.Decimal(-16.345)
# Get the absolute value of the given number using the copy_abs() function
# and store it in another variable.
abs_val = gvn_num.copy_abs()
# Print the given decimal number's Absolute value.
print("The given decimal number's Absolute value = ", abs_val)

Output:

The given decimal number's Absolute value =  16.344999999999998863131622783839702606201171875

6)Min and Max Functions

Decimal objects provide the methods listed below for finding the minimum and maximum of two decimal numbers.

  • The min() function returns the smallest/minimum of two decimal numbers.
  • The max() function returns the greatest/maximum of the two decimal numbers.

min() Function Syntax:

decimalnum_1.min(decimalnum_2)

max() function Syntax:

decimalnum_1.max(decimalnum_2)

Example

# Import decimal module using the import keyword
import decimal
# Give the first decimal number as static input using the Decimal() function
# and store it in a variable.
fst_num = decimal.Decimal(15.53)
# Give the second decimal number as static input and store it in another variable.
scnd_num = decimal.Decimal(10.4)
# Get the smallest value of the given two decimal numbers using the min() function
# Store it in another variable.
minimm_num = fst_num.min(scnd_num)
# Print the smallest value of the given two decimal numbers
print("The smallest value of the given two decimal numbers = ", minimm_num)
# Get the greatest value of the given two decimal numbers using the max() function
# Store it in another variable.
maxim_num = fst_num.max(scnd_num)
# Print the greatest value of the given two decimal numbers
print("The greatest value of the given two decimal numbers = ", maxim_num)

Output:

The smallest value of the given two decimal numbers =  10.40000000000000035527136788
The greatest value of the given two decimal numbers =  15.52999999999999936051153782

7)Logical Operations

The Decimal class includes built-in methods for performing logical operations such as AND, OR, and XOR. Both operands must be made up only of the digits 0 and 1, and the operation is applied digit by digit.

logical_and():
The method logical_and() performs a logical AND operation on two decimal values and returns the result.

logical_or():
The method logical_or() performs a logical OR operation on two decimal values and returns the result.

logical_xor():
The method logical_xor() performs a logical XOR operation on two decimal values and returns the result.

Syntax:

# logical_and()
decimalnum_1.logical_and(decimalnum_2)

# logical_or()
decimalnum_1.logical_or(decimalnum_2)

# logical_xor
decimalnum_1.logical_xor(decimalnum_2)

Example:

# Import decimal module using the import keyword
import decimal
# Give the first decimal number as static input using the Decimal() function
# and store it in a variable.
fst_num = decimal.Decimal(1100)
# Give the second decimal number as static input and store it in another variable.
scnd_num = decimal.Decimal(1010)
# Calculate the logical AND value of the given two numbers using the logical_and()
# function and  store it in another variable.
logicl_AND = fst_num.logical_and(scnd_num)
print("The given two decimal number's logical AND value = ", logicl_AND)
# Calculate the logical OR value of the given two numbers using the logical_or()
# function and  store it in another variable.
logicl_OR = fst_num.logical_or(scnd_num)
print("The given two decimal number's logical OR value =  ", logicl_OR)
# Calculate the logical XOR value of the given two numbers using the logical_xor()
# function and  store it in another variable.
logicl_XOR = fst_num.logical_xor(scnd_num)
print("The given two decimal number's logical XOR value = ", logicl_XOR)

Output:

The given two decimal number's logical AND value =  1000
The given two decimal number's logical OR value =   1110
The given two decimal number's logical XOR value =  110
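
These methods treat each operand as a string of 0 and 1 digits and combine them digit by digit. As an illustration (not part of the original example), the same results can be reproduced with Python's integer bitwise operators on the corresponding binary values:

```python
import decimal

fst = decimal.Decimal(1100)
scnd = decimal.Decimal(1010)

# Interpret the same digit strings as binary integers.
a = int('1100', 2)
b = int('1010', 2)

# Each line prints the Decimal result next to the binary form of the int result.
print(fst.logical_and(scnd), format(a & b, 'b'))
print(fst.logical_or(scnd), format(a | b, 'b'))
print(fst.logical_xor(scnd), format(a ^ b, 'b'))
```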

Python Program to Write to an Excel File Using openpyxl Module

Python Program to Write into an Excel File Using openpyxl Library

In this article we are going to see how we can write to an Excel sheet using Python. To do this we will use the openpyxl library, which is used to read and write Excel files.

Python Program to Write to an Excel File Using openpyxl Module

In Python, the openpyxl library is used to create, modify, read and write Excel files in formats such as xlsx/xlsm/xltx/xltm. When a user works with thousands of records in an Excel file and wants to pick out a few useful pieces of information or change a few records, this can be done easily with the openpyxl library.

To use openpyxl library we first have to install it using pip.

Command : pip install openpyxl

After installation we can use the library to create and modify excel files.

Let’s see different programs to understand it more clearly.

Program-1: Python Program to Print the Name of an Active Sheet Using openpyxl Module

Approach:

  • First of all we have to import the openpyxl module
  • Then we use the Workbook() class to create a workbook object
  • From the object we created, we extract the active sheet from the active attribute
  • Then we use the title attribute of the active sheet to fetch its title and print it.

Program:

# Import the openpyxl library
import openpyxl as opxl

# We will need to create a blank workbook first
# For that we can use the Workbook() class available in the openpyxl library
workb = opxl.Workbook()

# Get the active workbook sheet from the active attribute
activeSheet = workb.active

# Extracting the sheet title from the activeSheet object
sheetTitle = activeSheet.title

# Printing the sheet title
print("The active sheet name is : " + sheetTitle)

Output:

The active sheet name is : Sheet

Program-2: Python Program to Update the Name of an Active Sheet Using openpyxl Module

Approach:

  • First of all we have to import the openpyxl module.
  • Then we use the Workbook() class to create a workbook object.
  • From the object we created, we extract the active sheet from the active attribute.
  • Then we use the title attribute of the active sheet to fetch its title and print it.
  • Now we store the new name in the title attribute of the active sheet.
  • Finally we fetch its title and print it.

Program:

# Import the openpyxl library
import openpyxl as opxl

# We will need to create a blank workbook first
# For that we can use the Workbook() class available in the openpyxl library
workb = opxl.Workbook()

# Get the active workbook sheet from the active attribute
activeSheet = workb.active

# Fetching the sheet title from the activeSheet object
sheetTitle = activeSheet.title

# Printing the sheet title
print("The active sheet name is : " + sheetTitle)

# Updating the active sheet name
activeSheet.title = "New_Sheet_Name"

# Fetching the sheet title from the activeSheet object
sheetTitle = activeSheet.title

# Printing the new sheet title
print("The active sheet name after update is : " + sheetTitle)

Output:

The active sheet name is : Sheet
The active sheet name after update is : New_Sheet_Name

Program-3: Python Program to Write into an Excel Sheet Using openpyxl Module

Approach:

  • First of all we have to import the openpyxl module.
  • Then we use the Workbook() class to create a workbook object.
  • From the object we created, we extract the active sheet from the active attribute.
  • Then we create cell objects from the active sheet object that store the row and column coordinates. These cells can be accessed using row and column values or just the cell name, like A1 (for row = 1 and column = 1).
  • Store some values in those cells.
  • Save the file using save( ) to make the changes permanent.
  • Now open the excel sheet to find the changes.

Program:

# Import the openpyxl library
import openpyxl as opxl

# We will need to create a blank workbook first
# For that we can use the Workbook() class available in the openpyxl library
workb = opxl.Workbook()

# Get the active workbook sheet from the active attribute
activeSheet = workb.active

# Creating a cell object that contains attributes about rows, columns
# and coordinates to locate the cell
cell1 = activeSheet.cell(row=1, column=1)
cell2 = activeSheet.cell(row=2, column=1)

# Adding values to the cells
cell1.value = "Hi"
cell2.value = "Hello"

# Rather than writing the row and column number,
# we can also access the cells by their individual names
# C1 means third column, 1st row
cell3 = activeSheet['C1']
cell3.value = "Gracias"

# C2 means third column, 2nd row
cell3 = activeSheet['C2']
cell3.value = "Bye"

# Finally we have to save the file to save our changes
workb.save("E:\\Article\\Python\\file1.xlsx")

Output:

The saved file1.xlsx contains Hi in cell A1, Hello in A2, Gracias in C1 and Bye in C2.
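
For writing several rows at once, the worksheet's append() method adds a list of values as the next row. The following is a minimal sketch, assuming openpyxl is installed. It saves to a temporary folder instead of the E:\ path used above, and the sample values are made up for illustration.

```python
import os
import tempfile

import openpyxl

workbook = openpyxl.Workbook()
sheet = workbook.active

# Each call to append() writes one list of values as the next empty row.
sheet.append(["Greeting", "Language"])
sheet.append(["Hi", "English"])
sheet.append(["Gracias", "Spanish"])

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "file1.xlsx")
    workbook.save(path)
    # Load the file again to confirm that the rows were written.
    reloaded = openpyxl.load_workbook(path)
    first_cell = reloaded.active["A1"].value
    row_count = reloaded.active.max_row
    reloaded.close()

print(first_cell, row_count)
```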

Program-4: Python Program to Add more Sheets to the Active Workbook Using openpyxl Module

Approach:

  • First of all we have to import the openpyxl module.
  • Then we use the Workbook() class to create a workbook object.
  • From the object we created, we extract the active sheet from the active attribute.
  • Then create a new sheet by using create_sheet() method.
  • Save the file by specifying the path.

Program:

# Import the openpyxl library
import openpyxl as opxl

# We will need to create a blank workbook first
# For that we can use the Workbook() class available in the openpyxl library
workb = opxl.Workbook()

# Get the active workbook sheet from the active attribute
activeSheet = workb.active

# To add more sheets into the workbook we have to use the create_sheet() method
workb.create_sheet(index=1, title="2nd sheet")

# Finally we have to save the file to save our changes
workb.save("E:\\Article\\Python\\file1.xlsx")

Output:

The saved file1.xlsx contains two sheets: the default sheet named Sheet and the new sheet named 2nd sheet.

Python Program to Read an Excel File Using Openpyxl Module

Python Program to Read an Excel File Using openpyxl Library

In this article we are going to see how we can read Excel sheets using Python. To do this we will use the openpyxl library, which is used to read and write Excel files.

Python Program to Read an Excel File Using Openpyxl Module

In Python, the openpyxl library is used to create, modify, read and write Excel files in formats such as xlsx/xlsm/xltx/xltm. When a user works with thousands of records in an Excel file and wants to pick out a few useful pieces of information or change a few records, this can be done easily with the openpyxl library.

To use openpyxl library we first have to install it using pip.

Command : pip install openpyxl

After installation we can use the library to create and modify excel files.

Let’s see different programs to understand it more clearly.

Input File:

file1.xlsx, a sheet whose first row holds the column names Name and Regd. No (the original screenshot is not reproduced here).

Program-1: Python Program to Print a Particular Cell Value of Excel File Using Openpyxl Module

Approach:

  • First of all we have to import the openpyxl module
  • Store the path to the excel workbook in a variable
  • Load the workbook using the load_workbook( ) function passing the path as a parameter
  • From the workbook object we created, we extract the active sheet from the active attribute
  • Then we create cell objects from the active sheet object
  • Print the value from the cell using the value attribute of the cell object

Program:

# Import the openpyxl library
import openpyxl as opxl

# Path to the excel file
path = "E:\\Article\\Python\\file1.xlsx"

# Created a workbook object that loads the workbook present
# at the path provided
wb = opxl.load_workbook(path)

# Getting the active workbook sheet from the active attribute
activeSheet = wb.active

# Created a cell object from the active sheet using the cell name
cell1 = activeSheet['A2']

# Printing the cell value
print(cell1.value)

Output:

Sejal

Program-2: Python Program to Print Total Number of Rows in Excel File Using Openpyxl Module

Approach:

  • First of all we have to import the openpyxl module
  • Store the path to the excel workbook in a variable
  • Load the workbook using the load_workbook( ) function passing the path as a parameter
  • From the workbook object we created, we extract the active sheet from the active attribute
  • Then we print the number of rows using the max_row attribute of the sheet object

Program:

# Import the openpyxl library
import openpyxl as opxl

# Path to the excel file
path = "E:\\Article\\Python\\file1.xlsx"

# Created a workbook object that loads the workbook present
# at the path provided
wb = opxl.load_workbook(path)

# Getting the active workbook sheet from the active attribute
activeSheet = wb.active

# Printing the number of rows in the sheet
print("Number of rows : ", activeSheet.max_row)

Output:

Number of rows :  7

Program-3: Python Program to Print Total Number of Columns in Excel File Using Openpyxl Module

Approach:

  • First of all we have to import the openpyxl module
  • Store the path to the excel workbook in a variable
  • Load the workbook using the load_workbook( ) function passing the path as a parameter
  • From the workbook object we created, we extract the active sheet from the active attribute
  • Then we print the number of columns using the max_column attribute of the sheet object

Program:

# Import the openpyxl library
import openpyxl as opxl

# Path to the excel file
path = "E:\\Article\\Python\\file1.xlsx"

# Created a workbook object that loads the workbook present
# at the path provided
wb = opxl.load_workbook(path)

# Getting the active workbook sheet from the active attribute
activeSheet = wb.active

# Printing the number of columns in the sheet
print("Number of columns : ", activeSheet.max_column)

Output:

Number of columns :  2

Program-4: Python Program to Print All Column Names of Excel File Using Openpyxl Module

Approach:

  • First of all we have to import the openpyxl module
  • Store the path to the excel workbook in a variable
  • Load the workbook using the load_workbook( ) function passing the path as a parameter
  • From the workbook object we created, we extract the active sheet from the active attribute
  • Then we find and store the number of columns in a variable cols
  • We run a for loop from 1 to cols+1 that creates cell objects and prints their value

Program:

# Import the openpyxl library
import openpyxl as opxl

# Path to the excel file
path = "E:\\Article\\Python\\file1.xlsx"

# Created a workbook object that loads the workbook present
# at the path provided
wb = opxl.load_workbook(path)

# Getting the active workbook sheet from the active attribute
activeSheet = wb.active

# Number of columns
cols = activeSheet.max_column

# Printing the column names using a for loop
for i in range(1, cols + 1):
    currCell = activeSheet.cell(row=1, column=i)
    print(currCell.value)

Output:

Name
Regd. No

Program-5: Python Program to Print First Column Value of Excel File Using Openpyxl Module

Approach:

  • First of all we have to import the openpyxl module.
  • Store the path to the excel workbook in a variable.
  • Load the workbook using the load_workbook( ) function passing the path as a parameter.
  • From the workbook object we created, we extract the active sheet from the active attribute.
  • Then we find and store the number of rows in a variable rows.
  • We run a for loop from 1 to rows+1 that creates cell objects and prints their value.

Program:

# Import the openpyxl library
import openpyxl as opxl

# Path to the excel file
path = "E:\\Article\\Python\\file1.xlsx"

# Created a workbook object that loads the workbook present
# at the path provided
wb = opxl.load_workbook(path)

# Getting the active workbook sheet from the active attribute
activeSheet = wb.active

# Number of rows
rows = activeSheet.max_row

# Printing the first column values using for loop
for i in range(1, rows + 1):
    currCell = activeSheet.cell(row=i, column=1)
    print(currCell.value)

Output:

Name
Sejal
Abhijit
Ruhani
Rahim
Anil
Satyam
Pushpa

Program-6: Python Program to Print a Particular Row Value of Excel File Using Openpyxl Module

Approach:

  • First of all we have to import the openpyxl module.
  • Store the path to the excel workbook in a variable.
  • Load the workbook using the load_workbook( ) function passing the path as a parameter.
  • From the workbook object we created, we extract the active sheet from the active attribute.
  • We use a variable rowNum to store the row number we want to read values from and a cols variable that stores the total number of columns.
  • We run a for loop from 1 to cols+1 that creates cell objects of the specified rows and prints their value.

Program:

# Import the openpyxl library
import openpyxl as opxl

# Path to the excel file
path = "E:\\Article\\Python\\file1.xlsx"

# Created a workbook object that loads the workbook present
# at the path provided
wb = opxl.load_workbook(path)

# Getting the active workbook sheet from the active attribute
activeSheet = wb.active

# Number of columns
cols = activeSheet.max_column

# The row number we want to print from
rowNum = 2

# Printing the row
for i in range(1, cols + 1):
    currCell = activeSheet.cell(row=rowNum, column=i)
    print(currCell.value)

Output:

Sejal
19012099

 

 

Python – Variables

Python Variables

Python is not a “statically typed” language. We do not need to declare variables or their types before using them. A variable is created the first time a value is assigned to it. A variable is a name that refers to a memory location. It is the fundamental storage unit in a program.

In this post, we’ll go over what you need to know about variables in Python.

Variables in Python Language

1)Variable

Variables are names for memory locations that store values. When you create a variable, memory is set aside for the value it refers to.

The interpreter allocates memory based on the data type of the value. As a result, variables can hold integers, decimals, or characters simply by assigning values of those types to them.

2)Important points about variables

  • In Python we don’t have to specify the type when defining a variable, unlike in other programming languages such as C++ or Java. Python infers the type implicitly from the value assigned to the variable.
  • During program execution, the value stored in a variable may be modified.
  • A variable is simply the name given to a memory location; all operations performed on the variable affect that memory location.

3)Initializing the value of the variable

There is no explicit statement to reserve memory space for Python variables. When you assign a value to a variable, the declaration occurs automatically. The equals sign (=) is used to assign values to variables.

The operand to the left of the = operator is the variable name, and the operand to its right is the value stored in the variable.

Examples:

A=100
b="Hello"
c=4.5
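
As a quick check of the assignments above, the built-in type() function reports the type Python inferred for each value. This is only a minimal sketch; the loop and the printed layout are illustrative.

```python
# The same three assignments as above; no type is declared anywhere.
A = 100
b = "Hello"
c = 4.5

# type() reports the type Python inferred from each assigned value.
for name, value in (("A", A), ("b", b), ("c", c)):
    print(name, "->", type(value).__name__)
```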

4)Memory and reference

A variable in Python resembles a tag or a reference that points to a memory object.

As an example,

k="BTechGeeks"

‘BTechGeeks’ is a string object in memory, and k is a reference or tag that points to that object.

5)Modifying the variable value

Let us try this:

p=4.5
p="Cirus"

Initially, p pointed to a float object, but now it points to a string object in memory. The variable’s type also changed; originally, it was a decimal (float), but when we assigned a string object to it, the type of p changed to str, i.e., a string.

If there is an object in memory but no variable pointing to it, the garbage collector can automatically free it. In the preceding example we made the variable p point to a string object, which left the float 4.5 in memory with no variable pointing to it. The object was then released by the garbage collector.
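
This effect can be observed with a small sketch. Floats cannot be weakly referenced, so the example assumes a hypothetical Payload class and uses the standard weakref module to watch the object. The immediate release shown here is how CPython's reference counting behaves; other interpreters may free the object later.

```python
import weakref


class Payload:
    """Hypothetical stand-in object; floats do not support weak references."""


p = Payload()
watcher = weakref.ref(p)   # observes the object without keeping it alive
print(watcher() is p)      # the object is still referenced by p

p = "Cirus"                # rebind p; nothing refers to the Payload object now
print(watcher())           # on CPython the object has already been freed
```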

6)Assigning one variable with another variable

We can assign one variable to another variable, like this:

p="BTechGeeks"
q=p

Both the p and q variables now point to the same string object, namely, ‘BTechGeeks.’

Below is the implementation:

p = "BTechGeeks"
# assign variable q with p
q = p
# print the values
print("The value of p :", p)
print("The value of q :", q)

Output:

The value of p : BTechGeeks
The value of q : BTechGeeks
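
To see that p and q really share one object rather than two equal copies, a short sketch can compare their identities with the is operator and id(). The second value "Python" is just an illustrative string.

```python
p = "BTechGeeks"
q = p

# "is" compares identity: both names refer to the very same object.
print(q is p)
print(id(p) == id(q))

# Rebinding p does not affect q, which still refers to the original string.
p = "Python"
print(q)
```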

7)The following are the rules for creating variables in Python

  • A variable name must begin with a letter or an underscore.
  • A number cannot be the first character in a variable name.
  • Variable names can only contain alphanumeric characters and underscores (A-Z, a-z, 0-9, and _ ).
  • Case matters when it comes to variable names (flag, Flag and FLAG are three different variables).
  • The reserved terms (keywords) are not permitted to be used in naming the variable.
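
These rules can be checked programmatically. The sketch below combines str.isidentifier() with the standard keyword module; the helper name is_valid_variable_name is hypothetical. Note that isidentifier() also accepts non-ASCII letters, which Python 3 permits in names.

```python
import keyword


def is_valid_variable_name(name):
    """Return True if name is a legal identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)


for candidate in ["flag", "_total", "2nd_value", "my-var", "class"]:
    print(candidate, "->", is_valid_variable_name(candidate))
```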

Python Data Persistence – Using range

Python’s built-in range ( ) function returns an immutable sequence of numbers that can be iterated over by a for loop. The sequence generated by the range ( ) function depends on three parameters: start, stop, and step.

The start and step parameters are optional. If they are omitted, start defaults to 0 and step defaults to 1. The range contains the numbers from start up to stop-1, separated by step. Consider example 2.15:

Example

range (10) generates 0 , 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9

range ( 1 , 5 ) results in 1 , 2 , 3 , 4

range ( 20 , 30 , 2 ) returns 20 , 22 , 24 , 26 , 28
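
The three calls above can be verified in the interpreter. Since a range object does not display its contents when printed, this sketch wraps each one in list(); the countdown example with a negative step is an extra illustration.

```python
# list() materialises a range so its contents can be inspected.
print(list(range(10)))          # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(list(range(1, 5)))        # [1, 2, 3, 4]
print(list(range(20, 30, 2)))   # [20, 22, 24, 26, 28]

# A negative step counts downwards; stop is still excluded.
countdown = list(range(5, 0, -1))
print(countdown)                # [5, 4, 3, 2, 1]
```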

We can use this range object as an iterable, as in example 2.16. It displays the squares of all odd numbers between 11 and 20. Remember that the last number in the range is always less than the stop parameter (and step is 1 by default).

Example

#for-3.py
for num in range(11, 21, 2):
    sqr = num * num
    print('square of {} is {}'.format(num, sqr))

Output:

E:\python37>python for-3.py 
square of 11 is 121 
square of 13 is 169 
square of 15 is 225 
square of 17 is 289 
square of 19 is 361

In the previous chapter, you used the len ( ) function, which returns the number of items in a sequence object. In the next example, we use len ( ) to construct a range of indices of the items in a list. We traverse the list with the help of the index.

Example

#for-4.py
numbers = [4, 7, 2, 5, 8]
for indx in range(len(numbers)):
    sqr = numbers[indx] * numbers[indx]
    print('square of {} is {}'.format(numbers[indx], sqr))

Output:

E:\python37>python for-4.py
square of 4 is 16
square of 7 is 49
square of 2 is 4
square of 5 is 25
square of 8 is 64

E:\python37>
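
As an aside, the same traversal can be sketched with the built-in enumerate() function, which yields the index and the item together. The squares list is an addition for illustration only.

```python
numbers = [4, 7, 2, 5, 8]

# enumerate() yields (index, item) pairs, so no explicit range(len(...)) is needed.
squares = []
for indx, num in enumerate(numbers):
    squares.append(num * num)
    print('index {}: square of {} is {}'.format(indx, num, num * num))
```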

Have a look at another example of using a for loop over a range. The following script calculates the factorial of a number. Note that the factorial of n (mathematical notation is n!) is the product of all integers from 1 to n.

Example

#factorial.py
n = int(input("enter number.."))
# calculating factorial of n
f = 1
for i in range(1, n + 1):
    f = f * i
print('factorial of {} = {}'.format(n, f))

Output:

E:\python37>python factorial.py 
enter number..5 
factorial of 5 = 120
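
One way to check the loop is to wrap it in a function and compare its results with math.factorial from the standard library. This is a sketch; the function name factorial and the sample inputs are chosen here for illustration.

```python
import math


def factorial(n):
    """Multiply the integers 1..n together, as in the loop above."""
    f = 1
    for i in range(1, n + 1):
        f = f * i
    return f


# range(1, 1) is empty, so factorial(0) correctly stays at 1.
for n in (0, 1, 5, 10):
    print(n, factorial(n), factorial(n) == math.factorial(n))
```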

How To Scrape LinkedIn Public Company Data – Beginners Guide

How To Scrape LinkedIn Public Company Data

Nowadays everybody is familiar with how big the LinkedIn community is. LinkedIn is one of the largest professional social networking sites in the world which holds a wealth of information about industry insights, data on professionals, and job data.

One way to collect this public data in bulk is through web scraping.

Why Scrape LinkedIn public data?

There are several reasons to scrape data from LinkedIn. The scraped data can be useful for a project you are working on, or for hiring: you can review the public profiles of many candidates and select the ones who best fit the company.

Scraping makes this task less time-consuming. It automates the search across a large number of records and collects the results in a single file.

Another benefit of scraping is automating a job search. Every job site lists thousands of openings of different kinds, which makes searching hectic for people who only want jobs in their own field. Scraping lets them apply filters and gather all the relevant information on a single page.

In this tutorial, we will be scraping the data from LinkedIn using Python.

Prerequisites:

In this tutorial, we will use basic Python programming as well as two Python packages: lxml and requests.

But first, you need to install the following things:

  1. Python accessible here (https://www.python.org/downloads/)
  2. Python requests accessible here(http://docs.python-requests.org/en/master/user/install/)
  3. Python LXML( Study how to install it here: http://lxml.de/installation.html)

Once the installation is done, we will write the Python code to extract LinkedIn public data from company pages.

The code below is written for Python 3.

import json
import re

import lxml.html
import requests

# Browser-like request headers
HEADERS = {'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
           'accept-encoding': 'gzip, deflate, sdch',
           'accept-language': 'en-US,en;q=0.8',
           'upgrade-insecure-requests': '1',
           'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.85 Safari/537.36'}

# Start with an empty output file; one JSON object is appended per line
with open('company_data.json', 'w'):
    pass

COUNT = 0


def increment():
    # Count the companies written so far
    global COUNT
    COUNT = COUNT + 1


def fetch_request(url):
    # Try the request up to three times; return None if every attempt fails
    for _ in range(3):
        try:
            return requests.get(url, headers=HEADERS)
        except requests.RequestException:
            continue
    return None


def parse_company_urls(company_url):
    # Follow directory pages recursively until company pages are reached
    if company_url:
        if '/company/' in company_url:
            parse_company_data(company_url)
        else:
            parent_url = company_url
            fetch_company_url = fetch_request(company_url)
            if fetch_company_url:
                sel = lxml.html.fromstring(fetch_company_url.content)
                COMPANIES_XPATH = '//div[@class="section last"]/div/ul/li/a/@href'
                companies_urls = sel.xpath(COMPANIES_XPATH)
                if companies_urls:
                    if '/company/' in companies_urls[0]:
                        print('Parsing From Category ', parent_url)
                        print('-------------------------------------------------------------------------------------')
                    for company_url in companies_urls:
                        parse_company_urls(company_url)


def parse_company_data(company_data_url):
    # Extract name, followers, industry and logo from one company page
    if company_data_url:
        fetch_company_data = fetch_request(company_data_url)
        if fetch_company_data is not None and fetch_company_data.status_code == 200:
            try:
                source = fetch_company_data.content.decode('utf-8')
                sel = lxml.html.fromstring(source)
                # The code element is expected to hold JSON wrapped in an HTML comment
                code_text = sel.get_element_by_id(
                    'stream-promo-top-bar-embed-id-content')
                if len(code_text) > 0:
                    code_text = str(code_text[0])
                    code_text = re.findall(r'<!--(.*)-->', code_text)
                    code_text = code_text[0].strip() if code_text else '{}'
                    json_data = json.loads(code_text)
                    if json_data.get('squareLogo', ''):
                        company_pic = 'https://media.licdn.com/mpr/mpr/shrink_200_200' + \
                                      json_data.get('squareLogo', '')
                    elif json_data.get('legacyLogo', ''):
                        company_pic = 'https://media.licdn.com/media' + \
                                      json_data.get('legacyLogo', '')
                    else:
                        company_pic = ''
                    company_name = json_data.get('companyName', '')
                    followers = str(json_data.get('followerCount', ''))

                    # The "about" section is embedded in the same way
                    code_text = sel.get_element_by_id(
                        'stream-about-section-embed-id-content')
                    if len(code_text) > 0:
                        code_text = str(code_text[0])
                        code_text = re.findall(r'<!--(.*)-->', code_text)
                        code_text = code_text[0].strip() if code_text else '{}'
                        json_data = json.loads(code_text)
                        company_industry = json_data.get('industry', '')
                        item = {'company_name': company_name,
                                'followers': followers,
                                'company_industry': company_industry,
                                'logo_url': company_pic,
                                'url': company_data_url}
                        increment()
                        print(item)
                        with open('company_data.json', 'a') as file:
                            file.write(json.dumps(item) + '\n')
            except Exception:
                # Skip pages whose markup or embedded JSON is not as expected
                pass


fetch_company_dir = fetch_request('https://www.linkedin.com/directory/companies/')
if fetch_company_dir:
    print('Starting Company Url Scraping')
    print('-----------------------------')
    sel = lxml.html.fromstring(fetch_company_dir.content)
    SUB_PAGES_XPATH = '//div[@class="bucket-list-container"]/ol/li/a/@href'
    sub_pages = sel.xpath(SUB_PAGES_XPATH)
    print('Company Category URL list')
    print('--------------------------')
    print(sub_pages)
    if sub_pages:
        for sub_page in sub_pages:
            parse_company_urls(sub_page)

How to Code a Scraping Bot with Selenium and Python

Selenium is a powerful tool for controlling web browsers programmatically and automating browser tasks. It is also used in Python for scraping data, and it is especially useful when we need to interact with a page before collecting the data, which is the case we discuss in this article.

In this article, we will scrape investing.com to extract the historical exchange rates of the dollar against one or more currencies.

There are other tools in python by which we can extract the financial information. However, here we want to explore how selenium helps with data extraction.

The Website we are going to Scrape:

Understanding the website is the first step before moving on.

The website contains historical data for the exchange rate of the dollar against the euro.

On this page, we will find a table for which we can set the date range we want.

That is what we will be using.

We only want the exchange rates of currencies against the dollar. If that’s not the case for you, replace the “usd” in the URL.
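
Swapping the currency codes in the URL can be sketched with a small helper. The pattern below is taken from the page address used in this article; the helper name historical_data_url is hypothetical, and whether a page exists for any other pair (such as the gbp/jpy example) is an assumption.

```python
# URL pattern taken from the page address used in this article.
URL_TEMPLATE = 'https://www.investing.com/currencies/{base}-{quote}-historical-data'


def historical_data_url(quote, base='usd'):
    """Build the historical-data page URL for one currency pair (hypothetical helper)."""
    return URL_TEMPLATE.format(base=base.lower(), quote=quote.lower())


print(historical_data_url('EUR'))
# Replacing "usd": assumes the site uses the same pattern for other base currencies.
print(historical_data_url('JPY', base='gbp'))
```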

The Scraper’s Code:

The first step is the imports: the Selenium classes, the sleep function to pause the code for some time, and pandas to manipulate the data whenever necessary.

Now, we will write the scraping function. The function will take:

  • A list of currency codes.
  • A start date.
  • An end date.
  • A boolean flag that controls whether the data is exported to a .csv file. We will use False as the default.

We want to make a scraper that scrapes the data for multiple currencies. We also have to initialise an empty list to store the scraped data.

The function receives the list of currencies, and our plan is to iterate over this list and get the data for each one.

For each currency we will create a URL, instantiate the driver object, and use it to get the page.

Then the browser window is maximized, but this is only visible when option.headless is set to False.

Otherwise, Selenium does all the work without showing the browser.

Now, we want to get the data for any time period.

Selenium provides some useful functionality for interacting with the website.

We will click on the date button, fill in the start and end dates we want, and then hit apply.

We will use WebDriverWait, expected_conditions, and By to make sure that the driver waits for the elements we want to interact with.

The waiting time is 20 seconds, but it is up to you how long you want to set it.

We have to select the date button by its XPath.

The same process is followed for start_bar, end_bar, and apply_button.

The start_bar field takes the date from which we want the data.

The end_bar field takes the date up to which we want the data.

When both are filled in, we click apply_button.

Now, we will use the pandas.read_html function to read all the tables from the page. We pass it the source code of the page, and then finally we quit the driver.

How to handle Exceptions In Selenium:

The data collection process is done. But Selenium is sometimes a little unstable and may fail to perform the actions described here.

To handle this, we put the code in a try and except block so that every time it runs into a problem the except block is executed.

So, the code will look like this:

for currency in currencies:
    while True:
        driver = None
        try:
            # Opening the connection and grabbing the page
            my_url = f'https://br.investing.com/currencies/usd-{currency.lower()}-historical-data'
            option = Options()
            option.headless = False
            driver = webdriver.Chrome(options=option)
            driver.get(my_url)
            driver.maximize_window()

            # Clicking on the date button
            date_button = WebDriverWait(driver, 20).until(
                EC.element_to_be_clickable((By.XPATH,
                "/html/body/div[5]/section/div[8]/div[3]/div/div[2]/span")))
            date_button.click()

            # Sending the start date
            start_bar = WebDriverWait(driver, 20).until(
                EC.element_to_be_clickable((By.XPATH,
                "/html/body/div[7]/div[1]/input[1]")))
            start_bar.clear()
            start_bar.send_keys(start)

            # Sending the end date
            end_bar = WebDriverWait(driver, 20).until(
                EC.element_to_be_clickable((By.XPATH,
                "/html/body/div[7]/div[1]/input[2]")))
            end_bar.clear()
            end_bar.send_keys(end)

            # Clicking on the apply button
            apply_button = WebDriverWait(driver, 20).until(
                EC.element_to_be_clickable((By.XPATH,
                "/html/body/div[7]/div[5]/a")))
            apply_button.click()
            sleep(5)

            # Getting the tables on the page and quitting
            dataframes = pd.read_html(driver.page_source)
            driver.quit()
            print(f'{currency} scraped.')
            break

        except Exception:
            # Close the browser if it was opened, then wait and try again
            if driver is not None:
                driver.quit()
            print(f'Failed to scrape {currency}. Trying again in 30 seconds.')
            sleep(30)
            continue
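
The loop above keeps retrying until the page is scraped, so a page that never loads would block it forever. A bounded variant could look like the sketch below. The retry helper and the flaky stand-in action are hypothetical and use only the standard library; in the scraper, the action would be a function containing the Selenium steps.

```python
import time


def retry(action, attempts=3, delay=30, sleep=time.sleep):
    """Call action() until it succeeds or the attempts run out (hypothetical helper).

    Returns the result of action(); re-raises the last exception on final failure.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as error:
            if attempt == attempts:
                raise
            print(f'Attempt {attempt} failed ({error}). Trying again in {delay} seconds.')
            sleep(delay)


# Usage sketch with a stand-in action that fails twice before succeeding.
calls = []


def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError('page not ready')
    return 'scraped'


result = retry(flaky, attempts=5, delay=0)
print(result, 'after', len(calls), 'calls')
```

Capping the number of attempts trades completeness for a guaranteed end: a currency that keeps failing raises an error instead of stalling the whole run.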

For each DataFrame in the dataframes list, we check whether its name matches, and if it does we append it to the list we initialised at the beginning.

Then we export a .csv file if it was requested. This is the last step of the extraction.

Wrapping up:

This is all about extracting the data from the website. So far, this code gets the historical exchange rates of a list of currencies against the dollar and returns a list of DataFrames and several .csv files.

https://www.investing.com/currencies/usd-eur-historical-data

How to web scrape with Python in 4 minutes

Web Scraping:

Web scraping is used to extract data from a website, and it can save time as well as effort. In this article, we will be extracting hundreds of files from the New York MTA. Some people find web scraping tough, but that need not be the case: this article breaks the process into easier steps to get you comfortable with web scraping.

New York MTA Data:

We will download the data from the below website:

http://web.mta.info/developers/turnstile.html

Turnstile data is compiled every week from May 2010 till now, so there are many files that exist on this site. For instance, below is an example of what data looks like.

You can right-click on the link and can save it to your desktop. That is web scraping!

Important Notes about Web scraping:

  1. Read through the website’s Terms and Conditions to understand how you can legally use the data. Most sites prohibit you from using the data for commercial purposes.
  2. Make sure you are not downloading data at too rapid a rate because this may break the website. You may potentially be blocked from the site as well.

Inspecting the website:

The first thing we should find out is which HTML tag contains the information we want to scrape. The page contains a lot of code and many HTML tags, so we have to identify the one we want and reference it in our code.

When you are on the website, right-click and choose “Inspect” from the menu. This shows the code behind the page.

You can see the arrow symbol at the top of the console. 

If you click on the arrow and then click any text or item on the page, the corresponding tag is highlighted in the console.

I clicked on the Saturday, September 22, 2018 file and the console highlighted the following part in blue.

<a href="data/nyct/turnstile/turnstile_180922.txt">Saturday, September 22, 2018</a>

You will see that all the .txt files come in <a> tags. <a> tags are used for hyperlinks.
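
To make the idea concrete without any third-party package, the sketch below pulls the href out of that anchor tag using the standard library's html.parser module. This is a swapped-in technique for illustration, and the class name LinkCollector is hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            href = dict(attrs).get('href')
            if href:
                self.links.append(href)


sample = '<a href="data/nyct/turnstile/turnstile_180922.txt">Saturday, September 22, 2018</a>'
collector = LinkCollector()
collector.feed(sample)
print(collector.links)
```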

Now that we have the location, we will proceed to the coding!

Python Code:

The first and foremost step is importing the libraries:

import requests

import urllib.request

import time

from bs4 import BeautifulSoup

Now we have to set the url and access the website:

url = 'http://web.mta.info/developers/turnstile.html'

response = requests.get(url)

Now, we can use the features of beautiful soup for scraping.

soup = BeautifulSoup(response.text, "html.parser")

We will use the method findAll to get all the <a> tags.

soup.findAll('a')

This function will give us all the <a> tags.

Now, we will extract the actual link that we want.

one_a_tag = soup.findAll('a')[38]

link = one_a_tag['href']

This code saves the link to the first .txt file in our variable link.

download_url = 'http://web.mta.info/developers/'+ link

urllib.request.urlretrieve(download_url,'./'+link[link.find('/turnstile_')+1:])

For pausing our code we will use the sleep function.

time.sleep(1)

To download all the data, we apply the same steps in a for loop over every link.
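
A sketch of that loop is shown below. It assumes the hrefs have already been collected from the parsed page; the helper names and the filter on '/turnstile_' are assumptions made for this example, not the article's original code. Only the standard library is used, so the helpers can be tried offline with the hard-coded sample.

```python
import time
import urllib.request

BASE_URL = 'http://web.mta.info/developers/'


def turnstile_links(hrefs):
    """Keep only the links that point at turnstile .txt files."""
    return [h for h in hrefs if '/turnstile_' in h and h.endswith('.txt')]


def local_filename(link):
    """Same slicing as above: keep the part of the link starting at 'turnstile_'."""
    return './' + link[link.find('/turnstile_') + 1:]


def download_all(hrefs, pause=1):
    """Download every turnstile file, pausing between requests."""
    for link in turnstile_links(hrefs):
        urllib.request.urlretrieve(BASE_URL + link, local_filename(link))
        time.sleep(pause)  # avoid requesting files too rapidly


# hrefs would normally come from the parsed page, for example:
#     hrefs = [a['href'] for a in soup.findAll('a', href=True)]
# A hard-coded sample is used here so the helpers can be tried offline.
sample_hrefs = ['index.html', 'data/nyct/turnstile/turnstile_180922.txt']
print(turnstile_links(sample_hrefs))
print(local_filename(sample_hrefs[1]))
```

Calling download_all(hrefs) with the real list would then fetch each file in turn, one second apart by default.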

I hope you understood the concept of web scraping.

Enjoy reading and have fun while scraping!

An Intro to Web Scraping with lxml and Python:

Sometimes the data we want is not available through an API. In the absence of an API, the only choice left is to build a web scraper. The task of the scraper is to collect all the information we want easily and in very little time.

The example of a typical API response in JSON. This is the response from Reddit.

 There are various kinds of python libraries that help in web scraping namely scrapy, lxml, and beautiful soup.

Many articles explain how to use beautiful soup and scrapy but I will be focusing on lxml. I will teach you how to use XPaths and how to use them to extract data from HTML documents.

Getting the data:

If you are into gaming, then you must be familiar with the Steam website.

We will be extracting the data from the “popular new releases” tab.

Now, right-click on the website and you will see the inspect option. Click on it and select the HTML tag.

We want the anchor tags because every item in the list is wrapped in an <a> tag.

The anchor tags lie inside the div tag with an id of tab_newreleases_content. We mention the id because there are two tabs on this page and we only want the information from the popular new releases tab.

Now, create your python file and start coding. You can name the file according to your preference. Start importing the below libraries:

import requests 

import lxml.html

If you don’t have requests installed, type the command below in your terminal:

$ pip install requests

Requests module helps us open the webpage in python.

Extracting and processing the information:

Now, let’s open the web page using the requests and pass that response to lxml.html.fromstring.

html = requests.get('https://store.steampowered.com/explore/new/') 

doc = lxml.html.fromstring(html.content)

This provides us with a structured way to extract information from an HTML document. Now we will write an XPath expression to extract the div which contains the “popular new releases” tab.

new_releases = doc.xpath('//div[@id="tab_newreleases_content"]')[0]

We are taking only one element ([0]) and that would be our required div. Let us break down the path and understand it.

  • // tells lxml that we want to search the whole HTML document for tags which match our requirements.
  • div tells lxml that we want to find div tags.
  • [@id="tab_newreleases_content"] tells lxml that we are only interested in div tags whose id is tab_newreleases_content.
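
The same pattern can be tried offline on a small snippet. The sketch below uses the standard library's xml.etree.ElementTree instead of lxml, since it supports this subset of XPath on well-formed markup; the snippet and the game names are made up. ElementTree needs the relative form .// when searching from an element, and it reads text through the .text attribute rather than a text() step.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed snippet that mimics the structure described above.
snippet = """
<html><body>
  <div id="tab_topsellers_content"><div class="tab_item_name">Other Game</div></div>
  <div id="tab_newreleases_content">
    <div class="tab_item_name">Example Game A</div>
    <div class="tab_item_name">Example Game B</div>
  </div>
</body></html>
"""

root = ET.fromstring(snippet)

# .// searches all descendants, div selects the tag, [@id="..."] filters on the attribute.
new_releases = root.findall('.//div[@id="tab_newreleases_content"]')[0]
titles = [div.text for div in new_releases.findall('.//div[@class="tab_item_name"]')]
print(titles)
```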

Awesome! Now that we understand what it means, let’s go back to inspect and check under which tag the title lies.

The title lies in a div tag with the class tab_item_name. Now we will run an XPath query to get the titles.

titles = new_releases.xpath('.//div[@class="tab_item_name"]/text()')







This gives us the names of the popular new releases. Now, we will extract the prices by writing the following code:

prices = new_releases.xpath('.//div[@class="discount_final_price"]/text()')

Now, we can see that the prices are also scraped. We will extract the tags by writing the following command:

tags = new_releases.xpath('.//div[@class="tab_item_top_tags"]')
total_tags = []
for tag in tags:
    total_tags.append(tag.text_content())

We are extracting the divs containing the tags for each game. Then we loop over the list of extracted divs and collect the text of each one using the tag.text_content method.

Now, the only thing remaining is to extract the platforms associated with each title. Here is the HTML markup:

The major difference here is that platforms are not contained as text within a specific tag. They are listed as class names. Some titles only have one platform associated with them:

 

<span class="platform_img win"></span>

 

While others have 5 platforms like this:

 

<span class="platform_img win"></span><span class="platform_img mac"></span><span class="platform_img linux"></span><span class="platform_img hmd_separator"></span> <span title="HTC Vive" class="platform_img htcvive"></span> <span title="Oculus Rift" class="platform_img oculusrift"></span>

The span tag contains platform types as the class name. The only thing common between them is they all contain platform_img class.

First of all, we have to extract the div tags containing the tab_item_details class. Then we will extract the span containing the platform_img class. Lastly, we will extract the second class name from those spans. Refer to the below code:

platforms_div = new_releases.xpath('.//div[@class="tab_item_details"]')
total_platforms = []

for game in platforms_div:
    temp = game.xpath('.//span[contains(@class, "platform_img")]')
    platforms = [t.get('class').split(' ')[-1] for t in temp]
    if 'hmd_separator' in platforms:
        platforms.remove('hmd_separator')
    total_platforms.append(platforms)

Now we just need to collect this into a structure that can be returned as a JSON response, so that we can easily turn it into a Flask-based API.

output = []
# total_tags holds the tag text; tags holds the raw elements
for info in zip(titles, prices, total_tags, total_platforms):
    resp = {}
    resp['title'] = info[0]
    resp['price'] = info[1]
    resp['tags'] = info[2]
    resp['platforms'] = info[3]
    output.append(resp)

We are using the zip function to loop over all of the lists in parallel. Then we create a dictionary for each game with the title, price, tags, and platforms as keys.
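
The zip-and-dictionary step can be tried on its own with sample values. In the sketch below all four lists are made-up stand-ins for the scraped data, and json.dumps from the standard library shows what the resulting JSON would look like.

```python
import json

# Made-up sample values standing in for the lists scraped above.
titles = ['Example Game A', 'Example Game B']
prices = ['$9.99', 'Free to Play']
total_tags = ['Action, Indie', 'Strategy, Simulation']
total_platforms = [['win'], ['win', 'mac', 'linux']]

output = []
for title, price, tags, platforms in zip(titles, prices, total_tags, total_platforms):
    output.append({'title': title, 'price': price, 'tags': tags, 'platforms': platforms})

# json.dumps turns the list of dictionaries into a JSON string.
print(json.dumps(output, indent=2))
```

Note that zip stops at the shortest list, so a game with a missing price or tag block would shift or drop entries; the lists need to line up one-to-one.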

Wrapping up:

I hope this article is understandable and you find the coding easy.

Enjoy reading!