About Pandas DataFrame
Pandas select rows by multiple conditions: It is a 2 dimensional data structure, like a 2 dimensional array, or a table with rows and columns.
This article is all about showing different ways to select rows in DataFrame based on condition on single or multiple columns.
import pandas as pd students = [ ('Shyam', 'books' , 24) , ('ankur', 'pencil' , 28) , ('Rekha', 'pen' , 30) , ('Sarika', 'books', 62) , ('Lata', 'file' , 33) , ('Mayank', 'pencil' , 30) ] dataframeobj = pd.DataFrame(students, columns = ['Name' , 'Product', 'Sale']) print(dataframeobj)
Output will be:
RESTART: C:/Users/HP/Desktop/dataframe.py Name Product Sale 0 Shyam books 24 1 Ankur pencil 28 2 Rekha pen 30 3 Sarika books 62 4 Lata file 33 5 Mayank pencil 30
- Select Rows & Columns by Name or Index in DataFrame using loc & iloc | Python Pandas
- Get Rows And Columns Names In Dataframe Using Python
- Python Pandas : Drop columns in DataFrame by label Names or by Index Positions
Select Rows based on value in column
Pandas select rows multiple conditions: Let’s see how to Select rows based on some conditions in DataFrame.
Select rows in above example for which ‘Product’ column contains the value ‘books’,
import pandas as pd students = [ ('Shyam', 'books' , 24) , ('ankur', 'pencil' , 28) , ('Rekha', 'pen' , 30) , ('Sarika', 'books', 62) , ('Lata', 'file' , 33) , ('Mayank', 'pencil' , 30) ] dataframeobj = pd.DataFrame(students, columns = ['Name' , 'Product', 'Sale']) subsetDataFrame = dataframeobj[dataframeobj['Product'] == 'books'] print(subsetDataFrame)
Output:
RESTART: C:/Users/HP/Desktop/dataframe.py Name Product Sale 0 Shyam books 24 3 Sarika books 62
In above example we have seen that subsetDataFrame = dataframeobj[dataframeobj['Product'] == 'books']
using this it will return column which have ‘Product’ contains ‘Books’ only.
So,if we want to see whole functionality?See below.
When we apply [dataframeobj['Product'] == 'books']
this condition,it will give output in true & false form.
0 True 1 False 2 False 3 True 4 False 5 False Name: Product, dtype: bool
It will give true when the condition matches otherwise false.
If we pass this series object to [] operator of DataFrame, then it will be return a new DataFrame with only those rows that has True in the passed Series object i.e.
RESTART: C:/Users/HP/Desktop/dataframe.py
Name Product Sale 0 Shyam books 24 3 Sarika books 62
If we select any other product name it will return value accordingly.
Select Rows based on any of the multiple values in column
Python dataframe select rows by condition: Select rows from above example for which ‘Product‘ column contains either ‘Pen‘ or ‘Pencil‘ i.e
import pandas as pd students = [ ('Shyam', 'books' , 24) , ('ankur', 'pencil' , 28) , ('Rekha', 'pen' , 30) , ('Sarika', 'books', 62) , ('Lata', 'file' , 33) , ('Mayank', 'pencil' , 30) ] dataframeobj = pd.DataFrame(students, columns = ['Name' , 'Product', 'Sale']) subsetDataFrame = dataframeobj[dataframeobj['Product'].isin(['pen', 'pencil']) ] print(subsetDataFrame)
We have given product name list by isin() function and it will return true if condition will match otherwise false.
Therefore, it will return a DataFrame in which Column ‘Product‘ contains either ‘Pen‘ or ‘Pencil‘ only i.e.
Output:
RESTART: C:/Users/HP/Desktop/dataframe.py Name Product Sale 1 ankur pencil 28 2 Rekha pen 30 5 Mayank pencil 30
Select DataFrame Rows Based on multiple conditions on columns
In this method we are going to select rows in above example for which ‘Sale’ column contains value greater than 20 & less than 33.So for this we are going to give some condition.
import pandas as pd students = [ ('Shyam', 'books' , 24) , ('ankur', 'pencil' , 28) , ('Rekha', 'pen' , 30) , ('Sarika', 'books', 62) , ('Lata', 'file' , 33) , ('Mayank', 'pencil' , 30) ] dataframeobj = pd.DataFrame(students, columns = ['Name' , 'Product', 'Sale']) filterinfDataframe = dataframeobj[(dataframeobj['Sale'] > 20) & (dataframeobj['Sale'] < 33) ] print(filterinfDataframe)
It will return following DataFrame object in which Sales column contains value between 20 to 33,
RESTART: C:/Users/HP/Desktop/dataframe.py Name Product Sale 0 Shyam books 24 1 ankur pencil 28 2 Rekha pen 30 5 Mayank pencil 30
Conclusion:
In this article we have seen diferent methods to select rows in dataframe by giving some condition.Hope you find this informative.