How to select rows and columns by Name or Index in a DataFrame using loc and iloc in Python?
We will discuss several methods to select rows and columns in a dataframe. To select rows or columns we can use loc[ ], iloc[ ], or the [ ] operator.
To demonstrate the various methods we will be using the following dataset:
      Name  Score      City
0     Jill   16.0     Tokyo
1   Rachel   38.0     Texas
2    Kirti   39.0  New York
3    Veena   40.0     Texas
4  Lucifer    NaN     Texas
5    Pablo   30.0  New York
6   Lionel   45.0  Colombia
Method-1 : DataFrame.loc | Select Column & Rows by Name
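We can use the loc[ ] indexer to select rows and columns by their labels. Note that loc is used with square brackets; it is not called like a function.
Syntax :
dataFrame.loc[<ROWS RANGE> , <COLUMNS RANGE>]
For the rows and for the columns we can pass a single label, a list of labels, or a range of labels, and loc will select what we specified. Unlike ordinary Python slices, a label range such as 'b':'d' includes both end points.
If we pass ‘:’ in place of either value, it will select all the rows or all the columns.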
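As a quick sketch of this syntax, the snippet below uses a small three-row dataframe invented only for this illustration (the name df and its contents are assumptions, not the tutorial's dataset). It shows ‘:’ selecting every row and a label range selecting a block of rows.

```python
import pandas as pd

# Hypothetical three-row dataframe, used only for this illustration
df = pd.DataFrame(
    {'Name': ['Jill', 'Rachel', 'Kirti'], 'Score': [16, 38, 39]},
    index=['a', 'b', 'c'])

# ':' on the row side keeps every row; only the 'Score' column is selected
all_scores = df.loc[:, 'Score']
print(all_scores)

# A label range on the row side; with loc both 'a' and 'b' are included
first_two = df.loc['a':'b', :]
print(first_two)
```

Selecting a single column returns a Series, while selecting a range of rows returns a DataFrame.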
Select a Column by Name in DataFrame using loc[ ] :
As we need to select a single column only, we pass ‘:’ in place of the row range.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'])

#Selecting the 'Score' column
columnD = dfObj.loc[:,'Score']
print(columnD)

Output :
0    16.0
1    38.0
2    39.0
3    40.0
4     NaN
5    30.0
6    45.0
Name: Score, dtype: float64
Select multiple Columns by Name in DataFrame using loc[ ] :
To select multiple columns, we pass the column names as a list to loc[ ].
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Selecting multiple columns i.e. the 'Name' and 'Score' columns
columnD = dfObj.loc[:,['Name','Score']]
print(columnD)

Output :
      Name  Score
a     Jill   16.0
b   Rachel   38.0
c    Kirti   39.0
d    Veena   40.0
e  Lucifer    NaN
f    Pablo   30.0
g   Lionel   45.0
Select a single row by Index Label in DataFrame using loc[ ] :
Just like a column, we can also select a single row by passing its index label and passing ‘:’ in place of the column range.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Selecting a single row i.e. the 'b' row
selectData = dfObj.loc['b',:]
print(selectData)

Output :
Name     Rachel
Score      38.0
City      Texas
Name: b, dtype: object
Select multiple rows by Index labels in DataFrame using loc[ ] :
To select multiple rows we pass their index labels as a list to loc[ ].
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Selecting multiple rows i.e. 'd' and 'g'
selectData = dfObj.loc[['d','g'],:]
print(selectData)

Output :
     Name  Score      City
d   Veena   40.0     Texas
g  Lionel   45.0  Colombia
Select multiple row & columns by Labels in DataFrame using loc[ ] :
To select multiple rows and columns we pass the list of row labels and the list of column names we want to select to loc[ ].
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Selecting the 'd' and 'g' rows and the 'Name' and 'City' columns
selectData = dfObj.loc[['d','g'],['Name','City']]
print(selectData)

Output :
     Name      City
d   Veena     Texas
g  Lionel  Colombia
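The row and column selectors of loc[ ] can also be mixed freely. The sketch below is only an illustration on a small made-up dataframe (df and its values are assumptions): passing one row label and one column name returns a single value, and a list of rows can be combined with a range of column names.

```python
import pandas as pd

# Hypothetical dataframe, used only for this illustration
df = pd.DataFrame(
    {'Name': ['Jill', 'Rachel', 'Kirti'],
     'Score': [16, 38, 39],
     'City': ['Tokyo', 'Texas', 'New York']},
    index=['a', 'b', 'c'])

# One row label and one column name give back a single value
cell = df.loc['b', 'Score']
print(cell)

# A list of row labels combined with a range of column names;
# the column range 'Name':'Score' includes both 'Name' and 'Score'
block = df.loc[['a', 'c'], 'Name':'Score']
print(block)
```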
Method-2 : DataFrame.iloc | Select Column Indexes & Rows Index Positions
We can use the iloc[ ] indexer to select rows and columns. It is quite similar to loc[ ], but it takes integer positions instead of labels.
Syntax :
dataFrame.iloc[<ROWS INDEX RANGE> , <COLUMNS INDEX RANGE>]
iloc[ ] selects rows and columns in the dataframe by the index positions we pass to it. Positions start at 0, and the end position of a range is not included. Just like with loc[ ], if ‘:’ is passed, all the rows/columns are selected.
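One difference between the two indexers is worth seeing side by side. The following is a minimal sketch on a made-up four-row dataframe (df and its values are assumptions for illustration): a position range in iloc[ ] stops before its end position, while a label range in loc[ ] includes its end label.

```python
import pandas as pd

# Hypothetical dataframe, used only for this illustration
df = pd.DataFrame(
    {'Name': ['Jill', 'Rachel', 'Kirti', 'Veena'],
     'Score': [16, 38, 39, 40]},
    index=['a', 'b', 'c', 'd'])

# Positions 1 and 2 only: the end position 3 is excluded
by_position = df.iloc[1:3, :]
print(by_position)

# Labels 'b', 'c' and 'd': the end label is included
by_label = df.loc['b':'d', :]
print(by_label)

# Negative positions count from the end, as in Python lists
last_row = df.iloc[-1, :]
print(last_row)
```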
Select a single column by Index position :
We pass the index position of the column, with ‘:’ in place of the row index.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Selecting a single column at index position 2
selectData = dfObj.iloc[:,2]
print(selectData)

Output :
a       Tokyo
b       Texas
c    New York
d       Texas
e       Texas
f    New York
g    Colombia
Name: City, dtype: object
Select multiple columns by Indices in a list :
To select multiple columns by their index positions we pass the positions as a list in place of the column value.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Selecting multiple columns at index positions 0 & 2
selectData = dfObj.iloc[:,[0,2]]
print(selectData)

Output :
      Name      City
a     Jill     Tokyo
b   Rachel     Texas
c    Kirti  New York
d    Veena     Texas
e  Lucifer     Texas
f    Pablo  New York
g   Lionel  Colombia
Select multiple columns by Index range :
To select multiple columns by index range we pass the start and end positions separated by a ‘:’ in place of the column value. The end position is not included, so 1:3 selects the columns at positions 1 and 2.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Selecting the columns at positions 1 and 2 (the end position 3 is excluded)
selectData = dfObj.iloc[:,1:3]
print(selectData)

Output :
   Score      City
a   16.0     Tokyo
b   38.0     Texas
c   39.0  New York
d   40.0     Texas
e    NaN     Texas
f   30.0  New York
g   45.0  Colombia
Select single row by Index Position :
Just like with columns, we can pass an index position to select a row.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Selecting a single row at index position 2
selectData = dfObj.iloc[2,:]
print(selectData)

Output :
Name        Kirti
Score        39.0
City     New York
Name: c, dtype: object
Select multiple rows by Index positions in a list :
To do this we pass a list of the index positions we want to select to iloc[ ].
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Selecting multiple rows by passing a list i.e. positions 2 & 5
selectData = dfObj.iloc[[2,5],:]
print(selectData)

Output :
    Name  Score      City
c  Kirti   39.0  New York
f  Pablo   30.0  New York
Select multiple rows by Index range :
To select a range of rows we pass the start and end positions separated by a ‘:’ to iloc[ ]. The end position is not included, so 2:5 selects the rows at positions 2, 3 and 4.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Selecting the rows at positions 2 to 4 (the end position 5 is excluded)
selectData = dfObj.iloc[2:5,:]
print(selectData)

Output :
      Name  Score      City
c    Kirti   39.0  New York
d    Veena   40.0     Texas
e  Lucifer    NaN     Texas
Select multiple rows & columns by Index positions :
To select multiple rows and columns at once, we pass the row positions and the column positions together to iloc[ ].
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Selecting the rows at positions 1 & 2 and the columns at positions 1 & 2
selectData = dfObj.iloc[[1,2],[1,2]]
print(selectData)

Output :
   Score      City
b   38.0     Texas
c   39.0  New York
Method-3 : Selecting Columns in DataFrame using [ ] operator
The [ ] operator selects columns according to the name provided to it. However, when a non-existent column name is passed to it, it raises a KeyError.
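A minimal sketch of that KeyError, and of two ways to avoid it, is shown below. The dataframe df and the missing column name 'Age' are made up for this illustration only.

```python
import pandas as pd

# Hypothetical dataframe, used only for this illustration
df = pd.DataFrame({'Name': ['Jill', 'Rachel'], 'Score': [16, 38]},
                  index=['a', 'b'])

# Asking for a column that does not exist raises a KeyError
try:
    df['Age']
    raised = False
except KeyError as err:
    raised = True
    print('Column not found:', err)

# Option 1: check the column names before selecting
has_age = 'Age' in df.columns

# Option 2: DataFrame.get() returns a default (None here) instead of raising
age_or_none = df.get('Age')
scores = df.get('Score')
```

Checking df.columns first keeps the intent explicit, while get() is handy when a missing column is an expected case.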
Select a Column by Name :
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Selecting a single column by name using [ ]
selectData = dfObj['Name']
print(selectData)

Output :
a       Jill
b     Rachel
c      Kirti
d      Veena
e    Lucifer
f      Pablo
g     Lionel
Name: Name, dtype: object
Select multiple columns by Name :
To select multiple columns we just pass a list of their names into [ ].
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Selecting multiple columns by name using [ ]
selectData = dfObj[['Name','City']]
print(selectData)

Output :
      Name      City
a     Jill     Tokyo
b   Rachel     Texas
c    Kirti  New York
d    Veena     Texas
e  Lucifer     Texas
f    Pablo  New York
g   Lionel  Colombia