Pandas: Loop or Iterate over all or certain columns of a dataframe

In this article, we will discuss how to loop or iterate over all or certain columns of a DataFrame. Along the way, we will also look at what a dataframe is and how to iterate over its columns, with an explanation and example code for each approach.

About DataFrame

A Pandas DataFrame is a 2-dimensional data structure, like a 2-dimensional array, or a table with rows and columns.

First, we are going to create a dataframe that we will use throughout this article.

import pandas as pd

employees = [('Abhishek', 34, 'Sydney'),
             ('Sumit', 31, 'Delhi'),
             ('Sampad', 16, 'New York'),
             ('Shikha', 32, 'Delhi')]

# Load the data into a DataFrame object
df = pd.DataFrame(employees, columns=['Name', 'Age', 'City'], index=['a', 'b', 'c', 'd'])

print(df)

Output:

       Name  Age      City
a  Abhishek   34    Sydney
b     Sumit   31     Delhi
c    Sampad   16  New York
d    Shikha   32     Delhi
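
As a quick check of the "rows and columns" idea, the sketch below inspects the shape and the labels of the same dataframe. It only uses the attributes shape, columns and index, and it rebuilds the employee data so that it can be run on its own.

```python
import pandas as pd

employees = [('Abhishek', 34, 'Sydney'),
             ('Sumit', 31, 'Delhi'),
             ('Sampad', 16, 'New York'),
             ('Shikha', 32, 'Delhi')]
df = pd.DataFrame(employees, columns=['Name', 'Age', 'City'], index=['a', 'b', 'c', 'd'])

# shape is a (number of rows, number of columns) tuple
n_rows, n_cols = df.shape
print('Rows :', n_rows)
print('Columns :', n_cols)

# columns holds the column labels, index holds the row labels
column_labels = list(df.columns)
row_labels = list(df.index)
print('Column labels :', column_labels)
print('Row labels :', row_labels)
```

The column labels are what the loops in the following sections iterate over.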

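Using DataFrame.items() (formerly iteritems())

The DataFrame class provides a member function items() that we can use to iterate over the columns of a dataframe.

Older tutorials use DataFrame.iteritems() for this. That method was deprecated in pandas 1.5 and removed in pandas 2.0, so on current versions use items(), which behaves the same way.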

import pandas as pd

employees = [('Abhishek', 34, 'Sydney'),
             ('Sumit', 31, 'Delhi'),
             ('Sampad', 16, 'New York'),
             ('Shikha', 32, 'Delhi')]

# Load the data into a DataFrame object
df = pd.DataFrame(employees, columns=['Name', 'Age', 'City'], index=['a', 'b', 'c', 'd'])

# items() yields a (column name, Series) tuple for each column in the dataframe.
# iteritems() did the same in older pandas but was removed in pandas 2.0.
for (columnName, columnData) in df.items():
    print('Column Name :', columnName)
    print('Column Contents :', columnData.values)

Output:

Column Name : Name
Column Contents : ['Abhishek' 'Sumit' 'Sampad' 'Shikha']
Column Name : Age
Column Contents : [34 31 16 32]
Column Name : City
Column Contents : ['Sydney' 'Delhi' 'New York' 'Delhi']

In the above example, the loop consumes an iterator over all the columns of the dataframe. For each column, the iterator yields a tuple containing the column name and the column contents as a Series.
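
Because each item is a (name, Series) pair, the loop body can call any Series method on the column, not only read its values. The following is a minimal sketch, assuming the same employee data, that counts the distinct values in each column with Series.nunique(). The dictionary name unique_counts is illustrative only.

```python
import pandas as pd

employees = [('Abhishek', 34, 'Sydney'),
             ('Sumit', 31, 'Delhi'),
             ('Sampad', 16, 'New York'),
             ('Shikha', 32, 'Delhi')]
df = pd.DataFrame(employees, columns=['Name', 'Age', 'City'], index=['a', 'b', 'c', 'd'])

# Map each column name to the number of distinct values in that column
unique_counts = {}
for columnName, columnData in df.items():
    # columnData is a pandas Series, so Series methods are available here
    unique_counts[columnName] = columnData.nunique()

print(unique_counts)
```

Here the City column has only three distinct values because 'Delhi' appears twice, while Name and Age have four each.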

Iterate over columns in dataframe using Column Names

import pandas as pd

employees = [('Abhishek', 34, 'Sydney'),
             ('Sumit', 31, 'Delhi'),
             ('Sampad', 16, 'New York'),
             ('Shikha', 32, 'Delhi')]

# Load the data into a DataFrame object
df = pd.DataFrame(employees, columns=['Name', 'Age', 'City'], index=['a', 'b', 'c', 'd'])

# Iterating over a dataframe directly yields its column names
for column in df:
    # Select column contents by column name using [] operator
    columnSeriesObj = df[column]
    print('Column Name :', column)
    print('Column Contents :', columnSeriesObj.values)

Output:

Column Name : Name
Column Contents : ['Abhishek' 'Sumit' 'Sampad' 'Shikha']
Column Name : Age
Column Contents : [34 31 16 32]
Column Name : City
Column Contents : ['Sydney' 'Delhi' 'New York' 'Delhi']

In the above example, iterating over the dataframe directly (for column in df) yields the column names, the same sequence that DataFrame.columns returns. We use each name with the [] operator to select the column contents.
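
The same pattern works for only certain columns: loop over a list of the names you want instead of over the whole dataframe. The sketch below assumes the same employee data. The list selected_columns is a hypothetical choice, and select_dtypes(include='number') is shown as one way to pick columns by type rather than by name.

```python
import pandas as pd

employees = [('Abhishek', 34, 'Sydney'),
             ('Sumit', 31, 'Delhi'),
             ('Sampad', 16, 'New York'),
             ('Shikha', 32, 'Delhi')]
df = pd.DataFrame(employees, columns=['Name', 'Age', 'City'], index=['a', 'b', 'c', 'd'])

# Hypothetical selection: visit only these two columns, by name
selected_columns = ['Name', 'City']
visited = []
for column in selected_columns:
    columnSeriesObj = df[column]
    visited.append(column)
    print('Column Name :', column)
    print('Column Contents :', columnSeriesObj.values)

# Alternative: pick the columns by dtype, here only the numeric ones
numeric_columns = list(df.select_dtypes(include='number').columns)
for column in numeric_columns:
    print('Numeric Column :', column, '| Max :', df[column].max())
```

With this data, the first loop visits Name and City, and the second loop visits only Age.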

Iterate Over columns in dataframe in reverse order

import pandas as pd

employees = [('Abhishek', 34, 'Sydney'),
             ('Sumit', 31, 'Delhi'),
             ('Sampad', 16, 'New York'),
             ('Shikha', 32, 'Delhi')]

# Load the data into a DataFrame object
df = pd.DataFrame(employees, columns=['Name', 'Age', 'City'], index=['a', 'b', 'c', 'd'])

# reversed() walks the column names from the last column to the first
for column in reversed(df.columns):
    # Select column contents by column name using [] operator
    columnSeriesObj = df[column]
    print('Column Name :', column)
    print('Column Contents :', columnSeriesObj.values)

Output:

Column Name : City
Column Contents : ['Sydney' 'Delhi' 'New York' 'Delhi']
Column Name : Age
Column Contents : [34 31 16 32]
Column Name : Name
Column Contents : ['Abhishek' 'Sumit' 'Sampad' 'Shikha']

We have used reversed(df.columns), which gives us the column names in reverse order, and then selected the contents of each column by name.
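
Slicing with a step of -1 is another way to get the reverse order, shown here as a small sketch on the same employee data. df.columns[::-1] reverses the column labels, and df.iloc[:, ::-1] returns a new dataframe whose columns are in reverse order. The variable names are illustrative.

```python
import pandas as pd

employees = [('Abhishek', 34, 'Sydney'),
             ('Sumit', 31, 'Delhi'),
             ('Sampad', 16, 'New York'),
             ('Shikha', 32, 'Delhi')]
df = pd.DataFrame(employees, columns=['Name', 'Age', 'City'], index=['a', 'b', 'c', 'd'])

# Reverse only the labels, then select each column by name
reversed_labels = list(df.columns[::-1])
for column in reversed_labels:
    print('Column Name :', column)
    print('Column Contents :', df[column].values)

# Or build a dataframe with the columns already reversed
reversed_df = df.iloc[:, ::-1]
print(list(reversed_df.columns))
```

Both approaches leave the original dataframe unchanged.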

Iterate Over columns in dataframe by index using iloc[]

import pandas as pd

employees = [('Abhishek', 34, 'Sydney'),
             ('Sumit', 31, 'Delhi'),
             ('Sampad', 16, 'New York'),
             ('Shikha', 32, 'Delhi')]

# Load the data into a DataFrame object
df = pd.DataFrame(employees, columns=['Name', 'Age', 'City'], index=['a', 'b', 'c', 'd'])

# df.shape[1] is the number of columns, so index runs over every column position
for index in range(df.shape[1]):
    print('Column Number :', index)
    # Select column by index position using iloc[]
    columnSeriesObj = df.iloc[:, index]
    print('Column Contents :', columnSeriesObj.values)

Output:

Column Number : 0
Column Contents : ['Abhishek' 'Sumit' 'Sampad' 'Shikha']
Column Number : 1
Column Contents : [34 31 16 32]
Column Number : 2
Column Contents : ['Sydney' 'Delhi' 'New York' 'Delhi']

In the above example, we have iterated over all columns of the dataframe by position, from index 0 to the index of the last column. We have selected the contents of each column using iloc[].
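
Iterating by position also makes it easy to visit only certain columns: change the range. The sketch below assumes the same employee data and starts at position 1 so that the first column is skipped. It also looks up the label for each position through df.columns. The list name seen is illustrative.

```python
import pandas as pd

employees = [('Abhishek', 34, 'Sydney'),
             ('Sumit', 31, 'Delhi'),
             ('Sampad', 16, 'New York'),
             ('Shikha', 32, 'Delhi')]
df = pd.DataFrame(employees, columns=['Name', 'Age', 'City'], index=['a', 'b', 'c', 'd'])

seen = []
# Start at position 1 to skip the first column
for index in range(1, df.shape[1]):
    # df.columns[index] is the label of the column at this position
    label = df.columns[index]
    columnSeriesObj = df.iloc[:, index]
    seen.append((index, label))
    print('Column Number :', index, '| Column Name :', label)
    print('Column Contents :', columnSeriesObj.values)
```

With this data, the loop visits position 1 (Age) and position 2 (City).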

Conclusion:

In this article, we looked at several ways to iterate over all or certain columns of a dataframe: with an iterator of (column name, Series) pairs, by column name, in reverse order, and by position using iloc[]. Thank you!