How to apply a function to each row or column in Dataframe in Python.
Python pandas apply function to column: To apply a function to each row or column of a dataframe, be it a lambda, a user-defined function or a numpy function, we use the apply() function from Python's Pandas library. The function belongs to the DataFrame class.
Syntax-
DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwargs)
(Older pandas versions also accepted broadcast and reduce; both parameters were removed in pandas 1.0.)
Arguments :
- Func : The function to apply to each row or column. It receives a series and can return a series or a single value.
- Axis : The axis along which the function is applied. The default value is 0, which passes each column to the function; 1 passes each row.
- Args : A tuple of extra positional arguments that are passed to the function after the series.
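To see what the axis argument changes, the sketch below applies the built-in len to a small dataframe. The dataframe df and its values are made up for illustration; the point is only that axis=0 hands each column to the function, while axis=1 hands each row.

```python
import pandas as pd

# Hypothetical 2x3 dataframe, used only for this illustration
df = pd.DataFrame([(1, 2, 3), (4, 5, 6)], columns=list("abc"))

# axis=0 (default): the function receives each column, which has 2 values
col_lengths = df.apply(len)

# axis=1: the function receives each row, which has 3 values
row_lengths = df.apply(len, axis=1)

print(col_lengths)
print(row_lengths)
```

The first result is indexed by column label (a, b, c) and the second by row label (0, 1), which mirrors what was passed to the function in each case.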
Apply a lambda function to each row or each column in Dataframe :
Let us consider a lambda function
lambda x : x + 10
Apply a lambda function to each column :
To apply the function to each column, we just pass the lambda function as the argument to Dataframe.apply().
#Program :
import pandas as pd
import numpy as np

#list of tuples
matrix = [(222, 34, 23),
          (333, 31, 11),
          (444, 16, 21),
          (555, 32, 22),
          (666, 33, 27),
          (777, 35, 11)]

#creating an object from Dataframe class
dfObj = pd.DataFrame(matrix, columns=list('abc'))

#Passing only the lambda function into the Dataframe function so that it gets applied to columns only
modMatrix = dfObj.apply(lambda x : x + 10)
print("After applying the lambda function to each column in dataframe")
print(modMatrix)
Output :
After applying the lambda function to each column in dataframe
     a   b   c
0  232  44  33
1  343  41  21
2  454  26  31
3  565  42  32
4  676  43  37
5  787  45  21
Apply a lambda function to each row :
To apply the function to each row, we pass axis=1 along with the lambda function to Dataframe.apply(), just as we did for columns.
#Program :
import pandas as pd
import numpy as np

#list of tuples
matrix = [(222, 34, 23),
          (333, 31, 11),
          (444, 16, 21),
          (555, 32, 22),
          (666, 33, 27),
          (777, 35, 11)]

#creating an object from Dataframe class
dfObj = pd.DataFrame(matrix, columns=list('abc'))

#Passing the lambda function with axis=1 so that it gets applied to each row
modMatrix = dfObj.apply(lambda x : x + 10, axis=1)
print("After applying the lambda function to each row in dataframe")
print(modMatrix)
Output :
After applying the lambda function to each row in dataframe
     a   b   c
0  232  44  33
1  343  41  21
2  454  26  31
3  565  42  32
4  676  43  37
5  787  45  21
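For an element-wise function such as x + 10, the result is the same along either axis. The axis starts to matter when the function combines several values of one row. The following is a minimal sketch under that assumption; the two-row dataframe df and the name row_totals are illustrative only.

```python
import pandas as pd

# First two rows of the example matrix, kept short for illustration
df = pd.DataFrame([(222, 34, 23), (333, 31, 11)], columns=list("abc"))

# With axis=1 each call receives one row as a series indexed by column label,
# so the lambda can combine values from different columns of that row.
row_totals = df.apply(lambda row: row["a"] + row["b"], axis=1)
print(row_totals)
```

Without axis=1 the same lambda would receive whole columns, and looking up row["a"] would fail because a column is indexed by row labels, not by column names.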
Apply a User Defined function with or without arguments to each row or column of a Dataframe :
For this let us consider a user-defined function that multiplies the values by 2
def doubleData(x):
    return x * 2
Apply a user-defined function to each column :
Just as with the lambda function, we only pass the function itself to apply() here.
#Program :
import pandas as pd
import numpy as np

# Multiplies the given value by 2 and returns it
def doubleData(x):
    return x * 2

#list of tuples
matrix = [(222, 34, 23),
          (333, 31, 11),
          (444, 16, 21),
          (555, 32, 22),
          (666, 33, 27),
          (777, 35, 11)]

#creating an object from Dataframe class
dfObj = pd.DataFrame(matrix, columns=list('abc'))

#Applying the user defined function doubleData to columns only
modMatrix = dfObj.apply(doubleData)
print("After applying the user-defined function to each column in dataframe")
print(modMatrix)
Output :
After applying the user-defined function to each column in dataframe
      a   b   c
0   444  68  46
1   666  62  22
2   888  32  42
3  1110  64  44
4  1332  66  54
5  1554  70  22
Apply a user-defined function to each row :
We just have to add axis=1 to the apply() call above to apply the function to rows.
#Program :
import pandas as pd
import numpy as np

# Multiplies the given value by 2 and returns it
def doubleData(x):
    return x * 2

#list of tuples
matrix = [(222, 34, 23),
          (333, 31, 11),
          (444, 16, 21),
          (555, 32, 22),
          (666, 33, 27),
          (777, 35, 11)]

#creating an object from Dataframe class
dfObj = pd.DataFrame(matrix, columns=list('abc'))

#Applying the user defined function doubleData to rows only
modMatrix = dfObj.apply(doubleData, axis=1)
print("After applying the user-defined function to each row in dataframe")
print(modMatrix)
Output :
After applying the user-defined function to each row in dataframe
      a   b   c
0   444  68  46
1   666  62  22
2   888  32  42
3  1110  64  44
4  1332  66  54
5  1554  70  22
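The column-wise and row-wise outputs are identical because doubling works on each element independently. The difference shows up once the function depends on the whole series it receives. A minimal sketch, using a hypothetical helper subtract_min that is not part of the original example:

```python
import pandas as pd

# Hypothetical helper: subtracts the smallest value of the series it receives
def subtract_min(x):
    return x - x.min()

# First two rows of the example matrix, kept short for illustration
df = pd.DataFrame([(222, 34, 23), (333, 31, 11)], columns=list("abc"))

by_column = df.apply(subtract_min)        # minimum is taken per column
by_row = df.apply(subtract_min, axis=1)   # minimum is taken per row

print(by_column)
print(by_row)
```

Here by_column measures each value against the smallest value in its column, while by_row measures it against the smallest value in its row, so the two results differ.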
Apply a user-defined function to each row or column with arguments :
Let us take a user-defined function that accepts a series and a number, then returns the series multiplied by the number.
#Program :
import pandas as pd
import numpy as np

#Multiplies the whole series by the number and returns the series
def multiplyData(x, y):
    return x * y

#list of tuples
matrix = [(222, 34, 23),
          (333, 31, 11),
          (444, 16, 21),
          (555, 32, 22),
          (666, 33, 27),
          (777, 35, 11)]

#creating an object from Dataframe class
dfObj = pd.DataFrame(matrix, columns=list('abc'))

#Applying the user defined function with an argument (args is a tuple)
modMatrix = dfObj.apply(multiplyData, args=(4,))
print("After applying the user-defined function with argument in dataframe")
print(modMatrix)
Output :
After applying the user-defined function with argument in dataframe
      a    b    c
0   888  136   92
1  1332  124   44
2  1776   64   84
3  2220  128   88
4  2664  132  108
5  3108  140   44
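Extra arguments can also be given by name, since apply() forwards additional keyword arguments to the function. The sketch below compares both styles on a shortened, illustrative dataframe; the names positional and keyword are made up for this example.

```python
import pandas as pd

def multiplyData(x, y):
    return x * y

# First two rows of the example matrix, kept short for illustration
df = pd.DataFrame([(222, 34, 23), (333, 31, 11)], columns=list("abc"))

positional = df.apply(multiplyData, args=(4,))  # y passed positionally
keyword = df.apply(multiplyData, y=4)           # y passed by name

print(positional.equals(keyword))
```

Passing the argument by name can be easier to read when the function takes several parameters.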
Apply a numpy function to each row or column of a Dataframe :
For this let's use the numpy function numpy.square(). (For columns, pass the function directly; for rows, also pass axis=1.)
#Program :
import pandas as pd
import numpy as np

#list of tuples
matrix = [(222, 34, 23),
          (333, 31, 11),
          (444, 16, 21),
          (555, 32, 22),
          (666, 33, 27),
          (777, 35, 11)]

#creating an object from Dataframe class
dfObj = pd.DataFrame(matrix, columns=list('abc'))

#Applying the numpy function .square()
modMatrix = dfObj.apply(np.square)
print("After applying the numpy function in dataframe")
print(modMatrix)
Output :
After applying the numpy function in dataframe
        a     b    c
0   49284  1156  529
1  110889   961  121
2  197136   256  441
3  308025  1024  484
4  443556  1089  729
5  603729  1225  121
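For element-wise numpy functions such as numpy.square(), calling the function on the dataframe itself gives the same result as going through apply(). A small sketch on a shortened, illustrative dataframe (the names via_apply and direct are made up here):

```python
import numpy as np
import pandas as pd

# First two rows of the example matrix, kept short for illustration
df = pd.DataFrame([(222, 34, 23), (333, 31, 11)], columns=list("abc"))

via_apply = df.apply(np.square)  # squares every element through apply()
direct = np.square(df)           # calls the numpy function on the dataframe itself

print(via_apply.equals(direct))
```

The direct call is shorter, while apply() keeps the code uniform when the function is chosen at run time or needs axis=1.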
Apply a reducing function to each row or column of a Dataframe :
So far we passed a series into the function and it returned a series. However, a function can also take a series and return a single value. Let's use numpy.sum() for that.
#Program :
import pandas as pd
import numpy as np

#list of tuples
matrix = [(222, 34, 23),
          (333, 31, 11),
          (444, 16, 21),
          (555, 32, 22),
          (666, 33, 27),
          (777, 35, 11)]

#creating an object from Dataframe class
dfObj = pd.DataFrame(matrix, columns=list('abc'))

#Applying the numpy function .sum()
modMatrix = dfObj.apply(np.sum)
print("After applying the numpy function in dataframe")
print(modMatrix)
Output :
After applying the numpy function in dataframe
a    2997
b     181
c     115
dtype: int64
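The same reduction can be run per row by adding axis=1, which yields one sum for each row instead of one for each column. A minimal sketch on the example matrix; the name row_sums is illustrative.

```python
import numpy as np
import pandas as pd

matrix = [(222, 34, 23),
          (333, 31, 11),
          (444, 16, 21),
          (555, 32, 22),
          (666, 33, 27),
          (777, 35, 11)]
df = pd.DataFrame(matrix, columns=list("abc"))

# axis=1 passes each row to np.sum, so the result has one value per row
row_sums = df.apply(np.sum, axis=1)
print(row_sums)
```

For a plain sum, the built-in df.sum(axis=1) returns the same values; apply() is mainly useful when the reduction is a custom function.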