Python Pandas Series diff() Function

Pandas Series diff() Function:

The diff() function of a Pandas Series returns the difference between each element and another element of the same Series. By default, each element is compared with the element in the previous row.

Syntax:

Series.diff(periods=1)

Parameters

periods: Optional. The number of rows to shift when computing the difference. Negative values are allowed, in which case each element is compared with a later row instead of an earlier one. The default is 1.
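The effect of the sign of periods can be seen in a minimal sketch. The sample values and the variable names forward and backward are illustrative assumptions, not part of the original example.

```python
import pandas as pd

# Illustrative data (an assumption, not taken from the article's examples)
s = pd.Series([10, 15, 12, 20])

# periods=1 (the default): each element minus the element one row before it
forward = s.diff()
print(forward.tolist())   # [nan, 5.0, -3.0, 8.0]

# periods=-1: each element minus the element one row after it
backward = s.diff(periods=-1)
print(backward.tolist())  # [-5.0, 3.0, -8.0, nan]
```

With a positive periods value the missing entries appear at the start of the result; with a negative value they appear at the end.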

Return Value:

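The diff() function returns a Series of the same length as the original, holding the difference between each element and the element the given number of periods before it (the first discrete difference when periods is 1). Positions that have no element to compare against contain NaN.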
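One way to picture the return value is as the Series minus a shifted copy of itself. The sketch below checks that equivalence with Series.shift(); the variable names via_diff, via_shift, and same are illustrative assumptions.

```python
import pandas as pd

s = pd.Series([3.2, 4, 1.3, 5, 7.5])

# diff(2) compares each element with the element two rows earlier
via_diff = s.diff(2)

# The same result, written out as a subtraction of a shifted copy
via_shift = s - s.shift(2)

# Series.equals treats NaN values in the same positions as equal
same = via_diff.equals(via_shift)
print(same)

# The result keeps the original length; the first two entries are NaN
print(len(via_diff) == len(s))
print(int(via_diff.isna().sum()))
```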

Pandas Series diff() Function in Python

Example 1

Approach:

  • Import the pandas module using the import keyword.
  • Pass a list of numbers as an argument to the Series() function of the pandas module to create a series.
  • Store it in a variable.
  • Print the given series.
  • Apply the diff() function to the series to get the first discrete difference of its elements and print the result.
  • Apply the diff() function to the series with 2 as the argument to get the difference between each element and the element two rows before it, and print the result. This is a lag-2 difference, not the second-order difference.
  • Exit the program.

Below is the implementation:

# Import pandas module using the import keyword.
import pandas as pd
# Pass some random list as an argument to the Series() function
# of the pandas module to create a series.
# Store it in a variable.
gvn_series = pd.Series([3.2, 4, 1.3, 5, 7.5])
# Print the above given series
print("The given series is:")
print(gvn_series)
print()
# Apply diff() function on the given series to get the 
# first discrete difference of elements present in the given series
# and print the result.
print("The first discrete difference of elements present in the given series:")
print(gvn_series.diff())
print()
# Apply diff() function on the given series by passing 2 as an argument to it
# to get the difference between each element and the element two rows before it
# (a lag-2 difference, not the second-order difference) and print the result.
print("The difference with periods=2 for the given series:")
print(gvn_series.diff(2))

Output:

The given series is:
0    3.2
1    4.0
2    1.3
3    5.0
4    7.5
dtype: float64

The first discrete difference of elements present in the given series:
0    NaN
1    0.8
2   -2.7
3    3.7
4    2.5
dtype: float64

The difference with periods=2 for the given series:
0    NaN
1    NaN
2   -1.9
3    1.0
4    6.2
dtype: float64
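Note that diff(2) compares each element with the element two rows earlier. A second-order difference (the difference of the differences) is a different quantity and can be obtained by chaining diff() twice. The following sketch contrasts the two on the same series; the names lag_two and second_order are illustrative assumptions.

```python
import pandas as pd

gvn_series = pd.Series([3.2, 4, 1.3, 5, 7.5])

# Lag-2 difference: element i minus element i-2
lag_two = gvn_series.diff(2)

# Second-order difference: the first difference applied twice
second_order = gvn_series.diff().diff()

print("Lag-2 difference:")
print(lag_two)
print()
print("Second-order difference:")
print(second_order)
```

Both results start with two NaN values, but the remaining entries differ: for index 2 the lag-2 difference is 1.3 - 3.2, while the second-order difference is (1.3 - 4) - (4 - 3.2).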

Example 2

Here, the diff() function is applied to a single column of a DataFrame. Selecting one column returns a Series, so diff() works the same way as in the previous example.

Approach:

  • Import the pandas module using the import keyword.
  • Pass a dictionary of column names and values, along with an index list, as arguments to the DataFrame() function of the pandas module to create a dataframe.
  • Store it in a variable.
  • Print the given dataframe.
  • Get the first discrete difference of the elements in the student_marks column of the dataframe using the diff() function and print the result.
  • Exit the program.

Below is the implementation:

# Import pandas module using the import keyword.
import pandas as pd
# Pass some random key-value pair(dictionary), index list as arguments to the 
# DataFrame() function of the pandas module to create a dataframe
# Store it in a variable.
data_frme = pd.DataFrame({
  "student_rollno": [1, 2, 3, 4, 5, 6],
  "student_marks": [75, 35, 25, 90, 80, 85]},
  index=["virat", "nick", "jessy", "sindhu", "john", "mary"]
)
# Print the given dataframe
print("The given Dataframe:")
print(data_frme)
print()
# Get the first discrete difference of elements present in the student_marks 
# column of the dataframe using the diff() function and print the result
print("The first discrete difference of elements in student_marks column of the dataframe:")
print(data_frme['student_marks'].diff())

Output:

The given Dataframe:
        student_rollno  student_marks
virat                1             75
nick                 2             35
jessy                3             25
sindhu               4             90
john                 5             80
mary                 6             85

The first discrete difference of elements in student_marks column of the dataframe:
virat      NaN
nick     -40.0
jessy    -10.0
sindhu    65.0
john     -10.0
mary       5.0
Name: student_marks, dtype: float64
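The result above is float64 even though student_marks holds integers, because the first entry has nothing to be compared with and is set to NaN. As a minimal sketch, the difference can be stored as a new column and the missing value filled in afterwards. The column names marks_change and marks_change_filled are hypothetical, and filling with 0 is only one possible choice.

```python
import pandas as pd

data_frme = pd.DataFrame(
    {"student_marks": [75, 35, 25, 90, 80, 85]},
    index=["virat", "nick", "jessy", "sindhu", "john", "mary"],
)

# Store the row-to-row change as a new column (hypothetical column name)
data_frme["marks_change"] = data_frme["student_marks"].diff()

# Replace the leading NaN with 0 and convert back to integers;
# treating "no previous row" as "no change" is an assumption
data_frme["marks_change_filled"] = (
    data_frme["marks_change"].fillna(0).astype("int64")
)

print(data_frme)
```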