Python Pandas Series var() Function

Pandas Series var() Function:

The var() function of a Pandas Series returns the unbiased (sample) variance of its values. By default the result is normalized by N - 1; this can be changed with the ddof argument.

Syntax:

Series.var(axis=None, skipna=None, level=None, ddof=1, numeric_only=None)

Parameters

axis: This is optional. It accepts 0 or 'index'. A Series has only one axis, so this parameter exists mainly for compatibility with DataFrame.

skipna: This is optional. Specify True to exclude NA/null values when computing the result. NA/null values are excluded by default; with skipna=False, any NA/null value makes the result NaN.

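level: This is optional. It takes a level position (int) or a level name (str). If the index is a MultiIndex (hierarchical), the variance is computed separately for each value of that level, and the result is a Series instead of a scalar. Note that this parameter was deprecated in pandas 1.3 and removed in pandas 2.0; in current versions, use groupby(level=...).var() instead.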

ddof: This is optional. It is the Delta Degrees of Freedom. The divisor used in the calculation is N - ddof, where N is the number of elements. The default, ddof=1, gives the unbiased sample variance; ddof=0 gives the population variance.

numeric_only: This is optional. Pass True to include only float, int, or boolean data. It is False by default.
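The effect of skipna and ddof can be seen in a minimal sketch. The values below are illustrative only and do not come from the examples in this article.

```python
import numpy as np
import pandas as pd

# Illustrative data with one missing value (an assumption for this sketch)
values = pd.Series([1.0, 2.0, np.nan, 4.0])

# skipna=True (the default): the NaN is ignored, so N = 3
var_skip = values.var()

# skipna=False: a single NaN makes the whole result NaN
var_keep = values.var(skipna=False)

# ddof=0 divides by N instead of N - 1 (population variance)
var_population = values.var(ddof=0)

print(var_skip)        # sample variance of 1, 2, 4
print(var_keep)        # nan
print(var_population)  # population variance of 1, 2, 4
```

The sample variance is always the larger of the two finite results, because it divides the same sum of squared deviations by a smaller number.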

Return Value:

The var() function of the Pandas Series returns the unbiased variance of the values as a scalar. If a level is given, it returns a Series containing one variance for each group of that level.
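The difference between the scalar result and the per-level result can be sketched as follows. The two-level index and the names group, item, and scores are hypothetical, and groupby(level=...) is used here because it works across pandas versions, including those where var() no longer accepts level.

```python
import pandas as pd

# Hypothetical two-level index, used only for illustration
idx = pd.MultiIndex.from_arrays(
    [["a", "a", "b", "b"], ["x", "y", "x", "y"]],
    names=["group", "item"],
)
scores = pd.Series([1.0, 3.0, 10.0, 20.0], index=idx, name="scores")

# Without grouping, var() collapses the whole series into one number
overall = scores.var()

# Grouping by one index level gives a Series with one variance per group
per_group = scores.groupby(level="group").var()

print(overall)
print(per_group)
```

Here per_group is indexed by the values of the group level ("a" and "b"), while overall is a single floating-point number.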

Pandas Series var() Function in Python:

Example 1

Approach:

  • Import the pandas module using the import keyword.
  • Import the numpy module using the import keyword.
  • Pass the list of category (level) values to the MultiIndex.from_arrays() function to create the index.
  • Pass a list of values, the index created above, and the name 'Numbers' as arguments to the Series() function of the pandas module to create a series.
  • Store it in a variable.
  • Print the series.
  • Print the variance of all elements in the series using the var() function.
  • Print the variance of each group of the index level, selecting the level by name (level='DataType').
  • Print the variance of each group of the index level, selecting the level by position (level=0).
  • Exit the program.

Below is the implementation:

# Import pandas module using the import keyword.
import pandas as pd
# Import numpy module using the import keyword.
import numpy as np
# Pass the list of category (level) values to the from_arrays() function
gvn_indx = pd.MultiIndex.from_arrays([
    ['positive', 'negative', 'positive', 
     'positive', 'negative', 'negative']],
    names=['DataType'])
# Pass some random list, index values from the above and name as Numbers
# as the arguments to the Series() function of the pandas module to create a series.
# Store it in a variable.
gvn_series = pd.Series([12, 3, 4, 5, 1, 2], 
              name='Numbers', index=gvn_indx)
# Print the above given series
print("The given series is:")
print(gvn_series)
print()

# Printing the variance of all elements in the given series 
# using the var() function
print("The variance of all elements in the given series:")
print(gvn_series.var())
print()
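# Print the variance of each group in the 'DataType' level.
# The level argument of var() was removed in pandas 2.0, so group by the
# index level instead; sort=False keeps the groups in order of appearance.
print("The variance of all level values using level='DataType':")
print(gvn_series.groupby(level='DataType', sort=False).var())
print()
# Print the variance of each group, selecting the level by position (level=0)
print("The variance of all level values using level=0:")
print(gvn_series.groupby(level=0, sort=False).var())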

Output:

The given series is:
DataType
positive    12
negative     3
positive     4
positive     5
negative     1
negative     2
Name: Numbers, dtype: int64

The variance of all elements in the given series:
15.5

The variance of all level values using level='DataType':
DataType
positive    19.0
negative     1.0
Name: Numbers, dtype: float64

The variance of all level values using level=0:
DataType
positive    19.0
negative     1.0
Name: Numbers, dtype: float64
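The overall variance of 15.5 shown above can be checked by hand. The sketch below applies the formula sum((x - mean)^2) / (N - ddof) with ddof=1 to the same six values; the variable names are chosen for illustration.

```python
import pandas as pd

numbers = pd.Series([12, 3, 4, 5, 1, 2])

n = len(numbers)                              # N = 6
mean = numbers.sum() / n                      # 27 / 6 = 4.5
squared_dev = ((numbers - mean) ** 2).sum()   # sum of squared deviations = 77.5
manual_var = squared_dev / (n - 1)            # divisor is N - ddof with ddof=1

print(manual_var)      # 15.5
print(numbers.var())   # 15.5
```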

Example 2

Here, the var() function is used on a specific series/column in a DataFrame.

Approach:

  • Import the pandas module using the import keyword.
  • Pass a dictionary of column names and values, along with an index list, as arguments to the DataFrame() function of the pandas module to create a dataframe.
  • Store it in a variable.
  • Print the dataframe.
  • Apply the var() function to the student_marks column of the dataframe to get the unbiased variance of its values, and print the result.
  • Exit the program.

Below is the implementation:

# Import pandas module using the import keyword.
import pandas as pd
# Pass some random key-value pair(dictionary), index list as arguments to the 
# DataFrame() function of the pandas module to create a dataframe
# Store it in a variable.
data_frme = pd.DataFrame({
    "student_rollno": [1, 2, 3, 4],
    "student_marks": [80, 35, 25, 90]},
    index=["virat", "nick", "jessy", "sindhu"]
)
# Print the given dataframe
print("The given Dataframe:")
print(data_frme)
print()
# Apply var() function on the student_marks column of the dataframe to
# get the unbiased variance of all the values of the student_marks
# column and print the result.
print("The unbiased variance of student_marks column of the dataframe:")
print(data_frme["student_marks"].var())

Output:

The given Dataframe:
        student_rollno  student_marks
virat                1             80
nick                 2             35
jessy                3             25
sindhu               4             90

The unbiased variance of student_marks column of the dataframe:
1041.6666666666667
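When cross-checking this result with NumPy, keep in mind that the defaults differ: pandas var() uses ddof=1, while numpy.var() uses ddof=0. The sketch below, which reuses the student_marks values, shows one way to compare them.

```python
import numpy as np
import pandas as pd

marks = pd.Series([80, 35, 25, 90], name="student_marks")

# pandas default: ddof=1 (sample variance)
pandas_default = marks.var()

# NumPy default: ddof=0 (population variance)
numpy_default = np.var(marks.to_numpy())

# Passing ddof=1 to NumPy matches the pandas default
numpy_sample = np.var(marks.to_numpy(), ddof=1)

print(pandas_default)
print(numpy_default)   # 781.25
print(numpy_sample)
```

The two libraries agree once the same ddof is passed explicitly, so stating ddof in both calls avoids accidental mismatches.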