Pandas Series var() Function:
The var() function of a Pandas Series returns the unbiased variance of the values along the chosen axis. By default the result is normalized by N-1, where N is the number of elements.
Syntax:
Series.var(axis=None, skipna=None, level=None, ddof=1, numeric_only=None)
Parameters
axis: This is optional. It is the axis on which the function is applied. For a Series, the only valid value is 0 or 'index'.
skipna: This is optional. When computing the result, specify True to exclude NA/null values. The default value is True.
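level: This is optional. It indicates the level (int or str). If the axis is a MultiIndex (hierarchical), the variance is computed along a particular level, giving one value per group in that level. A str specifies the level by name, and an int specifies it by position. Note that this parameter was deprecated in pandas 1.3 and removed in pandas 2.0, so it works only with older versions.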
ddof: This is optional. It indicates the Delta Degrees of Freedom. The divisor used in the computation is N - ddof, where N is the number of elements. The default value is 1, which gives the unbiased (sample) variance.
numeric_only: This is optional. Pass True to include only float, int, or boolean data. The default in the signature shown above is None; newer pandas versions use False as the default.
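The effect of ddof and skipna is easier to see on a small series. The following is a minimal sketch; the values in it are illustrative and are not taken from the examples in this article.

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption for this sketch); one value is missing.
s = pd.Series([2.0, 4.0, 6.0, np.nan])

# Defaults: the NaN is skipped and ddof=1, so the divisor is N - 1
# with N = 3 non-missing values.
sample_var = s.var()

# ddof=0 divides by N instead, which gives the population variance.
population_var = s.var(ddof=0)

# With skipna=False, a single NaN makes the whole result NaN.
with_nan = s.var(skipna=False)

print(sample_var)
print(population_var)
print(with_nan)
```

Here the squared deviations from the mean (4.0) sum to 8.0, so the default call divides by 2 and the ddof=0 call divides by 3.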
Return Value:
The var() function returns the unbiased variance of the values along the given axis as a scalar. If a level is given, it returns a Series with one variance per group in that level.
Pandas Series var() Function in Python:
Example 1
Approach:
- Import pandas module using the import keyword.
- Import numpy module using the import keyword.
- Pass the category (level) values as a list of arrays to the MultiIndex.from_arrays() function to create the index.
- Pass a list of values, the index created above, and the name Numbers as arguments to the Series() function of the pandas module to create a series.
- Store it in a variable.
- Print the above-given series
- Printing the variance of all elements in the given series using the var() function
- Printing the variance of each level of the series using level='DataType'
- Printing the variance of each level of the series using level=0.
- The Exit of the Program.
Below is the implementation:
# Import pandas module using the import keyword.
import pandas as pd
# Import numpy module using the import keyword.
import numpy as np
# Pass the category (level) values as a list of arrays to from_arrays()
gvn_indx = pd.MultiIndex.from_arrays(
    [['positive', 'negative', 'positive',
      'positive', 'negative', 'negative']],
    names=['DataType'])
# Pass some random list, index values from the above and name as Numbers
# as the arguments to the Series() function of the pandas module to create a series.
# Store it in a variable.
gvn_series = pd.Series([12, 3, 4, 5, 1, 2],
                       name='Numbers', index=gvn_indx)
# Print the above given series
print("The given series is:")
print(gvn_series)
print()
# Printing the variance of all elements in the given series
# using the var() function
print("The variance of all elements in the given series:")
print(gvn_series.var())
print()
# Printing the variance of each level of the series using level='DataType'
# (the level argument is accepted only by pandas versions older than 2.0)
print("The variance of all level values using level='DataType':")
print(gvn_series.var(level='DataType'))
print()
# Printing the variance of each level of the series using level=0
print("The variance of all level values using level=0:")
print(gvn_series.var(level=0))
Output:
The given series is:
DataType
positive    12
negative     3
positive     4
positive     5
negative     1
negative     2
Name: Numbers, dtype: int64
The variance of all elements in the given series:
15.5
The variance of all level values using level='DataType':
DataType
positive    19.0
negative     1.0
Name: Numbers, dtype: float64
The variance of all level values using level=0:
DataType
positive    19.0
negative     1.0
Name: Numbers, dtype: float64
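On pandas 2.0 and later, var() no longer accepts the level argument, so the two level-based calls above raise a TypeError. A possible alternative, sketched here with the same data, is to group by the index level first and then call var() on each group. The name level_var is introduced only for this sketch.

```python
import pandas as pd

# Same index and series as in Example 1.
gvn_indx = pd.MultiIndex.from_arrays(
    [['positive', 'negative', 'positive',
      'positive', 'negative', 'negative']],
    names=['DataType'])
gvn_series = pd.Series([12, 3, 4, 5, 1, 2],
                       name='Numbers', index=gvn_indx)

# Group the values by the 'DataType' index level, then compute the
# unbiased variance within each group. sort=False keeps the groups in
# order of first appearance instead of sorting the labels.
level_var = gvn_series.groupby(level='DataType', sort=False).var()
print(level_var)
```

Passing level=0 to groupby() selects the same level by position. Without sort=False, groupby() sorts the group labels, so negative would be listed before positive.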
Example 2
Here, the var() function is used on a specific series/column in a DataFrame.
Approach:
- Import pandas module using the import keyword.
- Pass a dictionary of column names and values, along with an index list, as arguments to the DataFrame() function of the pandas module to create a dataframe.
- Store it in a variable.
- Print the given dataframe.
- Apply var() function on the student_marks column of the dataframe to get the unbiased variance of all the values of the student_marks column and print the result.
- The Exit of the Program.
Below is the implementation:
# Import pandas module using the import keyword.
import pandas as pd
# Pass a dictionary of column names and values, along with an index list,
# as arguments to the DataFrame() function of the pandas module to create a dataframe
# Store it in a variable.
data_frme = pd.DataFrame({
    "student_rollno": [1, 2, 3, 4],
    "student_marks": [80, 35, 25, 90]},
    index=["virat", "nick", "jessy", "sindhu"]
)
# Print the given dataframe
print("The given Dataframe:")
print(data_frme)
print()
# Apply var() function on the student_marks column of the dataframe to
# get the unbiased variance of all the values of the student_marks
# column and print the result.
print("The unbiased variance of student_marks column of the dataframe:")
print(data_frme["student_marks"].var())
Output:
The given Dataframe:
        student_rollno  student_marks
virat                1             80
nick                 2             35
jessy                3             25
sindhu               4             90
The unbiased variance of student_marks column of the dataframe:
1041.6666666666667
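The result above can be checked by hand with the N - ddof formula. The sketch below recomputes the variance of the student_marks values step by step; the variable names are chosen only for this illustration.

```python
import pandas as pd

# The student_marks values from Example 2.
marks = pd.Series([80, 35, 25, 90], name="student_marks")

n = len(marks)                              # N = 4
mean = marks.sum() / n                      # 230 / 4 = 57.5
squared_dev = ((marks - mean) ** 2).sum()   # sum of squared deviations = 3125.0

# Unbiased variance: divide by N - ddof with ddof = 1.
manual_var = squared_dev / (n - 1)

print(manual_var)
print(marks.var())
```

Both printed values should agree up to floating-point rounding, since 3125 / 3 is approximately 1041.67.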