Pandas skew – Python Pandas Series skew() Function

Pandas Series skew() Function:

The skew() function of a Pandas Series returns the unbiased (bias-corrected) sample skewness of its values. A Series has only one axis, so the result is a single number unless a level is specified.

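Skewness measures the asymmetry of a statistical distribution. A positive value means the distribution has a longer tail on the right, a negative value means a longer tail on the left, and a value close to zero indicates a roughly symmetric distribution.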
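The unbiased skew can also be reproduced by hand. The following is a minimal sketch in plain Python, assuming the bias-corrected (adjusted Fisher-Pearson) sample skewness; it reproduces the values printed in the examples of this article. The function name sample_skew and its handling of edge cases are illustrative choices, not part of pandas.

```python
import math


def sample_skew(values):
    """Bias-corrected sample skewness (hypothetical helper, not a pandas API)."""
    n = len(values)
    if n < 3:
        # The correction factor divides by (n - 2), so at least 3 values are needed.
        return float("nan")
    mean = sum(values) / n
    # Second and third central moments (population form, divided by n).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    if m2 == 0:
        # All values are identical: report zero skew.
        return 0.0
    # Population skewness, then the small-sample correction factor.
    g1 = m3 / m2 ** 1.5
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)


print(sample_skew([2, 3, 4, 5, 1, 2]))   # about 0.4181
print(sample_skew([75, 79, 87, 89]))     # about -0.2287
```

The correction factor sqrt(n * (n - 1)) / (n - 2) is what distinguishes the unbiased estimate from the plain population skewness, and it matters most for small samples like these.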

Syntax:

Series.skew(axis=None, skipna=None, level=None, numeric_only=None)

Parameters

axis: This is optional. It is the axis along which the function is applied. A Series has only one axis, so the only valid values are 0 or 'index'.

skipna: This is optional. When computing the result, specify True to exclude NA/null values. The default value is True.

level: This is optional. It takes an int (the level position) or a str (the level name). If the index is a MultiIndex (hierarchical), the skew is computed separately for each group in the given level and the result is a Series. This parameter was deprecated in pandas 1.3 and removed in pandas 2.0.

numeric_only: This is optional. Pass True to include only float, int, or boolean data. The signature above lists None as the default; pandas 2.0 and later use False.
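The effect of skipna is easiest to see on a series that contains a missing value. This is a small sketch with made-up data (the variable name marks and its values are for illustration only):

```python
import pandas as pd

# Illustrative data with one missing value (None becomes NaN).
marks = pd.Series([75, 79, None, 87, 89])

# Default behaviour (skipna=True): the missing value is ignored.
print(marks.skew())

# skipna=False: the missing value propagates, so the result is NaN.
print(marks.skew(skipna=False))
```

With the default setting the result is the same as for the four non-missing values alone.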

Return Value:

The skew() function of the Pandas Series returns the unbiased skew of the values as a scalar. If a level is given, it returns a Series with one value per group in that level.

Pandas Series skew() Function in Python

Example 1

Approach:

  • Import pandas module using the import keyword.
  • Import numpy module using the import keyword.
  • Pass the category (level) values as a list of arrays to the MultiIndex.from_arrays() function to build the index.
  • Pass a list of values, the index created above, and the name 'Numbers' as arguments to the Series() function of the pandas module to create a series.
  • Store it in a variable.
  • Print the series.
  • Print the skew of all elements in the series using the skew() function.
  • Print the skew of each level value of the series using level='DataType'.
  • Print the skew of each level value of the series using level=0.
  • End of the program.

Below is the implementation:

# Import pandas module using the import keyword.
import pandas as pd
# Import numpy module using the import keyword.
import numpy as np
# Build a MultiIndex with a single level named 'DataType' from the category values
gvn_indx = pd.MultiIndex.from_arrays([
    ['positive', 'negative', 'positive',
     'positive', 'negative', 'negative']],
    names=['DataType'])
# Create a series from a list of values, using the index above and the
# name 'Numbers', and store it in a variable.
gvn_series = pd.Series([2, 3, 4, 5, 1, 2],
                       name='Numbers', index=gvn_indx)
# Print the series
print("The given series is:")
print(gvn_series)
print()

# Print the skew of all elements in the series
# using the skew() function
print("Skew of all elements in the given series:")
print(gvn_series.skew())
print()
# Print the skew within each group of the 'DataType' level, selected by name
# (the level argument was removed in pandas 2.0, so this needs an older version)
print("Skew of all level values using level='DataType':")
print(gvn_series.skew(level='DataType'))
print()
# Print the skew within each group of the same level, selected by position
print("Skew of all level values using level=0:")
print(gvn_series.skew(level=0))

Output:

The given series is:
DataType
positive    2
negative    3
positive    4
positive    5
negative    1
negative    2
Name: Numbers, dtype: int64

Skew of all elements in the given series:
0.4180715202995426

Skew of all level values using level='DataType':
DataType
positive   -0.93522
negative    0.00000
Name: Numbers, dtype: float64

Skew of all level values using level=0:
DataType
positive   -0.93522
negative    0.00000
Name: Numbers, dtype: float64
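Note: the level argument was deprecated in pandas 1.3 and removed in pandas 2.0, so on current versions the two level-based calls in this example raise an error. A minimal sketch of an equivalent, assuming the same series as above, groups by the index level with groupby() instead. The variable name level_skew is chosen here for illustration.

```python
import pandas as pd

gvn_indx = pd.MultiIndex.from_arrays([
    ['positive', 'negative', 'positive',
     'positive', 'negative', 'negative']],
    names=['DataType'])
gvn_series = pd.Series([2, 3, 4, 5, 1, 2],
                       name='Numbers', index=gvn_indx)

# Group the values by the 'DataType' index level and compute the skew of
# each group. sort=False keeps the groups in order of first appearance.
level_skew = gvn_series.groupby(level='DataType', sort=False).skew()
print(level_skew)
```

Passing level=0 to groupby() selects the same level by position. Without sort=False the groups are returned in sorted order, so negative would be listed before positive.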

Example 2

Approach:

  • Import pandas module using the import keyword.
  • Pass a dictionary of column data and an index list as arguments to the DataFrame() function of the pandas module to create a dataframe.
  • Store it in a variable.
  • Print the dataframe.
  • Apply the skew() function to the student_marks column of the dataframe to get the unbiased skew of its values and print the result.
  • End of the program.

Below is the implementation:

# Import pandas module using the import keyword.
import pandas as pd
# Pass a dictionary of column data and an index list as arguments to the
# DataFrame() function of the pandas module to create a dataframe.
# Store it in a variable.
data_frme = pd.DataFrame({
    "student_rollno": [1, 2, 3, 4],
    "student_marks": [75, 79, 87, 89]},
    index=["virat", "nick", "jessy", "sindhu"]
)
# Print the given dataframe
print("The given Dataframe:")
print(data_frme)
print()
# Apply the skew() function to the student_marks column of the dataframe
# to get the unbiased skew of its values and print the result.
print("The unbiased skew of student_marks column of the dataframe:")
print(data_frme["student_marks"].skew())

Output:

The given Dataframe:
        student_rollno  student_marks
virat                1             75
nick                 2             79
jessy                3             87
sindhu               4             89

The unbiased skew of student_marks column of the dataframe:
-0.228727758587297