Methods to count NaN and missing values in DataFrames using pandas
In this article, we discuss missing (null) values in pandas DataFrames and count them per column, per row, and in total. Let's start with what NaN or missing values are.
NaN or Missing values
NaN stands for Not a Number. It is used to represent missing values in a DataFrame. Let's see this with an example.
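import numpy as np
import pandas as pd

# Marks contains two missing values, written as np.nan
students = {'students': ['Raj', 'Rahul', 'Mayank', 'Ajay', 'Amar'],
            'Marks': [90, np.nan, 87, np.nan, 19]}
df = pd.DataFrame(students, columns=['students', 'Marks'])
print(df)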
Output
  students  Marks
0      Raj   90.0
1    Rahul    NaN
2   Mayank   87.0
3     Ajay    NaN
4     Amar   19.0
Here we see NaN entries in the Marks column, which represent the missing values.
Reason to count Missing values or NaN values in Dataframe
One of the main reasons to count missing values is that they affect the accuracy of any prediction or analysis built on the data. The more values are missing, the more the results are affected, so we count them first. If the count is high, we can drop the affected rows or columns; otherwise we can leave them as they are in the DataFrame.
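The drop-or-keep decision above can be sketched with the share of missing values per column. This is only an illustration: the cutoff MAX_MISSING_SHARE is a hypothetical value chosen for the example, not a recommendation from pandas or from this article.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "students": ["Raj", "Rahul", "Mayank", "Ajay", "Amar"],
    "Marks": [90, np.nan, 87, np.nan, 19],
})

# The mean of a Boolean mask is the fraction of True values,
# i.e. the share of missing values in each column.
missing_share = df.isnull().mean()
print(missing_share)

# Hypothetical cutoff, used here only for illustration.
MAX_MISSING_SHARE = 0.5

# Keep only the columns whose missing share does not exceed the cutoff.
columns_to_keep = list(missing_share[missing_share <= MAX_MISSING_SHARE].index)
print(columns_to_keep)
```

With this sample data, 2 of the 5 Marks values are missing, so both columns stay under the assumed cutoff.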
Method to count NaN or missing values
To count missing values, we first use the isnull() function. It returns a new DataFrame of the same shape in which every NaN is replaced by True and every non-NaN value by False, which makes the missing values easy to count. Let's see this with an example.
import numpy as np
import pandas as pd

students = {'students': ['Raj', 'Rahul', 'Mayank', 'Ajay', 'Amar'],
            'Marks': [90, np.nan, 87, np.nan, 19]}
df = pd.DataFrame(students, columns=['students', 'Marks'])
# True marks a missing value, False a present one
print(df.isnull())
Output
   students  Marks
0     False  False
1     False   True
2     False  False
3     False   True
4     False  False
We get a Boolean DataFrame like this. Now we can easily count the NaN or missing values in the DataFrame.
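As a small aside, pandas also provides isna(), which is an alias of isnull(), and notnull(), which returns the inverse mask. The sketch below assumes the same sample data as above and simply checks that the masks relate to each other as expected.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "students": ["Raj", "Rahul", "Mayank", "Ajay", "Amar"],
    "Marks": [90, np.nan, 87, np.nan, 19],
})

mask = df.isnull()        # True where a value is missing
alias_mask = df.isna()    # isna() is an alias of isnull()
present = df.notnull()    # True where a value is present

print(mask.equals(alias_mask))   # True
print(mask["Marks"].tolist())    # [False, True, False, True, False]
print(present["Marks"].tolist()) # [True, False, True, False, True]
```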
Count NaN or missing values in columns
With the help of isnull().sum(), we can easily count the NaN or missing values in each column. isnull() converts NaN values to True and non-NaN values to False, and sum() then counts the True values in each column, because True is treated as 1 and False as 0. Let's see this with an example.
import numpy as np
import pandas as pd

students = {'students': ['Raj', 'Rahul', 'Mayank', 'Ajay', 'Amar'],
            'Marks': [90, np.nan, 87, np.nan, 19]}
df = pd.DataFrame(students, columns=['students', 'Marks'])
# Number of missing values in each column
print(df.isnull().sum())
Output
students    0
Marks       2
dtype: int64
As the DataFrame shows, the students column has no NaN or missing values, while the Marks column has 2.
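If only one column is of interest, the same idea can be applied to a single Series. The sketch below also uses count(), which returns the number of non-missing values per column, as a cross-check; the variable names are chosen only for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "students": ["Raj", "Rahul", "Mayank", "Ajay", "Amar"],
    "Marks": [90, np.nan, 87, np.nan, 19],
})

# Missing values in one selected column
marks_missing = int(df["Marks"].isnull().sum())
print(marks_missing)  # 2

# count() gives the number of non-missing values per column,
# so rows minus count() should equal the missing count.
non_missing = df.count()
missing_from_count = len(df) - non_missing
print(missing_from_count.to_dict())  # {'students': 0, 'Marks': 2}
```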
Count NaN or missing values in Rows
For this, we can iterate over the rows with a for loop and call isnull().sum() on each row to count its NaN or missing values. Let's see this with an example.
import numpy as np
import pandas as pd

students = {'students': ['Raj', 'Rahul', 'Mayank', 'Ajay', 'Amar'],
            'Marks': [90, np.nan, 87, np.nan, 19]}
df = pd.DataFrame(students, columns=['students', 'Marks'])
# Count the missing values one row at a time
for i in range(len(df.index)):
    print("NaN in row", i, ":", df.iloc[i].isnull().sum())
Output
NaN in row 0 : 0
NaN in row 1 : 1
NaN in row 2 : 0
NaN in row 3 : 1
NaN in row 4 : 0
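The same per-row counts can also be obtained without an explicit loop by passing axis=1 to sum(). This is an alternative to the loop above, not a replacement the article prescribes; it is sketched here on the same sample data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "students": ["Raj", "Rahul", "Mayank", "Ajay", "Amar"],
    "Marks": [90, np.nan, 87, np.nan, 19],
})

# axis=1 sums across the columns, giving one count per row
row_missing = df.isnull().sum(axis=1)
print(row_missing.tolist())  # [0, 1, 0, 1, 0]

# Select only the rows that contain at least one missing value
rows_with_missing = df[row_missing > 0]
print(rows_with_missing["students"].tolist())  # ['Rahul', 'Ajay']
```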
Count total NaN or missing values in dataframe
In the two examples above, we saw how to count missing values or NaN per column and per row. Now let's see how to count the total number of missing values in the DataFrame. For this, we simply use isnull().sum().sum(): the first sum() gives the per-column counts and the second adds them up. Let's see this with an example.
import numpy as np
import pandas as pd

students = {'students': ['Raj', 'Rahul', 'Mayank', 'Ajay', 'Amar'],
            'Marks': [90, np.nan, 87, np.nan, 19]}
df = pd.DataFrame(students, columns=['students', 'Marks'])
# First sum() counts per column, second sum() adds the counts
print("Total NaN values:", df.isnull().sum().sum())
Output
Total NaN values: 2
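As a sanity check, the total can be computed in a couple of other ways that should agree with isnull().sum().sum(). The sketch below assumes the same sample data; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "students": ["Raj", "Rahul", "Mayank", "Ajay", "Amar"],
    "Marks": [90, np.nan, 87, np.nan, 19],
})

# Approach from the article: sum per column, then sum the column counts
total_chained = int(df.isnull().sum().sum())

# Sum the Boolean mask as a single NumPy array
total_numpy = int(df.isnull().to_numpy().sum())

# Total cells minus non-missing cells
total_from_count = int(df.size - df.count().sum())

print(total_chained, total_numpy, total_from_count)  # 2 2 2
```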
So these are the methods to count NaN or missing values in DataFrames.