Pandas: DataFrame.fillna()

Using DataFrame.fillna() in Python

In this article, we will discuss how to use the DataFrame.fillna() method with examples, such as replacing NaN values in an entire DataFrame or only in specific rows or columns.

DataFrame.fillna()

DataFrame.fillna() fills the NaN values in a DataFrame with other values. It is most useful when a column has only a few NaN values: instead of dropping the whole column, we replace its missing values with substitutes.
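
Before choosing between dropping and filling, it helps to know how many values are actually missing. The following is a minimal sketch on a small made-up DataFrame (the data is an assumption, used only for illustration) that counts the NaN values in each column with isna().

```python
import numpy as np
import pandas as pd

# Hypothetical data, used only for illustration
df = pd.DataFrame({
    'Age': [24, np.nan, 22, np.nan],
    'Marks': [95, 97, np.nan, 74],
})

# Number of NaN values in each column
missing_counts = df.isna().sum()
print(missing_counts)

# Fraction of NaN values in each column (between 0.0 and 1.0)
missing_share = df.isna().mean()
print(missing_share)
```

A column with a small share of missing values is usually a good candidate for fillna().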

Syntax: DataFrame.fillna(value=None, method=None, axis=None, inplace=False, limit=None, downcast=None)

Parameters

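1) value: The value used to replace NaN. It can be a single value, or a dictionary or Series that gives a value for each column. By default value is None.

2) method: The fill strategy to use when value is not passed, such as 'ffill' (carry the last valid value forward) or 'bfill'/'backfill' (carry the next valid value backward). By default method is None. Recent pandas releases deprecate this parameter in favor of the DataFrame.ffill() and DataFrame.bfill() methods.

3) axis: The axis along which to fill. axis=0 fills down each column and axis=1 fills across each row.

4) inplace: A boolean. If True, the changes are made in the DataFrame itself; otherwise a new DataFrame is returned and the original is left unchanged. By default inplace is False.

5) limit: The maximum number of NaN values to fill along the axis (with a fill method, the maximum number of consecutive NaN values). By default limit is None, which means no limit.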
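
The fill strategies and the axis parameter are easier to follow on a small example. The sketch below uses a made-up DataFrame (an assumption for illustration) and the dedicated ffill() and bfill() methods, which apply the same forward and backward strategies as the method parameter.

```python
import numpy as np
import pandas as pd

# Hypothetical data, used only for illustration
df = pd.DataFrame({
    'A': [1.0, np.nan, np.nan, 4.0],
    'B': [np.nan, 2.0, np.nan, np.nan],
})

# Forward fill down each column (axis=0 is the default)
filled_down = df.ffill()

# Backward fill down each column
filled_up = df.bfill()

# Forward fill across each row, from column A into column B
filled_across = df.ffill(axis=1)

# limit=1 fills at most one consecutive NaN after each valid value
limited = df.ffill(limit=1)

print(filled_down, filled_up, filled_across, limited, sep='\n\n')
```

Note that a NaN with no valid value before it (for ffill) or after it (for bfill) stays NaN.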

Different ways to use DataFrame.fillna()

  • Method 1: Replace all NaN values in a DataFrame

In this method, we pass a value as the value parameter, and every NaN in the DataFrame is replaced with it. Let's see this with the help of an example.

import pandas as pd
import numpy as np

# np.nan marks a missing value (the np.NaN alias was removed in NumPy 2.0)
students = [('Raj', 24, 95),
            ('Rahul', np.nan, 97),
            ('Aadi', 22, 81),
            ('Abhay', np.nan, np.nan),
            ('Ajjet', 21, 74),
            ('Amar', np.nan, np.nan),
            ('Aman', np.nan, 76)]
# Create a DataFrame object
df = pd.DataFrame(students,
                  columns=['Name', 'Age', 'Marks'])
print("Original Dataframe\n")
print(df, '\n')
# Replace every NaN with 0; fillna() returns a new DataFrame
new_df = df.fillna(0)
print("New Dataframe\n")
print(new_df)

Output

Original Dataframe

    Name   Age  Marks
0    Raj  24.0   95.0
1  Rahul   NaN   97.0
2   Aadi  22.0   81.0
3  Abhay   NaN    NaN
4  Ajjet  21.0   74.0
5   Amar   NaN    NaN
6   Aman   NaN   76.0 

New Dataframe

    Name   Age  Marks
0    Raj  24.0   95.0
1  Rahul   0.0   97.0
2   Aadi  22.0   81.0
3  Abhay   0.0    0.0
4  Ajjet  21.0   74.0
5   Amar   0.0    0.0
6   Aman   0.0   76.0

Here we see that all NaN values have been replaced with 0.

  • Method 2: Replace all NaN values in specific columns

In this method, we replace NaN values only in specific columns, not in the whole DataFrame.

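import pandas as pd
import numpy as np

# np.nan marks a missing value (the np.NaN alias was removed in NumPy 2.0)
students = [('Raj', 24, 95),
            ('Rahul', np.nan, 97),
            ('Aadi', 22, 81),
            ('Abhay', np.nan, np.nan),
            ('Ajjet', 21, 74),
            ('Amar', np.nan, np.nan),
            ('Aman', np.nan, 76)]
# Create a DataFrame object
df = pd.DataFrame(students,
                  columns=['Name', 'Age', 'Marks'])
print("Original Dataframe\n")
print(df, '\n')
# Fill NaN only in the Age column; the dictionary maps column names to fill values.
# Calling fillna(..., inplace=True) on df['Age'] instead is chained assignment
# and may not update df in recent pandas versions.
df.fillna({'Age': 0}, inplace=True)
print("New Dataframe\n")
print(df)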

Output

Original Dataframe

    Name   Age  Marks
0    Raj  24.0   95.0
1  Rahul   NaN   97.0
2   Aadi  22.0   81.0
3  Abhay   NaN    NaN
4  Ajjet  21.0   74.0
5   Amar   NaN    NaN
6   Aman   NaN   76.0 

New Dataframe

    Name   Age  Marks
0    Raj  24.0   95.0
1  Rahul   0.0   97.0
2   Aadi  22.0   81.0
3  Abhay   0.0    NaN
4  Ajjet  21.0   74.0
5   Amar   0.0    NaN
6   Aman   0.0   76.0

Here we see that NaN values are replaced with 0 only in the Age column. We use inplace=True because we want the changes to be made in the original DataFrame.
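
When several columns each need their own fill value, fillna() accepts a dictionary that maps column names to values. The following is a minimal sketch on a shortened version of the student data; filling Marks with the column mean is an assumption chosen for illustration, not a recommendation.

```python
import numpy as np
import pandas as pd

students = [('Raj', 24, 95),
            ('Rahul', np.nan, 97),
            ('Abhay', np.nan, np.nan)]
df = pd.DataFrame(students, columns=['Name', 'Age', 'Marks'])

# Fill Age with 0 and Marks with the mean of the non-missing marks
fill_values = {'Age': 0, 'Marks': df['Marks'].mean()}
new_df = df.fillna(fill_values)
print(new_df)
```

Columns that are not listed in the dictionary, such as Name here, are left untouched, and df itself is unchanged because inplace is not set.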

  • Method 3: Replace NaN values of one column with values of another column

Here we pass, as the value parameter, the column whose values we want to copy. Let's see this with the help of an example.

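import pandas as pd
import numpy as np

# np.nan marks a missing value (the np.NaN alias was removed in NumPy 2.0)
students = [('Raj', 24, 95),
            ('Rahul', np.nan, 97),
            ('Aadi', 22, 81),
            ('Abhay', np.nan, 87),
            ('Ajjet', 21, 74),
            ('Amar', np.nan, 76),
            ('Aman', np.nan, 76)]
# Create a DataFrame object
df = pd.DataFrame(students,
                  columns=['Name', 'Age', 'Marks'])
print("Original Dataframe\n")
print(df, '\n')
# Fill each NaN in Age with the Marks value from the same row.
# Assigning the result back avoids chained inplace modification,
# which may not update df in recent pandas versions.
df['Age'] = df['Age'].fillna(value=df['Marks'])
print("New Dataframe\n")
print(df)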

Output

Original Dataframe

    Name   Age  Marks
0    Raj  24.0     95
1  Rahul   NaN     97
2   Aadi  22.0     81
3  Abhay   NaN     87
4  Ajjet  21.0     74
5   Amar   NaN     76
6   Aman   NaN     76 

New Dataframe

    Name   Age  Marks
0    Raj  24.0     95
1  Rahul  97.0     97
2   Aadi  22.0     81
3  Abhay  87.0     87
4  Ajjet  21.0     74
5   Amar  76.0     76
6   Aman  76.0     76

Here we see that the NaN values of the Age column are replaced with the corresponding values from the Marks column.
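
When a Series is passed as the fill value, pandas matches it to the target by index label, not by position. The sketch below uses two made-up Series (the names and numbers are assumptions for illustration) to show what happens when the labels are in a different order or missing.

```python
import numpy as np
import pandas as pd

# Hypothetical data, used only for illustration
ages = pd.Series([24, np.nan, np.nan], index=['Raj', 'Rahul', 'Abhay'])

# The fallback values are in a different order and have no entry for 'Abhay'
fallback = pd.Series([30, 25], index=['Rahul', 'Raj'])

filled = ages.fillna(fallback)
print(filled)
```

Rahul receives the value labeled 'Rahul', Raj keeps his original age, and Abhay stays NaN because the fallback Series has no matching label.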

  • Method 4: Replace NaN values in specific rows

To replace NaN values in a row, we use .loc[index_label] to access that row of the DataFrame and then call fillna() on it. Let's see this with the help of an example.

import pandas as pd
import numpy as np

# np.nan marks a missing value (the np.NaN alias was removed in NumPy 2.0)
students = [('Raj', 24, 95),
            ('Rahul', np.nan, 97),
            ('Aadi', 22, 81),
            ('Abhay', np.nan, 87),
            ('Ajjet', 21, 74),
            ('Amar', np.nan, 76),
            ('Aman', np.nan, 76)]
# Create a DataFrame object
df = pd.DataFrame(students,
                  columns=['Name', 'Age', 'Marks'])
print("Original Dataframe\n")
print(df, '\n')
# Select the row with index label 1, fill its NaN values, and write it back
df.loc[1] = df.loc[1].fillna(value=0)
print("New Dataframe\n")
print(df)

Output

Original Dataframe

    Name   Age  Marks
0    Raj  24.0     95
1  Rahul   NaN     97
2   Aadi  22.0     81
3  Abhay   NaN     87
4  Ajjet  21.0     74
5   Amar   NaN     76
6   Aman   NaN     76 

New Dataframe

    Name   Age  Marks
0    Raj  24.0     95
1  Rahul   0.0     97
2   Aadi  22.0     81
3  Abhay   NaN     87
4  Ajjet  21.0     74
5   Amar   NaN     76
6   Aman   NaN     76
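
The same idea extends to a range of rows. The following is a minimal sketch, assuming a shortened version of the student data, that fills NaN values in the Age column for rows 1 through 3 only.

```python
import numpy as np
import pandas as pd

students = [('Raj', 24, 95),
            ('Rahul', np.nan, 97),
            ('Aadi', 22, 81),
            ('Abhay', np.nan, 87),
            ('Amar', np.nan, 76)]
df = pd.DataFrame(students, columns=['Name', 'Age', 'Marks'])

# .loc slices include both end labels, so 1:3 selects rows 1, 2 and 3
df.loc[1:3, 'Age'] = df.loc[1:3, 'Age'].fillna(0)
print(df)
```

Row 4 lies outside the selected range, so its Age value remains NaN.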

So these are some of the ways to use DataFrame.fillna().