Dataframe.fillna() in Dataframes using Python
In this article, we will discuss how to use the DataFrame.fillna() method, with examples that replace NaN values in an entire DataFrame or only in specific rows or columns.
Dataframe.fillna()
DataFrame.fillna() fills the NaN values in a DataFrame with other values. It is most useful when a column has only a few NaN values: instead of dropping the whole column, we replace its missing values with something else.
Syntax: DataFrame.fillna(value=None, method=None, axis=None, inplace=False, limit=None, downcast=None)
Parameters
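1) value: The value used to replace NaN. It can be a scalar, or a dict, Series, or DataFrame that specifies a value per column or index label. By default value is None.
2) method: Used when value is not passed. 'ffill' (or 'pad') propagates the last valid value forward, and 'bfill' (or 'backfill') uses the next valid value to fill backward. By default method is None. This parameter is deprecated since pandas 2.1 in favor of the ffill() and bfill() methods.
3) axis: The axis along which to fill. axis=0 (or 'index') fills down each column, and axis=1 (or 'columns') fills across each row.
4) inplace: A boolean. If True, the DataFrame itself is modified and None is returned; if False (the default), a new DataFrame is returned.
5) limit: The maximum number of NaN values to fill. By default limit is None, which means no limit.
6) downcast: A dict of column -> dtype describing what to downcast, where possible. By default downcast is None.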
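The parameters above can be illustrated with a minimal sketch. The column names "a" and "b" and their values are hypothetical and are not taken from the article's data. Because the method parameter is deprecated in recent pandas versions, the sketch uses DataFrame.ffill() for forward filling.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, chosen only to show the parameters.
df = pd.DataFrame({"a": [1.0, np.nan, np.nan, 4.0],
                   "b": [np.nan, 2.0, np.nan, np.nan]})

# value: replace every NaN with a constant; a new DataFrame is returned.
filled_const = df.fillna(value=0)

# Forward fill down each column (axis=0 is the default).
filled_forward = df.ffill()

# limit: forward fill at most one consecutive NaN per gap.
filled_limited = df.ffill(limit=1)

# axis=1: forward fill across each row, from column "a" into column "b".
filled_across = df.ffill(axis=1)

print(filled_forward)
print(filled_across)
```

Note that a leading NaN (such as the first value of column "b") has nothing before it, so a forward fill leaves it as NaN.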
Different ways to use DataFrame.fillna()
Method 1: Replace all NaN values in Dataframe
In this method, we pass a value in the value parameter, and every NaN value in the DataFrame is replaced with that value. Let's see this with the help of an example.
import pandas as pd
import numpy as np

# np.nan is used instead of np.NaN, which was removed in NumPy 2.0
students = [('Raj', 24, 95),
            ('Rahul', np.nan, 97),
            ('Aadi', 22, 81),
            ('Abhay', np.nan, np.nan),
            ('Ajjet', 21, 74),
            ('Amar', np.nan, np.nan),
            ('Aman', np.nan, 76)]

# Create a DataFrame object
df = pd.DataFrame(students, columns=['Name', 'Age', 'Marks'])
print("Original Dataframe\n")
print(df, '\n')

# Replace every NaN with 0; fillna() returns a new DataFrame
new_df = df.fillna(0)
print("New Dataframe\n")
print(new_df)
Output
Original Dataframe

    Name   Age  Marks
0    Raj  24.0   95.0
1  Rahul   NaN   97.0
2   Aadi  22.0   81.0
3  Abhay   NaN    NaN
4  Ajjet  21.0   74.0
5   Amar   NaN    NaN
6   Aman   NaN   76.0

New Dataframe

    Name   Age  Marks
0    Raj  24.0   95.0
1  Rahul   0.0   97.0
2   Aadi  22.0   81.0
3  Abhay   0.0    0.0
4  Ajjet  21.0   74.0
5   Amar   0.0    0.0
6   Aman   0.0   76.0
Here we see that every NaN value has been replaced with 0.
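To make the difference between the default behavior and inplace=True concrete, here is a small sketch on a hypothetical three-row DataFrame (not the article's data). It counts the NaN values before and after each call.

```python
import numpy as np
import pandas as pd

# Hypothetical data with two NaN values.
df = pd.DataFrame({"Age": [24, np.nan, 22], "Marks": [95, 97, np.nan]})

nan_before = int(df.isna().sum().sum())       # NaN count before filling

new_df = df.fillna(0)                         # returns a filled copy
nan_after_copy = int(df.isna().sum().sum())   # the original is unchanged

result = df.fillna(0, inplace=True)           # modifies df itself, returns None
nan_after_inplace = int(df.isna().sum().sum())

print(nan_before, nan_after_copy, nan_after_inplace)  # prints: 2 2 0
```

The default form is usually easier to reason about, since the original DataFrame stays available for comparison.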
Method 2: Replace all NaN values in specific columns
In this method, we replace NaN values only in specific columns rather than in the whole DataFrame.
import pandas as pd
import numpy as np

students = [('Raj', 24, 95),
            ('Rahul', np.nan, 97),
            ('Aadi', 22, 81),
            ('Abhay', np.nan, np.nan),
            ('Ajjet', 21, 74),
            ('Amar', np.nan, np.nan),
            ('Aman', np.nan, 76)]

# Create a DataFrame object
df = pd.DataFrame(students, columns=['Name', 'Age', 'Marks'])
print("Original Dataframe\n")
print(df, '\n')

# Fill NaN in the Age column only and assign the result back to that column
df['Age'] = df['Age'].fillna(0)
print("New Dataframe\n")
print(df)
Output
Original Dataframe

    Name   Age  Marks
0    Raj  24.0   95.0
1  Rahul   NaN   97.0
2   Aadi  22.0   81.0
3  Abhay   NaN    NaN
4  Ajjet  21.0   74.0
5   Amar   NaN    NaN
6   Aman   NaN   76.0

New Dataframe

    Name   Age  Marks
0    Raj  24.0   95.0
1  Rahul   0.0   97.0
2   Aadi  22.0   81.0
3  Abhay   0.0    NaN
4  Ajjet  21.0   74.0
5   Amar   0.0    NaN
6   Aman   0.0   76.0
Here we see that only the NaN values in the Age column are replaced with 0; the Marks column is unchanged. We assign the result back to df['Age'] because we want the change to be made in the original DataFrame. Calling df['Age'].fillna(0, inplace=True) on the selected column is not reliable in recent pandas versions, since the selected column may be a copy.
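Several columns can also be handled in a single call by passing a dict that maps column names to fill values. The sketch below is one possible way to do this; the three-row sample and the choice of the column mean for Marks are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical, shortened version of the students data.
students = [("Raj", 24, 95), ("Rahul", np.nan, 97), ("Abhay", np.nan, np.nan)]
df = pd.DataFrame(students, columns=["Name", "Age", "Marks"])

# Map each column to its own fill value; columns not listed are left alone.
fill_values = {"Age": 0, "Marks": df["Marks"].mean()}  # mean of 95 and 97 is 96.0
new_df = df.fillna(value=fill_values)

print(new_df)
```

Because the dict form returns a new DataFrame, it does not depend on modifying a selected column in place.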
Method 3: Replace NaN values of one column with values of other columns
Here we pass another column as the value parameter, so each NaN is replaced with the value from that column in the same row. Let's see this with the help of an example.
import pandas as pd
import numpy as np

students = [('Raj', 24, 95),
            ('Rahul', np.nan, 97),
            ('Aadi', 22, 81),
            ('Abhay', np.nan, 87),
            ('Ajjet', 21, 74),
            ('Amar', np.nan, 76),
            ('Aman', np.nan, 76)]

# Create a DataFrame object
df = pd.DataFrame(students, columns=['Name', 'Age', 'Marks'])
print("Original Dataframe\n")
print(df, '\n')

# Fill NaN in Age with the Marks value of the same row, then assign back
df['Age'] = df['Age'].fillna(value=df['Marks'])
print("New Dataframe\n")
print(df)
Output
Original Dataframe

    Name   Age  Marks
0    Raj  24.0     95
1  Rahul   NaN     97
2   Aadi  22.0     81
3  Abhay   NaN     87
4  Ajjet  21.0     74
5   Amar   NaN     76
6   Aman   NaN     76

New Dataframe

    Name   Age  Marks
0    Raj  24.0     95
1  Rahul  97.0     97
2   Aadi  22.0     81
3  Abhay  87.0     87
4  Ajjet  21.0     74
5   Amar  76.0     76
6   Aman  76.0     76
Here we see that the NaN values in the Age column are replaced with the corresponding values from the Marks column.
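When a Series is passed as the fill value, pandas matches the values by index label rather than by position. The following sketch uses two hypothetical Series (not the article's data) to show this, including a label that has no match.

```python
import numpy as np
import pandas as pd

# Hypothetical Series; "marks" is in a different order and lacks label 2.
age = pd.Series([24, np.nan, np.nan], index=[0, 1, 2])
marks = pd.Series([97, 95], index=[1, 0])

# Label 1 is filled from marks[1]; label 2 has no match and stays NaN.
filled = age.fillna(marks)

print(filled)
```

In the article's example both columns come from the same DataFrame, so they share the same index and every NaN finds a match.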
Method 4: Replace NaN values in specific rows
To replace NaN values in a row, we use .loc[index_label] to select that row, call fillna() on it, and assign the result back to the same row. Let's see this with the help of an example.
import pandas as pd
import numpy as np

students = [('Raj', 24, 95),
            ('Rahul', np.nan, 97),
            ('Aadi', 22, 81),
            ('Abhay', np.nan, 87),
            ('Ajjet', 21, 74),
            ('Amar', np.nan, 76),
            ('Aman', np.nan, 76)]

# Create a DataFrame object
df = pd.DataFrame(students, columns=['Name', 'Age', 'Marks'])
print("Original Dataframe\n")
print(df, '\n')

# Select the row with index label 1, fill its NaN values, and write it back
df.loc[1] = df.loc[1].fillna(value=0)
print("New Dataframe\n")
print(df)
Output
Original Dataframe

    Name   Age  Marks
0    Raj  24.0     95
1  Rahul   NaN     97
2   Aadi  22.0     81
3  Abhay   NaN     87
4  Ajjet  21.0     74
5   Amar   NaN     76
6   Aman   NaN     76

New Dataframe

    Name   Age  Marks
0    Raj  24.0     95
1  Rahul   0.0     97
2   Aadi  22.0     81
3  Abhay   NaN     87
4  Ajjet  21.0     74
5   Amar   NaN     76
6   Aman   NaN     76
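The same idea can be extended to several rows, or restricted to particular columns, by passing lists to .loc. This is a sketch under assumptions: the names rows and numeric_cols are hypothetical helpers, and the four-row sample is a shortened version of the data above.

```python
import numpy as np
import pandas as pd

students = [("Raj", 24, 95), ("Rahul", np.nan, 97),
            ("Aadi", 22, 81), ("Abhay", np.nan, 87)]
df = pd.DataFrame(students, columns=["Name", "Age", "Marks"])

rows = [1]                        # index labels of the rows to fill
numeric_cols = ["Age", "Marks"]   # columns in which NaN should be replaced

# Fill NaN only inside the selected rows and columns, then write them back.
df.loc[rows, numeric_cols] = df.loc[rows, numeric_cols].fillna(0)

print(df)
```

Row 3 is not in the rows list, so its NaN in the Age column is left as it is.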
So these are some of the ways to use Dataframe.fillna().