Methods to add two columns into a new column in Dataframe
In this article, we discuss how to add one column to another in a dataframe and how to store the sum of two columns as a new column in the dataframe using pandas. We will also discuss how to deal with NaN values.
Method 1-Sum two columns together to make a new series
In this method, we select two columns by their column names and add them with the + operator. Let us see this with the help of an example.
import pandas as pd
import numpy as np

students = [('Raj', 24, 'Mumbai', 95),
            ('Rahul', 21, 'Delhi', 97),
            ('Aadi', 22, np.nan, 81),
            ('Abhay', 25, 'Rajasthan', 90),
            ('Ajjet', 21, 'Delhi', 74)]

# Create a DataFrame object
df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Marks'])
print("Original Dataframe\n")
print(df, '\n')

# Adding two numeric columns returns a new Series
total = df['Age'] + df['Marks']
print("New Series \n")
print(total)
print(type(total))
Output
Original Dataframe

    Name  Age       City  Marks
0    Raj   24     Mumbai     95
1  Rahul   21      Delhi     97
2   Aadi   22        NaN     81
3  Abhay   25  Rajasthan     90
4  Ajjet   21      Delhi     74

New Series

0    119
1    118
2    103
3    115
4     95
dtype: int64
<class 'pandas.core.series.Series'>
Here we see that adding two columns produces a series.
Note: We can't add a string column to an int or float column. We can only add a string to a string or a number to a number.
Let us see an example of adding a string column to a string column.
import pandas as pd

students = [('Raj', 24, 'Mumbai', 95),
            ('Rahul', 21, 'Delhi', 97),
            ('Aadi', 22, 'Kolkata', 81),
            ('Abhay', 25, 'Rajasthan', 90),
            ('Ajjet', 21, 'Delhi', 74)]

# Create a DataFrame object
df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Marks'])
print("Original Dataframe\n")
print(df, '\n')

# Adding two string columns concatenates them row by row
total = df['Name'] + " " + df['City']
print("New Series \n")
print(total)
print(type(total))
Output
Original Dataframe

    Name  Age       City  Marks
0    Raj   24     Mumbai     95
1  Rahul   21      Delhi     97
2   Aadi   22    Kolkata     81
3  Abhay   25  Rajasthan     90
4  Ajjet   21      Delhi     74

New Series

0         Raj Mumbai
1        Rahul Delhi
2       Aadi Kolkata
3    Abhay Rajasthan
4        Ajjet Delhi
dtype: object
<class 'pandas.core.series.Series'>
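The note above says a string column cannot be added to a numeric column directly. One common workaround, shown here as a minimal sketch on a shortened copy of the sample data, is to convert the numeric column to strings with astype(str) before adding. The variable name label is only illustrative.

```python
import pandas as pd

# Shortened copy of the article's sample data (illustrative only)
df = pd.DataFrame({
    'Name': ['Raj', 'Rahul', 'Aadi'],
    'Age': [24, 21, 22],
})

# df['Name'] + df['Age'] would typically fail with a TypeError (str + int),
# so convert the numeric column to strings first
label = df['Name'] + " " + df['Age'].astype(str)
print(label)
```

Each row of label now holds the name followed by the age as text, for example 'Raj 24'.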
Method 2-Sum two columns together having NaN values to make a new series
In the previous method, the columns we added had no NaN (missing) values, but in this case they do. When we add two columns and either one of them contains NaN in a given row, the result for that row is also NaN. Let us see this with the help of an example.
import pandas as pd
import numpy as np

students = [('Raj', 24, 'Mumbai', 95),
            ('Rahul', 21, 'Delhi', 97),
            ('Aadi', 22, 'Kolkata', np.nan),
            ('Abhay', np.nan, 'Rajasthan', 90),
            ('Ajjet', 21, 'Delhi', 74)]

# Create a DataFrame object
df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Marks'])
print("Original Dataframe\n")
print(df, '\n')

# A NaN in either column makes the sum for that row NaN
total = df['Marks'] + df['Age']
print("New Series \n")
print(total)
print(type(total))
Output
Original Dataframe

    Name   Age       City  Marks
0    Raj  24.0     Mumbai   95.0
1  Rahul  21.0      Delhi   97.0
2   Aadi  22.0    Kolkata    NaN
3  Abhay   NaN  Rajasthan   90.0
4  Ajjet  21.0      Delhi   74.0

New Series

0    119.0
1    118.0
2      NaN
3      NaN
4     95.0
dtype: float64
<class 'pandas.core.series.Series'>
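If a NaN result is not what you want, one possible way to deal with it is Series.add() with the fill_value parameter, which substitutes a value for a missing entry before adding. The sketch below assumes that treating a missing Age or Marks as 0 is acceptable for your data, which is a choice and not something the example above requires.

```python
import numpy as np
import pandas as pd

# Same Age and Marks values as in the example above
df = pd.DataFrame({
    'Age': [24, 21, 22, np.nan, 21],
    'Marks': [95, 97, np.nan, 90, 74],
})

# fill_value=0 replaces a missing value on one side with 0 before adding;
# if both sides were missing, the result would still be NaN
total = df['Marks'].add(df['Age'], fill_value=0)
print(total)
```

With this data, rows 2 and 3 become 22.0 and 90.0 instead of NaN.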
Method 3-Add two columns to make a new column
We know that a dataframe is a collection of series. We saw that adding two columns gives us a series, which we stored in a variable. If we instead assign that series to a new column name in the dataframe, the sum becomes a new column. Let us see this with the help of an example.
import pandas as pd

students = [('Raj', 24, 'Mumbai', 95),
            ('Rahul', 21, 'Delhi', 97),
            ('Aadi', 22, 'Kolkata', 76),
            ('Abhay', 23, 'Rajasthan', 90),
            ('Ajjet', 21, 'Delhi', 74)]

# Create a DataFrame object
df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Marks'])
print("Original Dataframe\n")
print(df, '\n')

# Assign the sum to a new column named 'total'
df['total'] = df['Marks'] + df['Age']
print("New Dataframe \n")
print(df)
Output
Original Dataframe

    Name  Age       City  Marks
0    Raj   24     Mumbai     95
1  Rahul   21      Delhi     97
2   Aadi   22    Kolkata     76
3  Abhay   23  Rajasthan     90
4  Ajjet   21      Delhi     74

New Dataframe

    Name  Age       City  Marks  total
0    Raj   24     Mumbai     95    119
1  Rahul   21      Delhi     97    118
2   Aadi   22    Kolkata     76     98
3  Abhay   23  Rajasthan     90    113
4  Ajjet   21      Delhi     74     95
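Assigning with df['total'] = ... changes the original dataframe. If you would rather leave it untouched, DataFrame.assign() is an alternative that returns a new dataframe containing the extra column. The sketch below uses a shortened copy of the sample data, and the name new_df is only illustrative.

```python
import pandas as pd

# Shortened copy of the article's sample data (illustrative only)
df = pd.DataFrame({
    'Name': ['Raj', 'Rahul', 'Aadi'],
    'Age': [24, 21, 22],
    'Marks': [95, 97, 76],
})

# assign() returns a new DataFrame; df itself is not modified
new_df = df.assign(total=df['Marks'] + df['Age'])
print(new_df)
print(list(df.columns))  # the original still has no 'total' column
```

This form is convenient when chaining several operations, since each step returns a dataframe.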
Method 4-Add two columns with NaN values to make a new column
The same approach works when the columns contain NaN values, but the rows where either value is missing will show NaN in the new column. Let us see this with the help of an example.
import pandas as pd
import numpy as np

students = [('Raj', 24, 'Mumbai', 95),
            ('Rahul', 21, 'Delhi', 97),
            ('Aadi', 22, 'Kolkata', np.nan),
            ('Abhay', np.nan, 'Rajasthan', 90),
            ('Ajjet', 21, 'Delhi', 74)]

# Create a DataFrame object
df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Marks'])
print("Original Dataframe\n")
print(df, '\n')

# Rows with a NaN in either column get NaN in 'total'
df['total'] = df['Marks'] + df['Age']
print("New Dataframe \n")
print(df)
Output
Original Dataframe

    Name   Age       City  Marks
0    Raj  24.0     Mumbai   95.0
1  Rahul  21.0      Delhi   97.0
2   Aadi  22.0    Kolkata    NaN
3  Abhay   NaN  Rajasthan   90.0
4  Ajjet  21.0      Delhi   74.0

New Dataframe

    Name   Age       City  Marks  total
0    Raj  24.0     Mumbai   95.0  119.0
1  Rahul  21.0      Delhi   97.0  118.0
2   Aadi  22.0    Kolkata    NaN    NaN
3  Abhay   NaN  Rajasthan   90.0    NaN
4  Ajjet  21.0      Delhi   74.0   95.0
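When building the new column, another option is a row-wise DataFrame.sum(axis=1) over the selected columns, which skips NaN values by default. The sketch below is one possible approach, not the method used above. The data is made up for illustration, and min_count=1 is added so that a row with no valid values stays NaN instead of becoming 0.

```python
import numpy as np
import pandas as pd

# Made-up data: the last row is missing both values
df = pd.DataFrame({
    'Age': [24, 22, np.nan, np.nan],
    'Marks': [95, np.nan, 90, np.nan],
})

# sum(axis=1) adds across columns and ignores NaN;
# min_count=1 requires at least one non-NaN value per row
df['total'] = df[['Marks', 'Age']].sum(axis=1, min_count=1)
print(df)
```

Here the first three rows get 119.0, 22.0 and 90.0, while the last row remains NaN because it has nothing to add.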
So these are the methods to add two columns of a dataframe, with and without NaN values.