Pandas : Change data type of single or multiple columns of Dataframe in Python

In this article we will see how to change the data type of a single column or of multiple columns of a DataFrame in Python.

Change Data Type of a Single Column :

We will use Series.astype() to change the data type of a column.

Syntax:- Series.astype(self, dtype, copy=True, errors='raise', **kwargs)

Arguments:

  • dtype : The Python type or NumPy dtype to which the whole Series will be converted.
  • errors : How conversion errors are handled. It can be 'raise' or 'ignore', and the default value is 'raise'. (raise- Raise an exception when the conversion is invalid , ignore- Suppress the exception and return the original object unchanged)
  • copy : bool (Default value is True) (If True- Return a copy , If False- pandas may avoid copying the data, so be careful: changes to the values may then propagate to other pandas objects)

Returns: A new Series with the updated data type. astype() does not modify the original Series in place, so the result has to be assigned back.
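
As a minimal sketch of the return behaviour described above, the snippet below converts a small Series and checks that the original keeps its data type. The Series `ages` and its values are made up for illustration and are not part of the student data used in the rest of the article.

```python
import pandas as pd

# Hypothetical sample data; the dtype is set explicitly so the starting point is unambiguous
ages = pd.Series([34, 25, 26], name="Age", dtype="int64")

# astype() returns a new Series with the requested dtype
ages_float = ages.astype("float64")

print(ages.dtype)        # dtype of the original Series
print(ages_float.dtype)  # dtype of the converted Series
```

Because the conversion is not applied in place, the usual pattern is to assign the result back, as the examples in this article do.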

import pandas as pd
# List of tuples
students = [('Rohit', 34, 'Swimming', 155),
            ('Ritik', 25, 'Cricket', 179),
            ('Salim', 26, 'Music', 187),
            ('Rani', 29, 'Sleeping', 154),
            ('Sonu', 17, 'Singing', 184),
            ('Madhu', 20, 'Travelling', 165),
            ('Devi', 22, 'Art', 141)
            ]
# Create a DataFrame whose columns have different data types
studObj = pd.DataFrame(students, columns=['Name', 'Age', 'Hobby', 'Height'])
print(studObj)
print(studObj.dtypes)
Output :
    Name  Age       Hobby  Height
0  Rohit   34    Swimming     155
1  Ritik   25     Cricket     179
2  Salim   26       Music     187
3   Rani   29    Sleeping     154
4   Sonu   17     Singing     184
5  Madhu   20  Travelling     165
6   Devi   22         Art     141
Name      object
Age        int64
Hobby     object
Height     int64
dtype: object

Change data type of a column from int64 to float64 :

We can change the data type of a single column. Let’s change the data type of the ‘Age’ column from int64 to float64. For this we pass 'float64' to astype() and assign the result back to the column, so that the change is reflected in the DataFrame.

import pandas as pd
# List of tuples
students = [('Rohit', 34, 'Swimming', 155),
            ('Ritik', 25, 'Cricket', 179),
            ('Salim', 26, 'Music', 187),
            ('Rani', 29, 'Sleeping', 154),
            ('Sonu', 17, 'Singing', 184),
            ('Madhu', 20, 'Travelling', 165),
            ('Devi', 22, 'Art', 141)
            ]
# Create a DataFrame whose columns have different data types
studObj = pd.DataFrame(students, columns=['Name', 'Age', 'Hobby', 'Height'])
# Change data type of column 'Age' to float64
studObj['Age'] = studObj['Age'].astype('float64')
print(studObj)
print(studObj.dtypes)
Output :
    Name   Age       Hobby  Height
0  Rohit  34.0    Swimming     155
1  Ritik  25.0     Cricket     179
2  Salim  26.0       Music     187
3   Rani  29.0    Sleeping     154
4   Sonu  17.0     Singing     184
5  Madhu  20.0  Travelling     165
6   Devi  22.0         Art     141
Name       object
Age       float64
Hobby      object
Height      int64
dtype: object

Change data type of a column from int64 to string :

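Let’s change the data type of the ‘Height’ column to object, the dtype pandas traditionally uses to hold strings. Because the copy argument of astype() is True by default, it returns a copy of the Series with the new data type, which we assign back to studObj['Height']. Note that astype('object') stores each value as a Python int inside an object column; to convert the values themselves to text, pass str instead.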

import pandas as pd
# List of tuples
students = [('Rohit', 34, 'Swimming', 155),
            ('Ritik', 25, 'Cricket', 179),
            ('Salim', 26, 'Music', 187),
            ('Rani', 29, 'Sleeping', 154),
            ('Sonu', 17, 'Singing', 184),
            ('Madhu', 20, 'Travelling', 165),
            ('Devi', 22, 'Art', 141)
            ]
studObj = pd.DataFrame(students, columns=['Name', 'Age', 'Hobby', 'Height'])
# Change data type of column 'Age' from int64 to float64
studObj['Age'] = studObj['Age'].astype('float64')
# Change data type of column 'Height' from int64 to object
studObj['Height'] = studObj['Height'].astype('object')
print(studObj)
print(studObj.dtypes)
Output :
    Name   Age       Hobby Height
0  Rohit  34.0    Swimming    155
1  Ritik  25.0     Cricket    179
2  Salim  26.0       Music    187
3   Rani  29.0    Sleeping    154
4   Sonu  17.0     Singing    184
5  Madhu  20.0  Travelling    165
6   Devi  22.0         Art    141
Name       object
Age       float64
Hobby      object
Height     object
dtype: object
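
The difference between an object column and a column of actual strings can be checked with a short sketch. The Series `heights` below is hypothetical sample data, and the comparison of `astype('object')` with `astype(str)` is an illustration added here rather than part of the original example.

```python
import pandas as pd

# Hypothetical sample data with an explicit integer dtype
heights = pd.Series([155, 179, 187], name="Height", dtype="int64")

# Casting to object keeps each element as a Python int
as_object = heights.astype("object")

# Casting with str converts each element to its text form
as_text = heights.astype(str)

print(type(as_object.iloc[0]))
print(type(as_text.iloc[0]))
print(as_text.iloc[0] + " cm")  # string concatenation works on the converted values
```

If the goal is to use string methods on the values, converting with str is likely the better fit than 'object'.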

Change Data Type of Multiple Columns in Dataframe :

To change the data type of multiple columns in a DataFrame we will use DataFrame.astype(), which can be applied to the whole DataFrame or to selected columns.

Syntax:- DataFrame.astype(self, dtype, copy=True, errors='raise', **kwargs)

Arguments:

  • dtype : The Python type or NumPy dtype to which the whole DataFrame will be converted, or a dictionary of column names and data types, in which case each given column is converted to its corresponding type.
  • errors : How conversion errors are handled. It can be 'raise' or 'ignore', and the default value is 'raise'.
      • raise : Raise an exception when the conversion is invalid
      • ignore : Suppress the exception and return the original object unchanged
  • copy : bool (Default value is True) (If True- Return a copy , If False- pandas may avoid copying the data, so be careful: changes to the values may then propagate to other pandas objects)

Returns: A new DataFrame with the updated data types. The original DataFrame is not modified in place.
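
The two forms of the dtype argument can be sketched as follows. The DataFrame `scores` is hypothetical and contains only numeric columns, so that converting the whole frame to a single type is valid.

```python
import pandas as pd

# Hypothetical numeric-only data, created with an explicit integer dtype
scores = pd.DataFrame({"Age": [34, 25, 26], "Height": [155, 179, 187]}, dtype="int64")

# A single dtype converts every column of the DataFrame
all_float = scores.astype("float64")

# A dictionary converts only the listed columns and leaves the rest untouched
only_age = scores.astype({"Age": "float64"})

print(all_float.dtypes)
print(only_age.dtypes)
```

Passing a single dtype to a frame that also holds text columns such as names would fail for a numeric target, which is why the dictionary form is usually the safer choice for mixed data.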

Change Data Type of two Columns at same time :

Let’s convert the columns ‘Age’ and ‘Height’ from int64 to float64 and object respectively. We will pass a dictionary to DataFrame.astype() that contains column names as keys and the new data types as values.

import pandas as pd
# List of tuples
students = [('Rohit', 34, 'Swimming', 155),
            ('Ritik', 25, 'Cricket', 179),
            ('Salim', 26, 'Music', 187),
            ('Rani', 29, 'Sleeping', 154),
            ('Sonu', 17, 'Singing', 184),
            ('Madhu', 20, 'Travelling', 165),
            ('Devi', 22, 'Art', 141)
            ]
# Create a DataFrame whose columns have different data types
studObj = pd.DataFrame(students, columns=['Name', 'Age', 'Hobby', 'Height'])
# Convert the data type of column 'Age' to float64 and column 'Height' to object
studObj = studObj.astype({'Age': 'float64', 'Height': 'object'})
print(studObj)
print(studObj.dtypes)
Output :
    Name   Age       Hobby Height
0  Rohit  34.0    Swimming    155
1  Ritik  25.0     Cricket    179
2  Salim  26.0       Music    187
3   Rani  29.0    Sleeping    154
4   Sonu  17.0     Singing    184
5  Madhu  20.0  Travelling    165
6   Devi  22.0         Art    141
Name       object
Age       float64
Hobby      object
Height     object
dtype: object

Handle errors while converting Data Types of Columns :

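When using astype() to convert one or more columns, both the target data type and the contents of the column must be valid for the conversion. Otherwise an exception is raised, for example a TypeError when the requested data type is not recognised.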

import pandas as pd
# List of tuples
students = [('Rohit', 34, 'Swimming', 155),
            ('Ritik', 25, 'Cricket', 179),
            ('Salim', 26, 'Music', 187),
            ('Rani', 29, 'Sleeping', 154),
            ('Sonu', 17, 'Singing', 184),
            ('Madhu', 20, 'Travelling', 165),
            ('Devi', 22, 'Art', 141)
            ]
# Create a DataFrame whose columns have different data types
studObj = pd.DataFrame(students, columns=['Name', 'Age', 'Hobby', 'Height'])
# Try to change the data type of a column to an unknown data type
try:
    studObj['Name'] = studObj['Name'].astype('xyz')
except TypeError as ex:
    print(ex)

Output :
data type "xyz" not understood
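
The example above fails because the data type name is unknown. A second kind of failure, where the data type is valid but the column contents cannot be cast, can be sketched as below. The Series `raw` is hypothetical sample data, and the use of pandas.to_numeric() with errors='coerce' is an alternative technique added here for illustration, not something astype() does itself.

```python
import pandas as pd

# Hypothetical column in which one entry is not a number
raw = pd.Series(["34", "25", "unknown"], name="Age")

# astype() with the default errors='raise' rejects the whole conversion
conversion_failed = False
try:
    raw.astype("int64")
except ValueError as ex:
    conversion_failed = True
    print("astype failed:", ex)

# Alternative (assumption: missing values are acceptable in the result):
# to_numeric() with errors='coerce' turns unparseable entries into NaN
coerced = pd.to_numeric(raw, errors="coerce")
print(coerced)
```

Coercing trades an exception for missing values, so it is worth checking how many entries became NaN before relying on the result.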
