How to get and check the data types of DataFrame columns in Python pandas
DataFrame data types: In this article we will discuss different ways to get the data type of a single column or of multiple columns in a pandas DataFrame.
Use DataFrame.dtypes to get data types of columns in a DataFrame :
Python's pandas module provides the DataFrame class as a container for storing and manipulating two-dimensional data. The class has an attribute that reports the data type of each column.
This attribute, DataFrame.dtypes, returns a Series that holds the data type of each column, indexed by column name.
Let’s try with an example:
# Program :
import pandas as pd
import numpy as np

# list of tuples
game = [('riya', 37, 'delhi', 'cat', 'rose'),
        ('anjali', 28, 'agra', 'dog', 'lily'),
        ('tia', 42, 'jaipur', 'elephant', 'lotus'),
        ('kapil', 51, 'patna', 'cow', 'tulip'),
        ('raj', 30, 'banglore', 'lion', 'orchid')]

# Create a dataframe object
df = pd.DataFrame(game,
                  columns=['Name', 'Age', 'Place', 'Animal', 'Flower'],
                  index=['a', 'b', 'c', 'd', 'e'])
print(df)
Output:
     Name  Age     Place    Animal  Flower
a    riya   37     delhi       cat    rose
b  anjali   28      agra       dog    lily
c     tia   42    jaipur  elephant   lotus
d   kapil   51     patna       cow   tulip
e     raj   30  banglore      lion  orchid
These are the contents of the DataFrame. Now let's fetch the data type of each column.
# Program :
import pandas as pd
import numpy as np

# list of tuples
game = [('riya', 37, 'delhi', 'cat', 'rose'),
        ('anjali', 28, 'agra', 'dog', 'lily'),
        ('tia', 42, 'jaipur', 'elephant', 'lotus'),
        ('kapil', 51, 'patna', 'cow', 'tulip'),
        ('raj', 30, 'banglore', 'lion', 'orchid')]

# Create a dataframe object
df = pd.DataFrame(game,
                  columns=['Name', 'Age', 'Place', 'Animal', 'Flower'],
                  index=['a', 'b', 'c', 'd', 'e'])

# Series holding the data type of each column
DataType = df.dtypes
print('Data type of each column:')
print(DataType)
Output:
Data type of each column:
Name      object
Age        int64
Place     object
Animal    object
Flower    object
dtype: object
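Besides reading the data types, it is often useful to check whether a column holds a particular kind of data. The following is a minimal sketch, not taken from the original program, that uses the helper functions in pandas.api.types. The shortened sample data and the variable names (age_is_integer and so on) are illustrative assumptions.

```python
import pandas as pd
from pandas.api.types import is_integer_dtype, is_numeric_dtype, is_string_dtype

# Shortened version of the sample data used in this article
game = [('riya', 37, 'delhi'),
        ('anjali', 28, 'agra'),
        ('tia', 42, 'jaipur')]
df = pd.DataFrame(game, columns=['Name', 'Age', 'Place'], index=['a', 'b', 'c'])

# Each helper accepts a Series (or a dtype) and returns True or False
age_is_integer = is_integer_dtype(df['Age'])
name_is_numeric = is_numeric_dtype(df['Name'])
name_is_string = is_string_dtype(df['Name'])

print('Age is integer  :', age_is_integer)
print('Name is numeric :', name_is_numeric)
print('Name is string  :', name_is_string)
```

With this data the three checks should report True, False and True respectively.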
Get data types of DataFrame columns as a dictionary :
# Program :
import pandas as pd
import numpy as np

# list of tuples
game = [('riya', 37, 'delhi', 'cat', 'rose'),
        ('anjali', 28, 'agra', 'dog', 'lily'),
        ('tia', 42, 'jaipur', 'elephant', 'lotus'),
        ('kapil', 51, 'patna', 'cow', 'tulip'),
        ('raj', 30, 'banglore', 'lion', 'orchid')]

# Create a dataframe object
df = pd.DataFrame(game,
                  columns=['Name', 'Age', 'Place', 'Animal', 'Flower'],
                  index=['a', 'b', 'c', 'd', 'e'])

# Get a dictionary that maps each column name to its data type object
DataTypeDict = dict(df.dtypes)
print('Data type of each column :')
print(DataTypeDict)
Output:
Data type of each column :
{'Name': dtype('O'), 'Age': dtype('int64'), 'Place': dtype('O'), 'Animal': dtype('O'), 'Flower': dtype('O')}
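The dictionary above stores dtype objects, which print as dtype('O') for object columns. If plain type names are easier to work with, one possible variation (a sketch, not part of the original program) is to convert each dtype to its string name. The shortened sample data and the name dtype_names are assumptions made for illustration. Depending on the pandas version, string columns may be reported as object or as a dedicated string dtype.

```python
import pandas as pd

# Shortened version of the sample data used in this article
game = [('riya', 37, 'delhi'),
        ('anjali', 28, 'agra'),
        ('tia', 42, 'jaipur')]
df = pd.DataFrame(game, columns=['Name', 'Age', 'Place'], index=['a', 'b', 'c'])

# Map each column name to the string name of its data type
dtype_names = {column: str(dtype) for column, dtype in df.dtypes.items()}
print(dtype_names)

# The names can then be compared as ordinary strings
print(dtype_names['Age'] == 'int64')
```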
Get the data type of a single column in a DataFrame :
Using DataFrame.dtypes we can also get the data type of a single column, by selecting that column's name from the returned Series.
# Program :
import pandas as pd
import numpy as np

# list of tuples
game = [('riya', 37, 'delhi', 'cat', 'rose'),
        ('anjali', 28, 'agra', 'dog', 'lily'),
        ('tia', 42, 'jaipur', 'elephant', 'lotus'),
        ('kapil', 51, 'patna', 'cow', 'tulip'),
        ('raj', 30, 'banglore', 'lion', 'orchid')]

# Create a dataframe object
df = pd.DataFrame(game,
                  columns=['Name', 'Age', 'Place', 'Animal', 'Flower'],
                  index=['a', 'b', 'c', 'd', 'e'])

# Get the data type of the single column 'Age'
DataTypeObj = df.dtypes['Age']
print('Data type of column Age : ')
print(DataTypeObj)
Output:
Data type of column Age : 
int64
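The same information is also available directly on the column, because every pandas Series has a dtype attribute. The short sketch below is an addition to the original examples; the shortened sample data and the variable name age_dtype are illustrative assumptions.

```python
import pandas as pd

# Shortened version of the sample data used in this article
game = [('riya', 37, 'delhi'),
        ('anjali', 28, 'agra'),
        ('tia', 42, 'jaipur')]
df = pd.DataFrame(game, columns=['Name', 'Age', 'Place'], index=['a', 'b', 'c'])

# Select the column first, then read its dtype attribute
age_dtype = df['Age'].dtype
print(age_dtype)

# It is the same dtype that DataFrame.dtypes reports for this column
print(age_dtype == df.dtypes['Age'])

# A dtype can also be compared against its string name
print(age_dtype == 'int64')
```

Both comparisons should print True for this data.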
Get a list of pandas DataFrame column names based on data type :
Suppose we want a list of column names filtered by data type. The following example collects the columns whose data type is object, which is how pandas has traditionally stored strings.
# Program :
import pandas as pd

# list of tuples
game = [('riya', 37, 'delhi', 'cat', 'rose'),
        ('anjali', 28, 'agra', 'dog', 'lily'),
        ('tia', 42, 'jaipur', 'elephant', 'lotus'),
        ('kapil', 51, 'patna', 'cow', 'tulip'),
        ('raj', 30, 'banglore', 'lion', 'orchid')]

# Create a dataframe object
df = pd.DataFrame(game,
                  columns=['Name', 'Age', 'Place', 'Animal', 'Flower'],
                  index=['a', 'b', 'c', 'd', 'e'])

# Keep only the entries whose data type is object (string).
# The built-in object is used because the np.object alias
# has been removed from recent NumPy releases.
filteredColumns = df.dtypes[df.dtypes == object]

# List of the column names whose data type is object (string)
listOfColumnNames = list(filteredColumns.index)
print(listOfColumnNames)
Output: ['Name', 'Place', 'Animal', 'Flower']
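Another way to filter columns by data type is DataFrame.select_dtypes, which returns a new DataFrame containing only the matching columns. The sketch below is an addition to the original examples and selects on the 'number' category rather than on object; the variable names numeric_columns and other_columns are illustrative assumptions.

```python
import pandas as pd

game = [('riya', 37, 'delhi', 'cat', 'rose'),
        ('anjali', 28, 'agra', 'dog', 'lily'),
        ('tia', 42, 'jaipur', 'elephant', 'lotus'),
        ('kapil', 51, 'patna', 'cow', 'tulip'),
        ('raj', 30, 'banglore', 'lion', 'orchid')]
df = pd.DataFrame(game,
                  columns=['Name', 'Age', 'Place', 'Animal', 'Flower'],
                  index=['a', 'b', 'c', 'd', 'e'])

# Names of the columns that hold numeric data
numeric_columns = df.select_dtypes(include='number').columns.tolist()

# Names of all remaining (non-numeric) columns
other_columns = df.select_dtypes(exclude='number').columns.tolist()

print(numeric_columns)
print(other_columns)
```

For this data the first list should contain only 'Age' and the second list the four remaining column names.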
Get data types of a DataFrame using DataFrame.info() :
The DataFrame.info() method prints a concise summary of a DataFrame. The summary includes the index dtype, the dtype of each column, the number of non-null values in each column and the memory usage.
# Program :
import pandas as pd
import numpy as np

# list of tuples
game = [('riya', 37, 'delhi', 'cat', 'rose'),
        ('anjali', 28, 'agra', 'dog', 'lily'),
        ('tia', 42, 'jaipur', 'elephant', 'lotus'),
        ('kapil', 51, 'patna', 'cow', 'tulip'),
        ('raj', 30, 'banglore', 'lion', 'orchid')]

# Create a dataframe object
df = pd.DataFrame(game,
                  columns=['Name', 'Age', 'Place', 'Animal', 'Flower'],
                  index=['a', 'b', 'c', 'd', 'e'])

# Print a summary of the dataframe, including column data types
df.info()
Output:
<class 'pandas.core.frame.DataFrame'>
Index: 5 entries, a to e
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Name    5 non-null      object
 1   Age     5 non-null      int64
 2   Place   5 non-null      object
 3   Animal  5 non-null      object
 4   Flower  5 non-null      object
dtypes: int64(1), object(4)
memory usage: 240.0+ bytes
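Note that DataFrame.info() prints its summary and returns None, so the text cannot be assigned to a variable directly. If the summary is needed as a string, for example to write it to a log, one option is to pass a text buffer through the buf parameter. This is a minimal sketch with shortened sample data; the names buffer and summary are illustrative assumptions.

```python
import io

import pandas as pd

# Shortened version of the sample data used in this article
game = [('riya', 37, 'delhi'),
        ('anjali', 28, 'agra'),
        ('tia', 42, 'jaipur')]
df = pd.DataFrame(game, columns=['Name', 'Age', 'Place'], index=['a', 'b', 'c'])

# Write the summary into an in-memory text buffer instead of the console
buffer = io.StringIO()
df.info(buf=buffer)

# Retrieve the captured summary as an ordinary string
summary = buffer.getvalue()
print(summary)
```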