Methods of creating a dataframe from a list of dictionaries
In this article, we discuss the different ways to create a pandas dataframe from a list of dictionaries. Before getting to the methods, let us make a few observations that make the concept easier to understand. Suppose we have a list of dictionaries:
list_of_dict = [
    {'Name': 'Mayank', 'Age': 25, 'Marks': 91},
    {'Name': 'Raj', 'Age': 21, 'Marks': 97},
    {'Name': 'Rahul', 'Age': 23, 'Marks': 79},
    {'Name': 'Manish', 'Age': 23},
]
A dictionary consists of key-value pairs. If we treat each key as a column name and each value as the entry in that column, one dictionary maps naturally to one row of a dataframe. Since we have a list of dictionaries, we get a dataframe with multiple rows, one per dictionary.
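The mapping described above can be sketched in plain Python, without pandas. This is only an illustration of the idea, not how pandas is implemented internally; the names columns and rows are chosen here for the example.

```python
list_of_dict = [
    {'Name': 'Mayank', 'Age': 25, 'Marks': 91},
    {'Name': 'Raj', 'Age': 21, 'Marks': 97},
    {'Name': 'Rahul', 'Age': 23, 'Marks': 79},
    {'Name': 'Manish', 'Age': 23},
]

# Column names: the union of all keys, in the order they are first seen.
columns = []
for record in list_of_dict:
    for key in record:
        if key not in columns:
            columns.append(key)

# One row per dictionary; a missing key leaves a gap (None in this sketch).
rows = [[record.get(col) for col in columns] for record in list_of_dict]

print(columns)
for row in rows:
    print(row)
```

The last row has a gap under Marks because that dictionary has no 'Marks' key.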
pandas.DataFrame
pandas.DataFrame is the class constructor we use to create a dataframe in Python.
Syntax: pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=None)
Let us look at the different ways to create a dataframe from a list of dictionaries.
-
Method 1 - Create a dataframe from a list of dictionaries with the default index
The pandas.DataFrame() constructor has a parameter named data. We simply pass our list of dictionaries as this parameter and it returns the dataframe. Let us see this with the help of an example.
import pandas as pd

list_of_dict = [
    {'Name': 'Mayank', 'Age': 25, 'Marks': 91},
    {'Name': 'Raj', 'Age': 21, 'Marks': 97},
    {'Name': 'Rahul', 'Age': 23, 'Marks': 79},
    {'Name': 'Manish', 'Age': 23, 'Marks': 86},
]

# Create the dataframe; each dictionary becomes one row
df = pd.DataFrame(list_of_dict)
print(df)
Output
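     Name  Age  Marks
0  Mayank   25     91
1     Raj   21     97
2   Rahul   23     79
3  Manish   23     86
Here we see that the dataframe is created with the default index 0, 1, 2, 3. In recent pandas versions the columns appear in the order in which the keys first occur in the dictionaries; older versions sorted them alphabetically (Age, Marks, Name).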
Now a question may arise: what happens if one dictionary has fewer key-value pairs than the others? Let us understand this with the help of an example.
import pandas as pd

list_of_dict = [
    {'Name': 'Mayank', 'Age': 25, 'Marks': 91},
    {'Name': 'Raj', 'Age': 21, 'Marks': 97},
    {'Name': 'Rahul', 'Marks': 79},   # no 'Age' key
    {'Name': 'Manish', 'Age': 23},    # no 'Marks' key
]

# Create the dataframe; missing keys are filled with NaN
df = pd.DataFrame(list_of_dict)
print(df)
Output
     Name   Age  Marks
0  Mayank  25.0   91.0
1     Raj  21.0   97.0
2   Rahul   NaN   79.0
3  Manish  23.0    NaN
Here we see that wherever a key-value pair is missing, the output contains NaN in that position.
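The NaN entries also explain why 25 is printed as 25.0 in the output above: NaN is a floating-point value, so a numeric column that contains it is stored as float. A small sketch, assuming the same list_of_dict as in the example; the variable name missing_per_column is only illustrative.

```python
import pandas as pd

list_of_dict = [
    {'Name': 'Mayank', 'Age': 25, 'Marks': 91},
    {'Name': 'Raj', 'Age': 21, 'Marks': 97},
    {'Name': 'Rahul', 'Marks': 79},
    {'Name': 'Manish', 'Age': 23},
]

df = pd.DataFrame(list_of_dict)

# Count the missing entries in each column.
missing_per_column = df.isna().sum()
print(missing_per_column)

# Numeric columns that contain NaN are stored as floats.
print(df.dtypes)
```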
-
Method 2 - Create a dataframe from a list of dictionaries with a custom index
Unlike the previous method, where we got the default index, we can also supply custom index labels by passing a list of labels to the index parameter of pandas.DataFrame(). The list must contain one label per dictionary. Let us see this with the help of an example.
import pandas as pd

list_of_dict = [
    {'Name': 'Mayank', 'Age': 25, 'Marks': 91},
    {'Name': 'Raj', 'Age': 21, 'Marks': 97},
    {'Name': 'Rahul', 'Marks': 79},
    {'Name': 'Manish', 'Age': 23},
]

# Create the dataframe with custom row labels
df = pd.DataFrame(list_of_dict, index=['a', 'b', 'c', 'd'])
print(df)
Output
     Name   Age  Marks
a  Mayank  25.0   91.0
b     Raj  21.0   97.0
c   Rahul   NaN   79.0
d  Manish  23.0    NaN
Here we see that instead of the default index 0, 1, 2, 3 we now have the index labels a, b, c, d.
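Once the dataframe has custom labels, rows can be looked up by those labels with df.loc. A brief sketch, assuming the same data and labels as in the example above; the variable name row_b is only illustrative.

```python
import pandas as pd

list_of_dict = [
    {'Name': 'Mayank', 'Age': 25, 'Marks': 91},
    {'Name': 'Raj', 'Age': 21, 'Marks': 97},
    {'Name': 'Rahul', 'Marks': 79},
    {'Name': 'Manish', 'Age': 23},
]

df = pd.DataFrame(list_of_dict, index=['a', 'b', 'c', 'd'])

# Select a whole row by its label.
row_b = df.loc['b']
print(row_b)

# Select a single cell by row label and column name.
print(df.loc['b', 'Name'])
```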
-
Method 3 - Create a dataframe from a list of dictionaries with a changed column order
With pandas.DataFrame() we can also control the order of the columns by passing a list of column names to the columns parameter, in the order in which we want them to appear in the dataframe. Let us see this with the help of an example.
import pandas as pd

list_of_dict = [
    {'Name': 'Mayank', 'Age': 25, 'Marks': 91},
    {'Name': 'Raj', 'Age': 21, 'Marks': 97},
    {'Name': 'Rahul', 'Age': 23, 'Marks': 79},
    {'Name': 'Manish', 'Age': 23, 'Marks': 86},
]

# Create the dataframe with the columns in the requested order
df = pd.DataFrame(list_of_dict, columns=['Name', 'Marks', 'Age'])
print(df)
Output
     Name  Marks  Age
0  Mayank     91   25
1     Raj     97   21
2   Rahul     79   23
3  Manish     86   23
Here, too, a question may arise: what happens if we pass fewer columns in the columns parameter than the dictionaries contain, or more? Let us look at both cases.
Case 1: Fewer columns in the columns parameter
In this case, any column that we do not pass is dropped from the dataframe. Let us see this with the help of an example.
import pandas as pd

list_of_dict = [
    {'Name': 'Mayank', 'Age': 25, 'Marks': 91},
    {'Name': 'Raj', 'Age': 21, 'Marks': 97},
    {'Name': 'Rahul', 'Age': 23, 'Marks': 79},
    {'Name': 'Manish', 'Age': 23, 'Marks': 86},
]

# Create the dataframe with only the listed columns; 'Age' is left out
df = pd.DataFrame(list_of_dict, columns=['Name', 'Marks'])
print(df)
Output
     Name  Marks
0  Mayank     91
1     Raj     97
2   Rahul     79
3  Manish     86
Here we see that because we did not pass the Age column, it does not appear in our dataframe.
Case 2: More columns in the columns parameter
In this case, a new column is added to the dataframe for each extra name, but all of its values are NaN. Let us see this with the help of an example.
import pandas as pd

list_of_dict = [
    {'Name': 'Mayank', 'Age': 25, 'Marks': 91},
    {'Name': 'Raj', 'Age': 21, 'Marks': 97},
    {'Name': 'Rahul', 'Age': 23, 'Marks': 79},
    {'Name': 'Manish', 'Age': 23, 'Marks': 86},
]

# Create the dataframe; 'city' is not a key in any dictionary
df = pd.DataFrame(list_of_dict, columns=['Name', 'Marks', 'Age', 'city'])
print(df)
Output
     Name  Marks  Age  city
0  Mayank     91   25   NaN
1     Raj     97   21   NaN
2   Rahul     79   23   NaN
3  Manish     86   23   NaN
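The index and columns parameters can also be passed together in a single call. A minimal sketch, assuming the same data as in the examples above; the labels and the choice of columns are only illustrative.

```python
import pandas as pd

list_of_dict = [
    {'Name': 'Mayank', 'Age': 25, 'Marks': 91},
    {'Name': 'Raj', 'Age': 21, 'Marks': 97},
    {'Name': 'Rahul', 'Age': 23, 'Marks': 79},
    {'Name': 'Manish', 'Age': 23, 'Marks': 86},
]

# Custom row labels and a chosen subset/order of columns at the same time.
df = pd.DataFrame(
    list_of_dict,
    index=['a', 'b', 'c', 'd'],
    columns=['Name', 'Marks'],
)
print(df)
```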
These are the ways to create a dataframe from a list of dictionaries in pandas.