Pandas Series drop_duplicates() Function:
The drop_duplicates() function of the Pandas Series returns a Series with duplicate values removed.
Syntax:
Series.drop_duplicates(keep='first', inplace=False)
Parameters
keep: This is optional. It specifies which duplicates (if present) to keep. Possible values include:
- 'first': This is the default. Remove all duplicates except for the first occurrence.
- 'last': Remove all duplicates except for the last occurrence.
- False: Remove every occurrence of any value that appears more than once, so none of the duplicated values is kept.
inplace: This is optional and defaults to False. If set to True, the operation is performed in place and None is returned.
Return Value:
The drop_duplicates() function of the Pandas Series returns a Series with duplicates removed, or None if inplace=True.
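The difference between the two return values can be seen in a short sketch. The sample values and the variable names (s, deduped, result) are assumptions chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
s = pd.Series([1, 3, 1, 5, 4, 3, 2])

# Default call (inplace=False): a new Series is returned and s is unchanged.
deduped = s.drop_duplicates()
print(len(s), len(deduped))   # 7 5

# In-place call: s itself is modified and the call returns None.
result = s.drop_duplicates(inplace=True)
print(result)                 # None
print(s.tolist())             # [1, 3, 5, 4, 2]
```

Because the in-place call returns None, assigning its result back to the same variable (s = s.drop_duplicates(inplace=True)) would discard the Series.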
Pandas Series drop_duplicates() Function in Python
Example 1
Here, the drop_duplicates() function removes the duplicate values from the given Series.
Approach:
- Import pandas module using the import keyword.
- Pass some random list as an argument to the Series() function of the pandas module to create a series.
- Store it in a variable.
- Print the above-given series.
- Apply drop_duplicates() function on the given series to remove all the duplicate (repeated) elements present in the given series and print the result.
- The Exit of the Program.
Below is the implementation:
# Import pandas module using the import keyword.
import pandas as pd

# Pass some random list as an argument to the Series() function
# of the pandas module to create a series.
# Store it in a variable.
gvn_series = pd.Series([1, 3, 1, 5, 4, 3, 2])

# Print the above given series
print("The given series is:")
print(gvn_series)
print()

# Apply drop_duplicates() function on the given series to remove
# all the duplicate (repeated) elements present in the given series
# and print the result.
print("The given series after removing all the duplicates:")
print(gvn_series.drop_duplicates())
Output:
The given series is:
0    1
1    3
2    1
3    5
4    4
5    3
6    2
dtype: int64

The given series after removing all the duplicates:
0    1
1    3
3    5
4    4
6    2
dtype: int64
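Note that the result keeps the original index labels of the surviving rows (0, 1, 3, 4, 6) rather than renumbering them. If consecutive labels are wanted, one option is to chain reset_index(drop=True). This is a minimal sketch; the names unique_vals and renumbered are assumptions made for illustration.

```python
import pandas as pd

gvn_series = pd.Series([1, 3, 1, 5, 4, 3, 2])

# The kept rows retain their original index labels.
unique_vals = gvn_series.drop_duplicates()
print(unique_vals.index.tolist())   # [0, 1, 3, 4, 6]

# reset_index(drop=True) discards the old labels and renumbers from 0.
renumbered = unique_vals.reset_index(drop=True)
print(renumbered.index.tolist())    # [0, 1, 2, 3, 4]
print(renumbered.tolist())          # [1, 3, 5, 4, 2]
```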
Example 2
Here, we can indicate which duplicate value to keep by using the keep argument.
Approach:
- Import pandas module using the import keyword.
- Pass some random list as an argument to the Series() function of the pandas module to create a series.
- Store it in a variable.
- Print the above-given series.
- Apply drop_duplicates() function on the given series by passing keep='first' as an argument to it to remove all the duplicate elements present in the given series except the first occurrence and print the result.
- Apply drop_duplicates() function on the given series by passing keep='last' as an argument to it to remove all the duplicate elements present in the given series except the last occurrence and print the result.
- The Exit of the Program.
Below is the implementation:
# Import pandas module using the import keyword.
import pandas as pd

# Pass some random list as an argument to the Series() function
# of the pandas module to create a series.
# Store it in a variable.
gvn_series = pd.Series([1, 3, 1, 5, 4, 3, 2])

# Print the above given series
print("The given series is:")
print(gvn_series)
print()

# Apply drop_duplicates() function on the given series by passing keep='first'
# as an argument to it to remove all the duplicate elements present in the given series
# except the first occurrence and print the result.
print("The given series after removing all the duplicates except the first occurrence:")
print(gvn_series.drop_duplicates(keep='first'))
print()

# Apply drop_duplicates() function on the given series by passing keep='last'
# as an argument to it to remove all the duplicate elements present in the given series
# except the last occurrence and print the result.
print("The given series after removing all the duplicates except the last occurrence:")
print(gvn_series.drop_duplicates(keep='last'))
Output:
The given series is:
0    1
1    3
2    1
3    5
4    4
5    3
6    2
dtype: int64

The given series after removing all the duplicates except the first occurrence:
0    1
1    3
3    5
4    4
6    2
dtype: int64

The given series after removing all the duplicates except the last occurrence:
2    1
3    5
4    4
5    3
6    2
dtype: int64
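The third keep option, keep=False, is not exercised above. The following sketch applies it to the same series; the variable name no_repeats is an assumption made for illustration.

```python
import pandas as pd

gvn_series = pd.Series([1, 3, 1, 5, 4, 3, 2])

# keep=False drops every occurrence of a repeated value, so both 1s and
# both 3s are removed and only the values that appear once remain.
no_repeats = gvn_series.drop_duplicates(keep=False)
print(no_repeats.tolist())         # [5, 4, 2]
print(no_repeats.index.tolist())   # [3, 4, 6]
```

Unlike keep='first' and keep='last', this option leaves no representative of a duplicated value in the result.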