Python Pandas Series drop_duplicates() Function

Pandas Series drop_duplicates() Function:

The drop_duplicates() function of a Pandas Series returns a new Series with duplicate values removed. The values that remain keep their original index labels.

Syntax:

Series.drop_duplicates(keep='first', inplace=False)

Parameters

keep: This is optional. It specifies which occurrence of a duplicated value (if any) to keep. Possible values include:

  • 'first' – This is the default. Drop all duplicates except the first occurrence.
  • 'last' – Drop all duplicates except the last occurrence.
  • False – Drop every occurrence of any value that appears more than once.
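The keep=False option is the easiest one to misread, so here is a minimal sketch of what it does. The sample list and the variable names (s, unique_only) are chosen only for illustration.

```python
import pandas as pd

# Illustrative data: 1 and 3 each appear twice.
s = pd.Series([1, 3, 1, 5, 4, 3, 2])

# keep=False drops every occurrence of a repeated value,
# so both 1s and both 3s are removed, not just the extras.
unique_only = s.drop_duplicates(keep=False)
print(unique_only)
```

Only the values that occur exactly once (5, 4 and 2) remain, and they keep their original index labels 3, 4 and 6.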

inplace: This is optional and defaults to False. If set to True, the operation modifies the Series in place and the function returns None.
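As a rough illustration of the inplace behaviour, the sketch below captures the return value of the call. The variable names (s, result) and the sample list are assumptions made for the example.

```python
import pandas as pd

# Illustrative data with repeated values.
s = pd.Series([1, 3, 1, 5, 4, 3, 2])

# With inplace=True the Series itself is modified
# and the call returns None.
result = s.drop_duplicates(inplace=True)

print(result)
print(s.tolist())
```

Because the call returns None, assigning its result back to the same variable (s = s.drop_duplicates(inplace=True)) would lose the Series. Use either the returned copy or inplace=True, not both.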

Return Value:

The drop_duplicates() function of the Pandas Series returns a Series with duplicates removed, or None if inplace=True.

Pandas Series drop_duplicates() Function in Python

Example 1

Here, the drop_duplicates() function removes the duplicate values from the given Series.

Approach:

  • Import the pandas module using the import keyword.
  • Pass a list as an argument to the Series() function of the pandas module to create a series.
  • Store it in a variable.
  • Print the given series.
  • Apply the drop_duplicates() function to the given series to remove all the duplicate (repeated) elements and print the result.
  • Exit the program.

Below is the implementation:

# Import pandas module using the import keyword.
import pandas as pd
# Pass some random list as an argument to the Series() function
# of the pandas module to create a series.
# Store it in a variable.
gvn_series = pd.Series([1, 3, 1, 5, 4, 3, 2])
# Print the above given series
print("The given series is:")
print(gvn_series)
print()
# Apply the drop_duplicates() function to the given series to remove
# all the duplicate (repeated) elements and print the result.
print("The given series after removing all the duplicates:")
print(gvn_series.drop_duplicates())

Output:

The given series is:
0    1
1    3
2    1
3    5
4    4
5    3
6    2
dtype: int64

The given series after removing all the duplicates:
0    1
1    3
3    5
4    4
6    2
dtype: int64
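Note that the result above keeps the original index labels (0, 1, 3, 4, 6), so the index has gaps. If consecutive labels are wanted, one possible approach is to chain reset_index(drop=True), as in the sketch below. The variable names deduped and renumbered are illustrative.

```python
import pandas as pd

gvn_series = pd.Series([1, 3, 1, 5, 4, 3, 2])

# The deduplicated series keeps the labels of the surviving rows.
deduped = gvn_series.drop_duplicates()
print(deduped.index.tolist())

# reset_index(drop=True) discards the old labels and renumbers from 0.
renumbered = deduped.reset_index(drop=True)
print(renumbered)
```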

Example 2

Here, the keep argument specifies which occurrence of each duplicated value to keep.

Approach:

  • Import the pandas module using the import keyword.
  • Pass a list as an argument to the Series() function of the pandas module to create a series.
  • Store it in a variable.
  • Print the given series.
  • Apply the drop_duplicates() function to the given series with keep='first' to remove all the duplicate elements except the first occurrence and print the result.
  • Apply the drop_duplicates() function to the given series with keep='last' to remove all the duplicate elements except the last occurrence and print the result.
  • Exit the program.

Below is the implementation:

# Import pandas module using the import keyword.
import pandas as pd
# Pass some random list as an argument to the Series() function
# of the pandas module to create a series.
# Store it in a variable.
gvn_series = pd.Series([1, 3, 1, 5, 4, 3, 2])
# Print the above given series
print("The given series is:")
print(gvn_series)
print()
# Apply the drop_duplicates() function with keep='first' to remove
# all the duplicate elements except the first occurrence
# and print the result.
print("The given series after removing all the duplicates except the first occurrence:")
print(gvn_series.drop_duplicates(keep='first'))
print()
# Apply the drop_duplicates() function with keep='last' to remove
# all the duplicate elements except the last occurrence
# and print the result.
print("The given series after removing all the duplicates except the last occurrence:")
print(gvn_series.drop_duplicates(keep='last'))

Output:

The given series is:
0    1
1    3
2    1
3    5
4    4
5    3
6    2
dtype: int64

The given series after removing all the duplicates except the first occurrence:
0    1
1    3
3    5
4    4
6    2
dtype: int64

The given series after removing all the duplicates except the last occurrence:
2    1
3    5
4    4
5    3
6    2
dtype: int64