Find the Frequency of Each Character in a String, Their Indices, and Duplicate Characters

String:

A string is a sequence of characters such as letters, digits, and symbols. It is a basic data type that serves as the foundation for text manipulation. Python's built-in string class is called str. Strings in Python are “immutable,” which means they can’t be modified once they’re created. Because of this, any operation that appears to change a string actually produces a new string that represents the computed value.
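To make the immutability point concrete, here is a minimal sketch. The variable names text, mutated, and new_text are illustrative only.

```python
text = "hello"

# Item assignment is not supported on str, so this raises TypeError.
try:
    text[0] = "J"
    mutated = True
except TypeError:
    mutated = False

# Building a modified value creates a new string object instead.
new_text = "J" + text[1:]

print(mutated)    # False
print(text)       # hello (the original is unchanged)
print(new_text)   # Jello
```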

Given a string, the task is to find the frequency of each character in the string.

Examples:

Input:

string = "hello this is btech geeks online learning platform for underguate students"

Output:

Frequency of each character of the string is :
The frequency of character h is = 3
The frequency of character e is = 9
The frequency of character l is = 5
The frequency of character o is = 4
The frequency of character   is = 10
The frequency of character t is = 6
The frequency of character i is = 4
The frequency of character s is = 5
The frequency of character b is = 1
The frequency of character c is = 1
The frequency of character g is = 3
The frequency of character k is = 1
The frequency of character n is = 6
The frequency of character a is = 3
The frequency of character r is = 4
The frequency of character p is = 1
The frequency of character f is = 2
The frequency of character m is = 1
The frequency of character u is = 3
The frequency of character d is = 2

Finding the Count of Each Character in a String, Their Indices, and Duplicate Characters

There are several ways to find the frequency of each character in a string. Some of them are:

1) Using Counter() to print the frequency of each character in a given string

Counter is a subclass of dict. Counter() takes an iterable as an argument and stores its elements as keys and their frequencies as values. As a result, if we pass a string to collections.Counter(), it returns a Counter object with all characters in the string as keys and their frequency in the string as values.

Below is the implementation:

# importing the Counter class from collections
from collections import Counter
# Given string
string = "hello this is btech geeks online learning platform for underguate students"
# Using Counter() to calculate the frequency of each character of the string
freq = Counter(string)
print("Frequency of each character of the string is :")
# Traverse the freq dictionary and print each character's count
for key in freq:
    print("The frequency of character", key, "is =", freq[key])

Output:

Frequency of each character of the string is :
The frequency of character h is = 3
The frequency of character e is = 9
The frequency of character l is = 5
The frequency of character o is = 4
The frequency of character   is = 10
The frequency of character t is = 6
The frequency of character i is = 4
The frequency of character s is = 5
The frequency of character b is = 1
The frequency of character c is = 1
The frequency of character g is = 3
The frequency of character k is = 1
The frequency of character n is = 6
The frequency of character a is = 3
The frequency of character r is = 4
The frequency of character p is = 1
The frequency of character f is = 2
The frequency of character m is = 1
The frequency of character u is = 3
The frequency of character d is = 2
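
Beyond printing every count, a Counter can answer a few common questions directly. The sketch below is a minimal illustration that assumes the same input string. The names is_dict, top_three, and missing are illustrative and not part of the program above.

```python
from collections import Counter

string = "hello this is btech geeks online learning platform for underguate students"
freq = Counter(string)

# Counter is a dict subclass, so the usual dict operations work on it.
is_dict = isinstance(freq, dict)

# most_common(n) returns the n highest counts as (character, count) pairs.
top_three = freq.most_common(3)

# A character that never occurs reports a count of 0 instead of raising KeyError.
missing = freq["z"]

print(is_dict)
print(top_three)
print(missing)
```

Here the most frequent "character" is the space, which is worth keeping in mind when interpreting the results.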

2) Using regex to find the frequency and indices of all characters in a string

We will construct a regex pattern that matches every alphanumeric character in the string. Spaces and punctuation do not match this pattern, so they are not counted.

Make a regex pattern that matches alphanumeric characters:

regex_Pattern = re.compile('[a-zA-Z0-9]')

Iterate over all matches of this pattern in the string using regex_Pattern.finditer(), and build two dictionaries: one holding each character’s frequency count and one holding its index positions in the string.

Below is the implementation:

import re
# Given string
string = "hello this is btech geeks online learning platform for underguate students"
# regex pattern
regex_Pattern = re.compile('[a-zA-Z0-9]')
# Iterate through the string's alphanumeric characters which matches the regex pattern
# While iterating, keep the frequency count of each character in a dictionary updated.
matchiterator = regex_Pattern.finditer(string)
charfrequency = {}
indices = {}
for matchchar in matchiterator:
    charfrequency[matchchar.group()] = charfrequency.get(
        matchchar.group(), 0) + 1
    indices[matchchar.group()] = indices.get(
        matchchar.group(), []) + [matchchar.start()]
print("Frequency and indices of each character in the string is :")
# Traverse the charfrequency dictionary and print each character's count and indices
for key in charfrequency:
    print("The frequency of character", key, "is =",
          charfrequency[key], " ; Indices of occurrence = ", indices[key])

Output:

Frequency and indices of each character in the string is :
The frequency of character h is = 3  ; Indices of occurrence =  [0, 7, 18]
The frequency of character e is = 9  ; Indices of occurrence =  [1, 16, 21, 22, 31, 34, 58, 64, 70]
The frequency of character l is = 5  ; Indices of occurrence =  [2, 3, 28, 33, 43]
The frequency of character o is = 4  ; Indices of occurrence =  [4, 26, 47, 52]
The frequency of character t is = 6  ; Indices of occurrence =  [6, 15, 45, 63, 67, 72]
The frequency of character i is = 4  ; Indices of occurrence =  [8, 11, 29, 38]
The frequency of character s is = 5  ; Indices of occurrence =  [9, 12, 24, 66, 73]
The frequency of character b is = 1  ; Indices of occurrence =  [14]
The frequency of character c is = 1  ; Indices of occurrence =  [17]
The frequency of character g is = 3  ; Indices of occurrence =  [20, 40, 60]
The frequency of character k is = 1  ; Indices of occurrence =  [23]
The frequency of character n is = 6  ; Indices of occurrence =  [27, 30, 37, 39, 56, 71]
The frequency of character a is = 3  ; Indices of occurrence =  [35, 44, 62]
The frequency of character r is = 4  ; Indices of occurrence =  [36, 48, 53, 59]
The frequency of character p is = 1  ; Indices of occurrence =  [42]
The frequency of character f is = 2  ; Indices of occurrence =  [46, 51]
The frequency of character m is = 1  ; Indices of occurrence =  [49]
The frequency of character u is = 3  ; Indices of occurrence =  [55, 61, 68]
The frequency of character d is = 2  ; Indices of occurrence =  [57, 69]
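
The same regex idea can be packaged as a reusable function. This is a sketch under stated assumptions: char_indices and its pattern parameter are hypothetical names introduced here, and defaultdict(list) with append replaces the dict.get(...) + [...] idiom used above, which builds a new list on every match.

```python
import re
from collections import defaultdict


def char_indices(text, pattern=r"[a-zA-Z0-9]"):
    """Map each matching character to the list of positions where it occurs."""
    positions = defaultdict(list)
    for match in re.finditer(pattern, text):
        # match.group() is the matched character, match.start() its index.
        positions[match.group()].append(match.start())
    return dict(positions)


indices = char_indices("hello this")
for char, where in indices.items():
    # The frequency is simply the number of recorded positions.
    print(char, len(where), where)
```

Passing a different pattern, for example r"." to match any character except a newline, would include spaces as well.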

3) Using Counter to find duplicate characters in the given string

Now, use collections.Counter() to find all of the duplicate characters in this string. Counter() determines the frequency of each character in the string, and characters with a frequency greater than 1 are considered duplicates. The space character is counted like any other character.

Below is the implementation:

# importing the Counter class from collections
from collections import Counter
# Given string
string = "hello this is btech geeks online learning platform for underguate students"
# Using Counter() to calculate the frequency of each character of the string
freq = Counter(string)
print("Printing duplicate characters in the given string :")
# Traverse the freq dictionary and print the duplicate characters
for key in freq:
    # a character whose frequency is greater than 1 is a duplicate, so print it
    if freq[key] > 1:
        print(key)

Output:

Printing duplicate characters in the given string :
h
e
l
o
 
t
i
s
g
n
a
r
f
u
d
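
If the duplicates are needed as a value rather than printed output, a list comprehension over the Counter is one compact option. The sketch below assumes the same input string. The names duplicates and visible_duplicates are illustrative, and filtering out whitespace is an optional extra step, not part of the program above.

```python
from collections import Counter

string = "hello this is btech geeks online learning platform for underguate students"
freq = Counter(string)

# Keep characters seen more than once, in order of first appearance.
duplicates = [char for char, count in freq.items() if count > 1]

# Optional: ignore whitespace if only visible characters matter.
visible_duplicates = [char for char in duplicates if not char.isspace()]

print(duplicates)
print(visible_duplicates)
```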
