Python Program to Find the Most Repeated Word in a Text File

Given a text file, the task is to find the word that occurs most often in it using Python.

Program to Find the Most Repeated Word in a Text File

Approach:

  • Import Counter from the collections module using the import keyword.
  • Store the path of the file in a variable. Replace this value with the path of a file on your own system.
  • Take an empty list to store all the words.
  • Open the file in read-only mode, since we only need to read its contents.
  • Iterate through the lines of the file using a for loop.
  • Split each line into words using the split() function and store the result in a variable (it is of type list).
  • Loop over that list using a nested for loop.
  • Add each word to the list of all words using the append() function.
  • Get the frequency of all the words using the Counter() function and store it in a variable (a Counter, which is a dictionary subclass mapping each word to its count).
  • Take a variable to store the maximum frequency and initialize its value to 0.
  • Traverse the frequency dictionary using a for loop.
  • Check whether the frequency of the current word is greater than maximumfreq using an if statement.
  • If it is, set maximumfreq to that frequency and store the word in a second variable.
  • Print the word with the maximum frequency.

Below is the implementation:

# Import Counter from the collections module using the import keyword
from collections import Counter
# Path of the input file.
# Replace this value with the path of a file on your own system.
givenFilename = "samplefile.txt"
# Take an empty list to store all the words
l = []
# Open the file in read-only mode, since we only need to read its contents
with open(givenFilename, 'r') as givenfilecontent:
    # Iterate through the lines of the file using a for loop
    for gvnfileline in givenfilecontent:
        # Split the line into words using split(); the result is a list
        gvnfilewords = gvnfileline.split()
        # Loop over that list using a nested for loop
        for words in gvnfilewords:
            # Add the word to the list using the append() function
            l.append(words)
# Get the frequency of all the words using Counter() (a dictionary subclass)
freqword = Counter(l)
# Variables to store the maximum frequency and the word that has it
maximumfreq = 0
maxele = None
# Traverse the frequency dictionary using a for loop
for i in freqword:
    # Check if the frequency of this word is greater than maximumfreq
    if freqword[i] > maximumfreq:
        # If so, update the maximum frequency and remember the word
        maximumfreq = freqword[i]
        maxele = i
# Print the word with the maximum frequency
print('The Maximum Frequency element in the given Text File {', maxele, '}')

Output:

The Maximum Frequency element in the given Text File { btechgeeks }

Explanation:

  • The file path is stored in the variable givenFilename. Change the value of this variable to the path of your own file.
  • On many systems, dragging and dropping a file onto the terminal will show its path. If the path does not point to an existing file, open() raises a FileNotFoundError.
  • The file is opened in read mode using the open() function. The path to the file is the function’s first parameter, and the mode in which to open the file is its second parameter.
  • When we open the file, we use the character ‘r’ to signify read mode.
  • The split() method splits each line of the file into words on whitespace.
  • We add all the words to the list.
  • We apply the Counter() function to get the frequency of each word in the text file.
  • We then loop over the frequencies and keep the word with the highest count.
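
As a shorter alternative to the explicit loop over the frequency dictionary, Counter.most_common() returns the entries ordered from the highest count to the lowest. The sketch below is only an illustration: the function name most_repeated_word is hypothetical and not part of the program above, and it assumes the file is UTF-8 encoded text.

```python
from collections import Counter


def most_repeated_word(path):
    """Return (word, count) for the most frequent word, or None if the file has no words.

    Hypothetical helper for illustration; assumes the file is UTF-8 text.
    """
    with open(path, "r", encoding="utf-8") as f:
        # Counter accepts any iterable of hashable items, so feed it every
        # whitespace-separated word from every line.
        counts = Counter(word for line in f for word in line.split())
    # most_common(1) returns a list holding at most one (word, count) pair.
    top = counts.most_common(1)
    return top[0] if top else None
```

For the sample file used in this article, most_repeated_word("samplefile.txt") should return ('btechgeeks', 4). When several words share the highest count, most_common() lists them in the order they were first encountered, which matches the behaviour of the loop that uses the > comparison.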

samplefile.txt:

hello this is btechgeeks
hello good morning 
this is btechgeeks
btechgeeks btechgeeks
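
The sample file contains only lowercase words without punctuation, so a plain split() is enough. In ordinary prose, "Hello", "hello," and "hello" would be counted as three different words. The following is a minimal sketch of one possible normalization step, assuming that lowercasing and stripping surrounding punctuation is the desired behaviour; the helper name normalized_words and the example text are hypothetical.

```python
import string
from collections import Counter


def normalized_words(text):
    """Yield lowercase words with leading and trailing punctuation removed.

    Hypothetical helper; the normalization rules are an assumption.
    """
    for raw in text.split():
        word = raw.strip(string.punctuation).lower()
        if word:  # skip tokens that consisted of punctuation only
            yield word


text = "Hello, hello! This is btechgeeks. HELLO"
counts = Counter(normalized_words(text))
print(counts.most_common(1))  # [('hello', 3)]
```

Whether such normalization is appropriate depends on the text: stripping punctuation this way leaves internal characters such as apostrophes and hyphens untouched, but it also treats "US" and "us" as the same word.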
