Given a text file, the task is to find the word that occurs most often in it using Python.
Program to Find the Most Repeated Word in a Text File
Approach:
- Import Counter from the collections module using the import keyword.
- Store the path of the file in a variable. In the example below, replace this value with the path to a file on your own system.
- Create an empty list to store all the words.
- Open the file in read-only mode, since we only need to read its contents.
- Iterate through the lines of the file using a for loop.
- Split each line into words using the split() function and store the result in a variable (it is of type list).
- Loop over that list using a nested for loop.
- Add each word to the list of all words using the append() function.
- Get the frequency of every word using the Counter() function and store it in a variable (Counter is a dictionary subclass).
- Create a variable to store the maximum frequency and initialize it to 0.
- Traverse the frequency dictionary using a for loop.
- Check whether the frequency of the current word is greater than maximumfreq using an if statement.
- If it is, set maximumfreq to that frequency.
- In the same branch, store the word itself in a second variable, which holds the maximum-frequency key.
- After the loop, print the word with the maximum frequency.
Below is the implementation:
# Import Counter from the collections module
from collections import Counter

# Path of the file to read. Replace this value with the path
# to a file on your own system.
givenFilename = "samplefile.txt"
# Take an empty list to store all the words
l = []
# Open the file in read-only mode, since we only read its contents
with open(givenFilename, 'r') as givenfilecontent:
    print('The words in the given file : ')
    # Iterate through the lines of the file using a for loop
    for gvnfileline in givenfilecontent:
        # Split the line into words using the split() function
        # and store them in a variable (it is of type list)
        gvnfilewords = gvnfileline.split()
        # Loop over the above list using a nested for loop
        for words in gvnfilewords:
            # Add the word to the list using the append() function
            l.append(words)

# Get the frequency of all the words using the Counter() function
# and store it in a variable (Counter is a dictionary subclass)
freqword = Counter(l)
# Take a variable to store the maximum frequency
maximumfreq = 0
# Take a variable to store the maximum frequency key
# (it stays None if the file contains no words)
maxele = None
# Traverse the frequency dictionary using a for loop
for i in freqword:
    # Check if the frequency of the word is greater than maximumfreq
    if freqword[i] > maximumfreq:
        # If it is, record the new maximum frequency and its word
        maximumfreq = freqword[i]
        maxele = i
# Print the maximum frequency element
print('The Maximum Frequency element in the given Text File {', maxele, '}')
Output:
The words in the given file : 
The Maximum Frequency element in the given Text File { btechgeeks }
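The manual search for the maximum can also be delegated to Counter itself. The following is a minimal sketch of that alternative, not the program above: the function name most_repeated_word is a hypothetical helper introduced here for illustration, and it relies on Counter.most_common(), which returns (word, count) pairs sorted from the most to the least frequent.

```python
from collections import Counter


def most_repeated_word(path):
    """Return (word, count) for the most frequent word, or None if the file has no words.

    Hypothetical helper for illustration; it is not part of the original program.
    """
    counts = Counter()
    with open(path, "r") as file_content:
        for line in file_content:
            # update() adds one count for every word on the line
            counts.update(line.split())
    # most_common(1) returns a list with at most one (word, count) pair
    top = counts.most_common(1)
    return top[0] if top else None
```

Calling most_repeated_word("samplefile.txt") would return the word together with its count instead of printing it. When several words share the highest count, most_common() in Python 3.7 and later lists them in the order they were first seen, which matches the behaviour of the strict greater-than comparison in the loop above.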
Explanation:
- The file path is stored in the variable givenFilename. Change the value of this variable to the path of your own file.
- Dragging and dropping a file onto a terminal window typically shows its path. The code fails with a FileNotFoundError unless this variable points to an existing file.
- The file is opened in read mode with the open() function. The path to the file is the function's first parameter, and the mode in which to open the file is its second parameter.
- When we open the file, we use the character ‘r’ to signify read-mode.
- The split() method splits each line of the file into words wherever whitespace occurs.
- We add each of these words to the list l.
- We apply the Counter() function to this list to get the frequency of every word in the text file.
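To make the split() and Counter() steps concrete, here is a small sketch on a single made-up line of text (the string is an assumption chosen for illustration, not content from the sample file). It shows that split() separates on whitespace only, so punctuation and letter case affect the counts. The normalization step is an optional extra and is not part of the program above.

```python
from collections import Counter
import string

# Made-up line used only for this illustration
line = "Hello, hello   world!\n"

# split() with no arguments splits on any run of whitespace,
# so repeated spaces and the trailing newline are discarded
raw_words = line.split()
print(raw_words)  # ['Hello,', 'hello', 'world!']

# Optional normalization (an assumption, not in the original program):
# strip surrounding punctuation and lowercase each word before counting
clean_words = [w.strip(string.punctuation).lower() for w in raw_words]
clean_words = [w for w in clean_words if w]

print(Counter(raw_words)["hello"])    # 1
print(Counter(clean_words)["hello"])  # 2
```

Whether to normalize depends on the text: for the lowercase, punctuation-free sample file it makes no difference, but for ordinary prose it usually changes which word comes out on top.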
samplefile.txt:
hello this is btechgeeks hello good morning this is btechgeeks btechgeeks btechgeeks
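As a quick check of the result, the sample text can be counted directly from a string. This is a sketch that assumes the text is exactly the words shown above; the variable names sample_text and word_counts are introduced here for illustration.

```python
from collections import Counter

# Sample text copied from above. The original line breaks are not known,
# which does not matter here because split() treats all whitespace alike.
sample_text = (
    "hello this is btechgeeks hello good morning "
    "this is btechgeeks btechgeeks btechgeeks"
)

word_counts = Counter(sample_text.split())

# Print every word with its count, most frequent first
for word, count in word_counts.most_common():
    print(word, count)
# btechgeeks 4
# hello 2
# this 2
# is 2
# good 1
# morning 1
```

The word btechgeeks occurs four times, twice as often as any other word, which agrees with the program's output.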