Python: How to delete specific lines in a file in a memory-efficient way?


In this article, we will see how to delete a set of lines from a file in various ways.

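We cannot delete lines from a file in place, so we first create a temporary file and write into it every line we want to keep. Then we delete the original file and rename the temporary file to the original name. Because the file is read one line at a time, it never has to be loaded into memory as a whole.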

We will be using the following file for demonstration:

file.txt

Line 1

Line 2

Line 3

Line 4

Line 5
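To follow along, the demonstration file can be generated with a few lines of Python. This is only a convenience sketch: the helper name create_demo_file and its parameters are made up for illustration and are not part of the original examples.

```python
def create_demo_file(path, line_count=5):
    """Write 'Line 1' .. 'Line <line_count>' to path, one entry per line.

    Hypothetical helper, used only to set up the demonstration file.
    """
    with open(path, 'w') as demo:
        for number in range(1, line_count + 1):
            demo.write(f'Line {number}\n')
```

Calling create_demo_file('file.txt') before each example restores the five-line file shown above.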

Delete a line from a file by specific line number in Python:

The algorithm will be:

  • Open the original file in reading mode
  • Accept the line number to delete (numbering starts from 0)
  • Create a new temporary file and open it in write mode
  • Read the contents of the original file line by line while keeping a count of the lines
    1. If the counter reaches the line we have to delete, skip that line; otherwise write it to the temporary file
  • If a line was removed, delete the original file and rename the temporary file to the name of the original file.
  • Otherwise, delete the temporary file.

import os

def deleteLine(originFile, lineNo):
    isSkipped = False
    index = 0
    tempFile = originFile + '.bak'
    # Open the original file in read mode and the temp file in write mode
    with open(originFile, 'r') as readObj, open(tempFile, 'w') as writeObj:
        # Copy all data line by line
        for line in readObj:
            # Skip the line when the loop counter equals the line number
            if index != lineNo:
                writeObj.write(line)
            else:
                isSkipped = True
            index += 1
    # If a line was skipped, replace the original file with the temp file
    if isSkipped:
        os.remove(originFile)
        os.rename(tempFile, originFile)
    else:
        os.remove(tempFile)

# Line numbering starts from 0, so this deletes the second line
deleteLine('file.txt', 1)

After execution,

file.txt

Line 1

Line 3

Line 4

Line 5
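The delete-then-rename step used above can also be written as a single call to os.replace, which overwrites the destination if it already exists. The sketch below is a variation on the example, not the original code: the function name delete_line_with_replace and the returned flag are assumptions added for illustration.

```python
import os

def delete_line_with_replace(origin_file, line_no):
    """Delete the zero-based line line_no from origin_file.

    Hypothetical variant of deleteLine that swaps the files with
    os.replace instead of os.remove followed by os.rename.
    Returns True if a line was removed.
    """
    temp_file = origin_file + '.bak'
    is_skipped = False
    with open(origin_file, 'r') as read_obj, open(temp_file, 'w') as write_obj:
        # enumerate supplies the zero-based line counter
        for index, line in enumerate(read_obj):
            if index == line_no:
                is_skipped = True  # drop this line
            else:
                write_obj.write(line)
    if is_skipped:
        # Rename the temp file over the original in one step
        os.replace(temp_file, origin_file)
    else:
        # Nothing was removed, so discard the identical copy
        os.remove(temp_file)
    return is_skipped
```

With os.replace there is no moment at which the original has been deleted but the temporary file has not yet been renamed.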

Delete multiple lines from a file by line numbers:

The algorithm will be:

  • Open the original file in reading mode
  • Accept the line numbers to delete as a list (numbering starts from 0)
  • Create a new temporary file and open it in write mode
  • Read the contents of the original file line by line while keeping a count of the lines
    1. If the counter matches one of the given line numbers, skip that line; otherwise write it to the temporary file
  • If any line was removed, delete the original file and rename the temporary file to the name of the original file.
  • Otherwise, delete the temporary file.

import os

def deleteLine(originFile, lineNo):
    isSkipped = False
    index = 0
    tempFile = originFile + '.bak'
    # Open the original file in read mode and the temp file in write mode
    with open(originFile, 'r') as readObj, open(tempFile, 'w') as writeObj:
        # Copy all data line by line
        for line in readObj:
            # Skip the line when the loop counter is one of the given line numbers
            if index not in lineNo:
                writeObj.write(line)
            else:
                isSkipped = True
            index += 1
    # If any line was skipped, replace the original file with the temp file
    if isSkipped:
        os.remove(originFile)
        os.rename(tempFile, originFile)
    else:
        os.remove(tempFile)

# Line numbering starts from 0
deleteLine('file.txt', [0, 2, 3])

After execution,

file.txt

Line 2

Line 5
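When many line numbers are passed, checking each index against a list gets slower as the list grows. A possible refinement, sketched below, is to convert the numbers to a set once so that each membership check takes constant time on average. The function name delete_lines and the returned count are assumptions made for this illustration.

```python
import os

def delete_lines(origin_file, line_numbers):
    """Delete the zero-based line numbers listed in line_numbers.

    Hypothetical variant of the example above; returns how many
    lines were removed.
    """
    targets = set(line_numbers)  # average constant-time membership checks
    temp_file = origin_file + '.bak'
    removed = 0
    with open(origin_file, 'r') as read_obj, open(temp_file, 'w') as write_obj:
        for index, line in enumerate(read_obj):
            if index in targets:
                removed += 1  # skip this line
            else:
                write_obj.write(line)
    if removed:
        # Same swap as in the article: delete the original, rename the copy
        os.remove(origin_file)
        os.rename(temp_file, origin_file)
    else:
        os.remove(temp_file)
    return removed
```

Duplicate or out-of-range numbers are harmless here: duplicates collapse in the set, and numbers past the end of the file simply never match.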

Delete a specific line from the file by matching content:

The algorithm will be:

  • Open the original file in reading mode
  • Create a temporary file and open it in write mode
  • Copy the contents of the original file to the temp file line by line. If a line matches the text we want to delete, skip it
  • If any line was skipped, delete the original file and rename the temp file to the name of the original. Otherwise, delete the temp file.

import os

def deleteLine(originFile, lineToDelete):
    isSkipped = False
    tempFile = originFile + '.bak'
    # Open the original file in read mode and the temp file in write mode
    with open(originFile, 'r') as readObj, open(tempFile, 'w') as writeObj:
        # Copy all data line by line
        for line in readObj:
            # Compare without the trailing newline character
            currentLine = line.rstrip('\n')
            # If currentLine matches the given line, skip it
            if currentLine != lineToDelete:
                writeObj.write(line)
            else:
                isSkipped = True
    # If any line was skipped, replace the original file with the temp file
    if isSkipped:
        os.remove(originFile)
        os.rename(tempFile, originFile)
    else:
        os.remove(tempFile)

# Delete every line whose content is exactly 'Line 4'
deleteLine('file.txt', 'Line 4')

After execution,

file.txt

Line 1

Line 2

Line 3

Line 5

Delete specific lines from a file that match a given condition:

The algorithm will be:

  • Accept the original file name and a callback function that decides whether a line should be deleted
  • Open the original file in reading mode
  • Create a new temporary file and open it in write mode
  • Read the contents of the original file line by line
    1. Pass each line to the function; if it returns true, skip that line, otherwise write it to the temporary file
  • If any line was removed, delete the original file and rename the temporary file to the name of the original file.
  • Otherwise, delete the temporary file.

import os

def deleteLine(originFile, conditionalFunc):
    isSkipped = False
    tempFile = originFile + '.bak'
    # Open the original file in read mode and the temp file in write mode
    with open(originFile, 'r') as readObj, open(tempFile, 'w') as writeObj:
        # Copy all data line by line
        for line in readObj:
            # Check each line by passing it to the function
            if not conditionalFunc(line):
                writeObj.write(line)
            else:
                isSkipped = True
    # If any line was skipped, replace the original file with the temp file
    if isSkipped:
        os.remove(originFile)
        os.rename(tempFile, originFile)
    else:
        os.remove(tempFile)

# Example condition (for illustration only): delete lines that contain '3'
def conditionalFunction(line):
    return '3' in line

deleteLine('file.txt', conditionalFunction)

We can pass any function that takes a line and returns True or False; every line for which it returns True is skipped. The condition can be anything, such as the length of the line or whether the line contains a particular word.
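As a rough illustration of such conditions, the sketch below pairs a callback-based delete function with two small condition builders, one for a particular word and one for line length. All names here (delete_matching_lines, contains_word, longer_than) are hypothetical and chosen only for this example.

```python
import os

def delete_matching_lines(origin_file, should_delete):
    """Remove every line for which should_delete(line) is true.

    Returns the number of lines removed.
    """
    temp_file = origin_file + '.bak'
    removed = 0
    with open(origin_file, 'r') as read_obj, open(temp_file, 'w') as write_obj:
        for line in read_obj:
            if should_delete(line):
                removed += 1  # skip this line
            else:
                write_obj.write(line)
    if removed:
        os.remove(origin_file)
        os.rename(temp_file, origin_file)
    else:
        os.remove(temp_file)
    return removed

def contains_word(word):
    """Build a condition matching lines that contain word as a whole word."""
    return lambda line: word in line.split()

def longer_than(max_length):
    """Build a condition matching lines longer than max_length characters."""
    # The trailing newline is not counted as part of the line
    return lambda line: len(line.rstrip('\n')) > max_length
```

For example, delete_matching_lines('file.txt', contains_word('4')) would drop the line 'Line 4' from the demonstration file, while delete_matching_lines('file.txt', longer_than(6)) would leave it unchanged because every line is exactly six characters long.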