Python: Read a CSV file line by line with or without header

In this article, we will be learning about how to read a CSV file line by line with or without a header. Along with that, we will be learning how to select a specified column while iterating over a file.

Let us take an example where we have a file named students.csv.

Id,Name,Course,City,Session
21,Mark,Python,London,Morning
22,John,Python,Tokyo,Evening
23,Sam,Python,Paris,Morning
32,Shaun,Java,Tokyo,Morning
What we want is to read all the rows of this file line by line.
Note that, we will not be reading this CSV file into lists of lists because that will be very space-consuming and time-consuming. It will also cause problems with the large data. We have to look for a solution that works as an interpreter where we can read a line one at a time so that less memory consumption will take place.
Let’s get started with it!
In python, we have two modules to read the CSV file, one is csv.reader and the second is csv.DictReader. We will use them one by one to read a CSV file line by line.

Read a CSV file line by line using csv.reader

By using the csv.readerĀ module, a reader class object is made through which we can iterate over the lines of a CSV file as a list of values, where each value in the list is a cell value.

Read a CSV file line by line using csv.reader

Code:

from csv import reader
# open file in read mode
with open('students.csv', 'r') as read_obj:
 # pass the file object to reader() to get the reader object
csv_reader = reader(read_obj)
# Iterate over each row in the csv using reader object
for row in csv_reader:
# row variable is a list that represents a row in csv
print(row)
The above code iterated over each row of the CSV file. It fetched the content of each row as a list and printed that generated list.

How did it work?

It performed a few steps:

  1. Opened the students.csv file and created a file object.
  2. In csv.reader() function, the reader object is created and passed.
  3. Now with the reader object, we iterated it by using the for loop so that it can read each row of the csv as a list of values.
  4. At last, we printed this list.

By using this module, only one line will consume memory at a time while iterating through a csv file.

Read csv file without header

What if we want to skip a header and print the files without the header. In the previous example, we printed the values including the header but in this example, we will remove the header and print the values without the header.

Read csv file without header

Code:

from csv import reader
# skip first line i.e. read header first and then iterate over each row od csv as a list
with open('students.csv', 'r') as read_obj:
csv_reader = reader(read_obj)
header = next(csv_reader)
# Check file as empty
if header != None:
# Iterate over each row after the header in the csv
for row in csv_reader:
# row variable is a list that represents a row in csv
print(row)
We can see in the image above, that the header is not printed and the code is designed in such a way that it skipped the header and printed all the other values in a list.

Read csv file line by line using csv module DictReader object

Now, we will see the example using csv.DictReader module. CSV’s module dictReader object class iterates over the lines of a CSV file as a dictionary, which means for each row it returns a dictionary containing the pair of column names and values for that row.

Read csv file line by line using csv module DictReader object

Code:

from csv import DictReader
# open file in read mode
with open('students.csv', 'r') as read_obj:
# pass the file object to DictReader() to get the DictReader object
csv_dict_reader = DictReader(read_obj)
# iterate over each line as a ordered dictionary
for row in csv_dict_reader:
# row variable is a dictionary that represents a row in csv
print(row)
The above code iterated over all the rows of the CSV file. It fetched the content of the row for each row and put it as a dictionary.

How did it work?

It performed a few steps:

  1. Opened the students.csv file and created a file object.
  2. In csv.DictReader() function, the reader object is created and passed.
  3. Now with the reader object, we iterated it by using the for loop so that it can read each row of the csv as a dictionary of values. Where each pair in this dictionary represents contains the column name & column value for that row.

It also saves the memory as only one row at a time is in the memory.

Get column names from the header in the CSV file

We have a member function in the DictReader class that returns the column names of a csv file as a list.

Get column names from the header in the CSV fileCode:

from csv import DictReader
# open file in read mode
with open(‘students.csv’, ‘r’) as read_obj:
# pass the file object to DictReader() to get the DictReader object
csv_dict_reader = DictReader(read_obj)
# get column names from a csv file
column_names = csv_dict_reader.fieldname
print(column_names)

Read specific columns from a csv file while iterating line by line

Read specific columns (by column name) in a CSV file while iterating row by row

We will iterate over all the rows of the CSV file line by line but will print only two columns of each row.
Read specific columns (by column name) in a csv file while iterating row by row
Code:
from csv import DictReader
# iterate over each line as a ordered dictionary and print only few column by column name
withopen('students.csv', 'r')as read_obj:
csv_dict_reader = DictReader(read_obj)
for row in csv_dict_reader:
print(row['Id'], row['Name'])
DictReader returns a dictionary for each line during iteration. As in this dictionary, keys are column names and values are cell values for that column. So, for selecting specific columns in every row, we used column name with the dictionary object.

Read specific columns (by column Number) in a CSV file while iterating row by row

We will iterate over all the rows of the CSV file line by line but will print the contents of the 2nd and 3rd column.

Read specific columns (by column Number) in a csv file while iterating row by row

Code:

from csv import reader
# iterate over each line as a ordered dictionary and print only few column by column Number
with open('students.csv', 'r') as read_obj:
csv_reader = reader(read_obj)
for row in csv_reader:
print(row[1], row[2])
With the csv.reader each row of the csv file is fetched as a list of values, where each value represents a column value. So, selecting the 2nd & 3rd column for each row, select elements at index 1 and 2 from the list.
The complete code:
from csv import reader
from csv import DictReader
def main():
print('*** Read csv file line by line using csv module reader object ***')
print('*** Iterate over each row of a csv file as list using reader object ***')
# open file in read mode
with open('students.csv', 'r') as read_obj:
# pass the file object to reader() to get the reader object
csv_reader = reader(read_obj)
# Iterate over each row in the csv using reader object
for row in csv_reader:
# row variable is a list that represents a row in csv
print(row)
print('*** Read csv line by line without header ***')
# skip first line i.e. read header first and then iterate over each row od csv as a list
with open('students.csv', 'r') as read_obj:
csv_reader = reader(read_obj)
header = next(csv_reader)
# Check file as empty
if header != None:
# Iterate over each row after the header in the csv
for row in csv_reader:
# row variable is a list that represents a row in csv
print(row)
print('Header was: ')
print(header)
print('*** Read csv file line by line using csv module DictReader object ***')
# open file in read mode
with open('students.csv', 'r') as read_obj:
# pass the file object to DictReader() to get the DictReader object
csv_dict_reader = DictReader(read_obj)
# iterate over each line as a ordered dictionary
for row in csv_dict_reader:
# row variable is a dictionary that represents a row in csv
print(row)
print('*** select elements by column name while reading csv file line by line ***')
# open file in read mode
with open('students.csv', 'r') as read_obj:
# pass the file object to DictReader() to get the DictReader object
csv_dict_reader = DictReader(read_obj)
# iterate over each line as a ordered dictionary
for row in csv_dict_reader:
# row variable is a dictionary that represents a row in csv
print(row['Name'], ' is from ' , row['City'] , ' and he is studying ', row['Course'])
print('*** Get column names from header in csv file ***')
# open file in read mode
with open('students.csv', 'r') as read_obj:
# pass the file object to DictReader() to get the DictReader object
csv_dict_reader = DictReader(read_obj)
# get column names from a csv file
column_names = csv_dict_reader.fieldnames
print(column_names)
print('*** Read specific columns from a csv file while iterating line by line ***')
print('*** Read specific columns (by column name) in a csv file while iterating row by row ***')
# iterate over each line as a ordered dictionary and print only few column by column name
with open('students.csv', 'r') as read_obj:
csv_dict_reader = DictReader(read_obj)
for row in csv_dict_reader:
print(row['Id'], row['Name'])
print('*** Read specific columns (by column Number) in a csv file while iterating row by row ***')
# iterate over each line as a ordered dictionary and print only few column by column Number
with open('students.csv', 'r') as read_obj:
csv_reader = reader(read_obj)
for row in csv_reader:
print(row[1], row[2])
if __name__ == '__main__':
main()

I hope you understood this article as well as the code.

Happy reading!

Leave a Comment