OpenPyXL - Working with CSV Files

Welcome to the second part in this series. This isn't covering features of OpenPyXL directly. But since a lot of the data brought into Excel is from CSV files. I thought we should get this out of the way. OpenPyXL doesn't load CSV files directly. What we can do is use the Python csv module to pull in the csv file, iterate over the data and create a workbook out of it. We can do a reverse process to go from Excel into a CSV file.

It's possible to load the CSV files manually into Excel and save as XLSX file first, but since we are trying to automate our work... 😎

These are the packages we'll need for either direction we're using.

import csv
import openpyxl

CSV to Excel

wb will be used to store a newly created OpenPyXL Workbook. ws will point to the active sheet in the new workbook.

wb = openpyxl.Workbook()
ws = wb.active

In this first version, we're adding each csv row as a row in the worksheet. The advantage of this version is it's a bit more straight forward and only a single loop which will make this a bit quicker.

The delimiter=',' option on the csv.reader() can be changed to whatever delimiter the csv file is using. The default is ,, so in this case, it could have been left out.

with open('classics.csv') as f:
    reader = csv.reader(f, delimiter=',')

    for row in reader:
        ws.append(row)

If we need a bit more control over the data going into each cell, we can loop over each row and place the values into each cell.

The outer for loop goes over each line of the CSV file. Using enumerate an index will be provided for each row. Most things in programming starts at zero, but a workbook starts at 1 for each index, row and column. We'll use the start=1 option so we don't need to use a +1. 😁

with open('classics.csv') as f:
  reader = csv.reader(f, delimiter=',')

  for row_index, row in enumerate(reader, start=1):
    for column_index, cell_value in enumerate(row, start=1):
      ws.cell(row=row_index, column=column_index).value=cell_value

The inner for loop will iterate row providing a cell value. Again, we're using enumerate to have an index.

The ws.cell() method will put the data from the CSV into each cell. We have the option here to modify the value before placing it in the cell.

At this point we have the CSV data in a OpenPyXL workbook. This workbook can be styled as needed before we finally save the workbook as an XLSX file.

wb.save('file.xlsx')

Excel to CSV

Keep in mind CSV doesn't support any formatting and only has a single sheet. If the Excel file has multiple sheets, you'll need to decide whether to combine them into a single csv file, use only 1 sheet from the workbook or create multiple CSV files.

wb will store the existing Excel workbook. ws will be the sheet we'll be putting in the CSV file. We're selecting the sheet that is the default active sheet. This is the sheet that opens when Excel opens the file. Normally this is the first one but doesn't have to be.

wb = openpyxl.load_workbook(filename='data.xlsx')

ws = workbook.active

Using a List Comprehension, we're creating a list based on each row on the sheet. The csv_data will be a list of the rows.

csv_data = [
  list(value)
  for value in ws.iter_rows(values_only=True)
]

ws.iter_rows(values_only=True) provides an iterable that provides a row object. Passing it to list() will create a list.

Using an open file, we'll use the csv.writer() to load the csv data row by row.

with open('data.csv', 'w') as file:
    writer = csv.writer(file, delimiter=',')
    for line in csv_data:
        writer.writerow(line)

Up Next

We'll look at loading an Excel file and do some basic things with a workbook.

Besides CSV and Excel files, are there other data sources you use when creating Excel reports? Let me know in the comments, I might do another post about pulling that in using Python.

Resources

ReadTheDocs : Openpyxl

pypi : Openpyxl

Python.org - CSV Package

OpenPyXL - Working with CSV Files

Part 2

CSV to Excel

Excel to CSV

Up Next

Resources

Did you find this article valuable?