XLSX to CSV in Python

Python can work with many file formats, including XLSX and CSV. The “XLSX” extension is used by Microsoft Excel spreadsheets, while a CSV file stores data as comma-separated values in plain text. When working with these files, users sometimes need to convert an XLSX file into CSV for data manipulation. Python offers several ways to do this.

This write-up presents a detailed guide on converting XLSX files to CSV in Python using several examples.

How to Convert XLSX to CSV in Python?

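To convert XLSX to CSV in Python, two methods are covered here: the “to_csv()” method of the “pandas” module, and the “openpyxl” module combined with the “csv” module.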

Method 1: Convert XLSX to CSV Using the “to_csv()” Method in Python

The “to_csv()” method of a Pandas DataFrame writes the contents of the DataFrame to a CSV (comma-separated values) file.

Syntax

DataFrame.to_csv(path_or_buf=None, sep=',', na_rep='', float_format=None, columns=None, header=True, index=True, index_label=None, mode='w', encoding=None, compression='infer', quoting=None, quotechar='"', lineterminator=None, chunksize=None, date_format=None, doublequote=True, escapechar=None, decimal='.', errors='strict', storage_options=None)

For a detailed understanding of this syntax and method, you can check the article named Pandas Export to CSV.
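As a rough illustration of a few of these parameters, the sketch below calls “to_csv()” on a small, made-up DataFrame. When “path_or_buf” is left as None, the method returns the CSV text as a string instead of writing a file, which makes the effect of “sep”, “na_rep”, and “index” easy to see. The column names and values are invented for this example.

```python
import pandas

# Hypothetical sample data; None is treated as a missing value
df = pandas.DataFrame({"name": ["Ann", "Bob"], "city": ["Oslo", None]})

# With no path given, to_csv() returns the CSV text as a string
default_text = df.to_csv()

# Use a semicolon separator, write "NA" for missing values,
# and leave out the index column
custom_text = df.to_csv(sep=";", na_rep="NA", index=False)

print(default_text)
print(custom_text)
```

The first call keeps the default settings, so the index is written as an unnamed first column. The second call shows how the same data looks once the separator, the missing-value marker, and the index option are changed.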

This method can be used to convert an XLSX file into a CSV file. Let’s understand this via the following examples.

Example 1: Convert a Single XLSX File to a CSV File

In this example, an “XLSX” file named “new.xlsx” is converted into a CSV file using the “to_csv()” method.

This file contains the following data:

To convert the “XLSX” file into a “CSV” file, first read the Excel file using the “pandas.read_excel()” function, which returns a Pandas DataFrame. By default, this function reads the first sheet of the workbook, and it needs the “openpyxl” package to be installed in order to read “.xlsx” files. After that, call the “to_csv()” method on the DataFrame (stored here in the variable “file”) to write the data into a “sample.csv” file. Take the following code as an example:

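import pandas
# Read the first sheet of the workbook into a DataFrame
file = pandas.read_excel("new.xlsx")
# Write the DataFrame to CSV without the index column
file.to_csv("sample.csv", index=False, header=True)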

The XLSX file has been converted into a CSV file successfully. Here is the CSV data:

The file properties shown below verify that the file has been converted:
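Apart from inspecting the file properties, the result can also be checked in code by reading both files back and comparing them. The following is only a minimal sketch: the helper name “same_table” is hypothetical, and it compares just the column names and the number of rows and columns, which avoids false alarms caused by data type differences between the Excel and CSV readers.

```python
import pandas


def same_table(xlsx_path, csv_path):
    """Return True if both files have the same column names and shape."""
    excel_df = pandas.read_excel(xlsx_path)
    csv_df = pandas.read_csv(csv_path)
    # Compare the column names as strings, since the two readers may
    # return different types for the same header
    same_columns = [str(c) for c in excel_df.columns] == [str(c) for c in csv_df.columns]
    return same_columns and excel_df.shape == csv_df.shape
```

For the files used in this example, the check would be called as same_table("new.xlsx", "sample.csv").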

Example 2: Convert Multiple XLSX Files to CSV Files

We can also convert multiple XLSX files into CSV files using the “.to_csv()” method along with other functions. The following snippet shows the XLSX files:

In the code below, the source and destination directory paths are assigned to the variables “file1” and “file2”. A for loop then uses “enumerate()” to step through the file names obtained from “os.listdir()” for the source directory, which is expected to contain the XLSX files. Inside the loop, “pandas.read_excel()” reads each XLSX file into a DataFrame, and the “.to_csv()” method writes that DataFrame to a numbered CSV file in the destination directory.

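import os
import pandas

file1 = r"C:\Users\p\Documents\program"  # source directory with the XLSX files
file2 = r"C:\Users\p\Documents\program"  # destination directory for the CSV files
# Keep only the XLSX files so that other files are not passed to read_excel()
xlsx_files = [f for f in os.listdir(file1) if f.lower().endswith(".xlsx")]
for count, f_name in enumerate(xlsx_files):
    file3 = pandas.read_excel(os.path.join(file1, f_name))
    z = os.path.join(file2, "excel_file " + str(count) + ".csv")
    file3.to_csv(z, index=False, header=True)
print("converted to csv!")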

The multiple XLSX files have been converted into CSV:
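A possible variation, shown here only as a sketch, keeps the original file names instead of numbering the output files. It uses the standard “pathlib” module, and the function name “convert_folder” and its parameters are hypothetical, not part of pandas.

```python
from pathlib import Path

import pandas


def convert_folder(src_dir, dst_dir):
    """Convert every .xlsx file in src_dir to a same-named .csv file in dst_dir."""
    src = Path(src_dir)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)  # create the destination if needed
    written = []
    for xlsx_path in sorted(src.glob("*.xlsx")):
        if xlsx_path.name.startswith("~$"):
            continue  # skip the temporary lock files Excel creates for open workbooks
        csv_path = dst / (xlsx_path.stem + ".csv")
        pandas.read_excel(xlsx_path).to_csv(csv_path, index=False)
        written.append(csv_path)
    return written
```

With this approach, a file such as “report.xlsx” would become “report.csv”, which makes it easier to match each CSV file to its source workbook.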

Method 2: Convert XLSX to CSV Utilizing Openpyxl and CSV Modules

The “openpyxl” module can also be combined with the built-in “csv” module to convert an XLSX file into a CSV file. Before moving to the code example, install “openpyxl” using the command below:

pip install openpyxl

In this code, the “openpyxl.load_workbook()” function opens the Excel file, and the “wb.active” attribute returns the workbook’s active worksheet. The “open()” function opens the CSV file for writing, and “csv.writer()” creates the writer object. Next, a for loop goes through the rows of the worksheet and writes the values of each row to the CSV file using the “writerow()” method. Take the code below as an example:

import openpyxl
import csv
wb = openpyxl.load_workbook(filename="new.xlsx")
ws = wb.active
with open("sample.csv", "w", newline="") as f:
    c = csv.writer(f)
    for row in ws.rows:
        c.writerow([cell.value for cell in row])

The specified XLSX file has been converted into a CSV file successfully:
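The code above converts only the active sheet. If a workbook has several sheets, one option is to write each sheet to its own CSV file. The sketch below is one way this might look: the function name “workbook_to_csv_files” and the “prefix_sheetname.csv” naming scheme are assumptions made for illustration. It opens the workbook in read-only mode and uses “iter_rows(values_only=True)”, which yields plain cell values instead of cell objects.

```python
import csv

import openpyxl


def workbook_to_csv_files(xlsx_path, prefix):
    """Write each worksheet to '<prefix>_<sheet title>.csv' and return the file names."""
    # data_only=True returns stored formula results instead of formula text
    wb = openpyxl.load_workbook(filename=xlsx_path, read_only=True, data_only=True)
    written = []
    try:
        for ws in wb.worksheets:
            out_name = f"{prefix}_{ws.title}.csv"
            with open(out_name, "w", newline="", encoding="utf-8") as f:
                writer = csv.writer(f)
                # values_only=True yields a tuple of cell values for each row
                for row in ws.iter_rows(values_only=True):
                    writer.writerow(row)
            written.append(out_name)
    finally:
        wb.close()  # read-only workbooks keep the file open until closed
    return written
```

Read-only mode is chosen here because it avoids loading the whole workbook into memory, which can help with large files. Empty cells are returned as None, and the “csv” writer stores them as empty fields.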

Conclusion

To convert XLSX to CSV in Python, either the “to_csv()” method of the “pandas” module or the “openpyxl” module combined with the “csv” module can be used. The “to_csv()” method can convert a single XLSX file or, inside a loop, multiple XLSX files into CSV files. This guide presented a detailed overview of converting XLSX to CSV using several examples.

Source: linuxhint.com