CSV, or comma-separated values, is a simple file format for storing tabular data. It’s a popular format for data interchange, as it’s easy to read and write and it’s supported by a wide range of software. In this article we will look at two common ways to work with CSV files in Python:
- Using the built-in csv library
- Using the popular pandas data processing library
Reading CSV Files
The first step in working with a CSV file is to read it. We can do this using the csv.reader function, which returns an iterator that yields each row of the CSV file as a list of strings. Here’s an example:
import csv
# newline='' lets the csv module handle line endings itself
with open('data.csv', newline='') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        print(row)
This code opens the file data.csv, creates a csv.reader object, and then iterates over the rows of the file, printing each one. The output will look something like this:
['ID', 'Name', 'Email']
['1', 'John Doe', 'jdoe@example.com']
['2', 'Jane Smith', 'jane.smith@example.com']
Each row is represented as a list of values, with the values in the same order as they appear in the file.
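Because the first row of this file holds the column names, it is often convenient to separate it from the data rows. The following is a minimal sketch of one way to do that. It assumes the same three rows shown above and uses an in-memory string (via io.StringIO) as a stand-in for data.csv so that it runs on its own:

```python
import csv
import io

# Stand-in for data.csv, so the example runs without a file on disk.
sample = io.StringIO(
    "ID,Name,Email\n"
    "1,John Doe,jdoe@example.com\n"
    "2,Jane Smith,jane.smith@example.com\n"
)

reader = csv.reader(sample)
header = next(reader)  # the first row: column names
rows = list(reader)    # the remaining rows: the data

print(header)
for row in rows:
    print(row)
```

Note that every value, including the ID, comes back as a string. Converting to numbers is left to your own code.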
Writing CSV Files
Writing to CSV files is just as easy as reading from them. We can use the csv.writer function to create a writer object that writes data to a CSV file. Here’s an example:
import csv
data = [
['ID', 'Name', 'Email'],
['1', 'John Doe', 'jdoe@example.com'],
['2', 'Jane Smith', 'jane.smith@example.com']
]
# newline='' prevents extra blank lines between rows on Windows
with open('data.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    for row in data:
        writer.writerow(row)
This code opens the file data.csv in write mode, creates a csv.writer object, and then writes each row of data to the file. The resulting file will look like this:
ID,Name,Email
1,John Doe,jdoe@example.com
2,Jane Smith,jane.smith@example.com
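If the rows are already collected in a list, the loop can be replaced by a single call to writer.writerows. Here is a small sketch of that variation. It writes to an in-memory buffer (io.StringIO) instead of a file purely so the result can be inspected; with a real file you would pass the file object opened in write mode instead:

```python
import csv
import io

data = [
    ['ID', 'Name', 'Email'],
    ['1', 'John Doe', 'jdoe@example.com'],
    ['2', 'Jane Smith', 'jane.smith@example.com'],
]

# An in-memory buffer stands in for the open file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerows(data)  # writes every row in one call

output = buffer.getvalue()
print(output)
```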
Working with Different Delimiters
By default, the csv module uses the comma (,) character as the delimiter for separating values in a CSV file. But sometimes, you might encounter CSV files that use a different delimiter, such as the semicolon (;) or the pipe (|) character. In these cases, you can specify the delimiter when creating the csv.reader or csv.writer object. Here’s an example:
import csv
with open('data.csv', newline='') as csvfile:
    # tell the reader that values are separated by semicolons
    reader = csv.reader(csvfile, delimiter=';')
    for row in reader:
        print(row)
This code will read a CSV file where the values are separated by semicolons instead of commas. You can use the same delimiter argument with csv.writer to specify a different delimiter when writing to a CSV file.
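To illustrate the writing side, here is a short sketch that writes semicolon-separated rows and then reads them back with the same delimiter. The rows are a shortened version of the sample data, and an in-memory buffer is used in place of a file so the example is self-contained:

```python
import csv
import io

rows = [
    ['ID', 'Name', 'Email'],
    ['1', 'John Doe', 'jdoe@example.com'],
]

# Write the rows using a semicolon as the delimiter.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=';')
writer.writerows(rows)
text = buffer.getvalue()
print(text)

# Read the text back, using the same delimiter.
round_trip = list(csv.reader(io.StringIO(text, newline=''), delimiter=';'))
print(round_trip)
```

The reader and the writer must agree on the delimiter. Reading this text with the default comma would give one long value per row instead of three.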
Working with CSV Files using Python and Pandas
pandas is a powerful and very widely used data processing library for Python. It is not part of the standard library, so to use it you’ll need to install it using pip:
pip install pandas
Once you have Pandas installed, you can start using it to work with CSV files in Python. Here’s an example of how to read a CSV file and print the data to the console.
For this exercise you will need a CSV file called data.csv in the same directory as your Python script:
import pandas as pd
# read the CSV file into a Pandas DataFrame
df = pd.read_csv("data.csv")
# print the data to the console
print(df)
This code will read the CSV file and store the data in a DataFrame, which is a powerful data structure for working with tabular data in Pandas. By default, the first row of the file is used for the column names. The call to print will display the data in the console.
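If you don’t have a data.csv file to hand, the sketch below shows the same idea using the sample rows from earlier in this article. pd.read_csv accepts a file-like object as well as a path, so an in-memory string wrapped in io.StringIO is used here as a stand-in for the file:

```python
import io

import pandas as pd

# Stand-in for data.csv, so the example runs without a file on disk.
csv_text = (
    "ID,Name,Email\n"
    "1,John Doe,jdoe@example.com\n"
    "2,Jane Smith,jane.smith@example.com\n"
)

df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column names taken from the first row
print(df)
```

Unlike csv.reader, pandas tries to infer a type for each column, so the ID values here are read as integers, not strings.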
Once you have the data in a DataFrame, you can start working with it. For example, you can access specific columns of data by their name:
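# access the "Name" column (column names are case-sensitive)
names = df["Name"]
# print the names to the console
print(names)
This code will extract the “Name” column from the DataFrame and store it in a new variable called names. Then, it will print the names to the console. The column name must match the header in the file exactly, so with the sample data shown earlier it is “Name”, not “name”.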
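Looking up a column that doesn’t exist raises a KeyError, and column names are case-sensitive, so it can help to check which names are available first. A minimal sketch, again assuming the sample rows from earlier in the article and an in-memory stand-in for data.csv:

```python
import io

import pandas as pd

csv_text = (
    "ID,Name,Email\n"
    "1,John Doe,jdoe@example.com\n"
    "2,Jane Smith,jane.smith@example.com\n"
)
df = pd.read_csv(io.StringIO(csv_text))

# Column names are matched exactly, including upper and lower case.
has_lowercase = "name" in df.columns
has_exact = "Name" in df.columns
print(has_lowercase, has_exact)

# Select the column and convert it to a plain Python list.
names = df["Name"].tolist()
print(names)
```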
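You can also use Pandas to perform operations on the data, such as calculating the mean or sum of a column. Here’s an example, which assumes your file has a numeric column called “price” (the sample data.csv shown earlier does not have one):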
# calculate the mean of the "price" column
mean_price = df["price"].mean()
# print the mean price to the console
print(mean_price)
This code will calculate the mean of the “price” column and store it in a variable called mean_price. Then, it will print the mean price to the console. This is just scratching the surface of what pandas can do to help you to work with data in Python. It’s certainly worth exploring further if this is something which interests you.
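To try the mean and sum calculations without preparing a file, here is a small self-contained sketch. The “item” and “price” columns and their values are made up for illustration; they are not part of the sample data used elsewhere in this article:

```python
import io

import pandas as pd

# Hypothetical data with a numeric "price" column.
csv_text = (
    "item,price\n"
    "apple,10.0\n"
    "banana,20.0\n"
    "cherry,30.0\n"
)
df = pd.read_csv(io.StringIO(csv_text))

mean_price = df["price"].mean()  # average of the column
total_price = df["price"].sum()  # sum of the column

print(mean_price)
print(total_price)
```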
In this post, we have seen how to work with CSV files in Python. The csv module provides a number of useful functions for reading and writing CSV data, making it a convenient choice for working with tabular data in Python. However, for more complex tasks you may want to check out the pandas library.
Happy computing!