Python provides several ways to read and write CSV files. Among them, the standard csv module and the pandas library offer the most straightforward methods. As with any plain text file, we can also use Python file handling and the open() function to read a CSV file.
In this Python tutorial, we will discuss how to use the csv module and the pandas library for reading and writing data to CSV files. By the end of this tutorial, you will have a solid idea of what a CSV file is and how to handle CSV files in Python. So, let's start.
What is a CSV File?
A CSV (Comma-Separated Values) file is a simple text file that typically has the .csv file extension. Unlike a free-form text file, the data inside a CSV file must be organized in a tabular format, and as the name suggests, the data values are separated by commas. As with the tables of relational databases, every row (line) of the CSV file represents a record, and every column represents a specific data field. Consider the following example of a CSV file:
#movies.csv
movieId,title,genres
1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
2,Jumanji (1995),Adventure|Children|Fantasy
3,Grumpier Old Men (1995),Comedy|Romance
4,Waiting to Exhale (1995),Comedy|Drama|Romance
5,Father of the Bride Part II (1995),Comedy
6,Heat (1995),Action|Crime|Thriller
7,Sabrina (1995),Comedy|Romance
A CSV file can also be opened in MS Excel, which displays the CSV data as a table.
In the movies.csv file above, you can see that the values in each row are separated by commas, and every record ends with a new line. Next, let's discuss how we can read and write data in a CSV file in Python.
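Because the format is plain text, the row-and-column structure described above can be illustrated without any CSV library. The following is a minimal sketch, assuming that no value contains a comma or a quote character; the sample_text variable is only a stand-in for the contents of movies.csv.

```python
# Sample CSV text standing in for the contents of movies.csv (illustrative only)
sample_text = """movieId,title,genres
1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
2,Jumanji (1995),Adventure|Children|Fantasy
"""

# Each line is one record; each comma separates two fields
lines = sample_text.splitlines()
header = lines[0].split(",")
records = [line.split(",") for line in lines[1:]]

print(header)
for record in records:
    print(record)
```

Splitting on commas by hand breaks as soon as a value itself contains a comma (for example, a quoted title), which is one reason to prefer a dedicated CSV parser.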
Python CSV Module
Python comes with a powerful standard csv module for reading and writing CSV files. To use the csv module, we first have to import it with the following Python import statement:
import csv
Create a CSV file in Python and Write Data
Let's start by creating a CSV file using Python and writing some data to it. Although we could simply use the file object's write() method to write data to a CSV file, here we will use csv.writer() to create a writer object and then call its writerow() method to write the data row by row.
Example: Write a CSV File in Python
import csv

# open the file for writing (creates it if it does not exist)
with open("movies.csv", 'w', newline="") as file:
    writer = csv.writer(file)

    # write the header row, then one row per movie
    writer.writerow(["movieId", "title", "genres"])
    writer.writerow(["1", "Toy Story (1995)", "Adventure|Animation|Children|Comedy|Fantasy"])
    writer.writerow(["2", "Jumanji (1995)", "Adventure|Children|Fantasy"])
    writer.writerow(["3", "Grumpier Old Men (1995)", "Comedy|Romance"])
    writer.writerow(["4", "Waiting to Exhale (1995)", "Comedy|Drama|Romance"])
From the above example, you can see that in order to write a CSV file in Python, you first need to open it using the open() function. Because the file name is a relative path, executing the program creates movies.csv in the current working directory, which is the script's directory only if you run the script from there.
#movies.csv
movieId,title,genres
1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
2,Jumanji (1995),Adventure|Children|Fantasy
3,Grumpier Old Men (1995),Comedy|Romance
4,Waiting to Exhale (1995),Comedy|Drama|Romance
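In the above example, when we open the file using the open("movies.csv", 'w', newline="") statement, we also pass newline="". This turns off the file object's newline translation so that the csv module can write its own row terminators. Without it, an extra blank line can appear between two records on platforms that translate line endings, notably Windows.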
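One detail worth seeing directly is that csv.writer() ends each row with "\r\n" by default. The sketch below makes the terminators visible; it writes to an in-memory io.StringIO buffer instead of a real file, which is an assumption made only to keep the example self-contained.

```python
import csv
import io

# An in-memory text buffer stands in for a real file (illustration only)
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["movieId", "title"])
writer.writerow(["1", "Toy Story (1995)"])

# repr() makes the row terminators visible
written = buffer.getvalue()
print(repr(written))
```

If a real file is opened in text mode without newline="" on Windows, the "\n" inside each "\r\n" is translated again, which is where the blank lines between records come from.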
Write CSV Data in Python Using the writerows() Method
In the above example, we wrote data to our movies.csv file using the writerow() method. Because writerow() writes a single row, we had to call it once for every row. However, there is a better way to do it. The writer object returned by csv.writer() also provides the writer.writerows() method, which can write multiple data rows to the CSV file with just one call.
Python Example:
Write Multiple Rows in a csv File with writerows()
Let's continue with the above example and append new rows of movie data to our movies.csv file using the writer.writerows() method.
import csv

movies_rows = [
    ["5", "Father of the Bride Part II (1995)", "Comedy"],
    ["6", "Heat (1995)", "Action|Crime|Thriller"],
    ["7", "Sabrina (1995)", "Comedy|Romance"],
]

# open movies.csv in append mode so the existing rows are kept
with open("movies.csv", 'a', newline="") as file:
    writer = csv.writer(file)

    # write all rows with a single call
    writer.writerows(movies_rows)
In this example, we append new data to our movies.csv file by opening it in the "a" (append) mode. When you execute this program, three more rows are added to the end of the movies.csv file.
movieId,title,genres
1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
2,Jumanji (1995),Adventure|Children|Fantasy
3,Grumpier Old Men (1995),Comedy|Romance
4,Waiting to Exhale (1995),Comedy|Drama|Romance
5,Father of the Bride Part II (1995),Comedy
6,Heat (1995),Action|Crime|Thriller
7,Sabrina (1995),Comedy|Romance
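The difference between the "w" and "a" modes can be checked end to end with a short, self-contained sketch. It works inside a temporary directory so that it does not touch your real movies.csv; the file name movies_demo.csv and the shortened rows are hypothetical and used only for illustration.

```python
import csv
import os
import tempfile

first_rows = [["movieId", "title"], ["1", "Toy Story (1995)"]]
more_rows = [["2", "Jumanji (1995)"], ["3", "Heat (1995)"]]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "movies_demo.csv")  # hypothetical file name

    # "w" creates the file (or truncates an existing one)
    with open(path, "w", newline="") as file:
        csv.writer(file).writerows(first_rows)

    # "a" keeps the existing rows and adds the new ones at the end
    with open(path, "a", newline="") as file:
        csv.writer(file).writerows(more_rows)

    # read everything back to confirm the order of the rows
    with open(path, "r", newline="") as file:
        all_rows = list(csv.reader(file))

print(all_rows)
```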
Note:
The default delimiter of csv.writer() is the comma, which makes sense for a comma-separated values file. If you want to use some other single character, such as $, > or <, you can pass the delimiter parameter to csv.writer().
writer = csv.writer(file, delimiter= ">")
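A custom delimiter raises the question of what happens when a value contains that character. The following sketch, which writes to an in-memory buffer and uses a made-up value "score>9", suggests that the writer handles this by quoting the affected field under its default quoting rules.

```python
import csv
import io

buffer = io.StringIO()  # in-memory stand-in for a file
writer = csv.writer(buffer, delimiter=">")

# The last value contains the delimiter itself, so the writer quotes it
writer.writerow(["1", "Toy Story (1995)", "score>9"])

line = buffer.getvalue()
print(repr(line))
```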
Python CSV Read Data
Now that you know how to write data to a CSV file, let's discuss how you can read data from a CSV file using the Python csv module. To parse a CSV file in Python, we can use csv.reader(). In the above examples, we created a movies.csv file and wrote some data to it. Now, let's read the data from the same movies.csv file.
Example:
Python Parse CSV File and Read Data Using csv.reader()
csv.reader() parses the CSV file and returns a reader object. The reader is an iterator that yields each row of the file as a list of strings, one string per comma-separated value. As with other iterable objects, we can use a Python for loop to iterate over the value returned by reader().
import csv

# open movies.csv for reading; newline="" lets the csv module handle line endings
with open("movies.csv", 'r', newline="") as file:
    rows = csv.reader(file)

    # each row is a list of strings
    for row in rows:
        print(row)
Output
['movieId', 'title', 'genres']
['1', 'Toy Story (1995)', 'Adventure|Animation|Children|Comedy|Fantasy']
['2', 'Jumanji (1995)', 'Adventure|Children|Fantasy']
['3', 'Grumpier Old Men (1995)', 'Comedy|Romance']
['4', 'Waiting to Exhale (1995)', 'Comedy|Drama|Romance']
['5', 'Father of the Bride Part II (1995)', 'Comedy']
['6', 'Heat (1995)', 'Action|Crime|Thriller']
['7', 'Sabrina (1995)', 'Comedy|Romance']
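Since the reader returns every value as a string and treats the header like any other row, a common follow-up step is to skip the header and convert values yourself. The sketch below is one possible way to do that, assuming the three-column layout of movies.csv; it reads from an in-memory sample rather than the file so that it runs on its own.

```python
import csv
import io

# In-memory sample with the same layout as movies.csv (illustration only)
sample = io.StringIO("""movieId,title,genres
1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
6,Heat (1995),Action|Crime|Thriller
""")

reader = csv.reader(sample)
header = next(reader)  # consume the header row before looping

movies = []
for movie_id, title, genres in reader:
    # convert the id to int and split the pipe-separated genres into a list
    movies.append((int(movie_id), title, genres.split("|")))

print(header)
print(movies)
```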
Note:
By default, csv.reader() splits each line of the CSV file on the comma (,) delimiter. If your CSV file uses a different delimiter, such as >, \t, $, or @, you can explicitly pass the delimiter parameter to csv.reader().
rows = csv.reader(file, delimiter=">")
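As a concrete case, tab-separated data can be read by passing "\t" as the delimiter. This is a minimal sketch using a small in-memory sample; the tab-separated content is made up for the example.

```python
import csv
import io

# Tab-separated sample data (illustration only)
tab_data = io.StringIO("movieId\ttitle\n1\tToy Story (1995)\n2\tJumanji (1995)\n")

# delimiter="\t" tells the reader to split on tabs instead of commas
rows = list(csv.reader(tab_data, delimiter="\t"))
print(rows)
```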
Parse the CSV File to Dict in Python
The Python csv module provides the csv.DictReader class, which parses each row of the CSV file into a Python dictionary. csv.DictReader() returns an iterable DictReader object that uses the first row of the file as the keys and yields one dictionary of column: value pairs for every remaining row.
Example
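import csv

# open movies.csv for reading; newline="" lets the csv module handle line endings
with open("movies.csv", 'r', newline="") as file:
    rows = csv.DictReader(file)

    # each row is a dictionary keyed by the column names
    for row in rows:
        print(row)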
Output
{'movieId': '1', 'title': 'Toy Story (1995)', 'genres': 'Adventure|Animation|Children|Comedy|Fantasy'}
{'movieId': '2', 'title': 'Jumanji (1995)', 'genres': 'Adventure|Children|Fantasy'}
{'movieId': '3', 'title': 'Grumpier Old Men (1995)', 'genres': 'Comedy|Romance'}
{'movieId': '4', 'title': 'Waiting to Exhale (1995)', 'genres': 'Comedy|Drama|Romance'}
{'movieId': '5', 'title': 'Father of the Bride Part II (1995)', 'genres': 'Comedy'}
{'movieId': '6', 'title': 'Heat (1995)', 'genres': 'Action|Crime|Thriller'}
{'movieId': '7', 'title': 'Sabrina (1995)', 'genres': 'Comedy|Romance'}
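The practical benefit of dictionaries is that values can be accessed by column name instead of by position. The following sketch filters movies by genre; it reads from an in-memory sample with the same columns as movies.csv so that it is runnable as is.

```python
import csv
import io

# In-memory sample with the same columns as movies.csv (illustration only)
sample = io.StringIO("""movieId,title,genres
3,Grumpier Old Men (1995),Comedy|Romance
6,Heat (1995),Action|Crime|Thriller
7,Sabrina (1995),Comedy|Romance
""")

reader = csv.DictReader(sample)

# Look up values by column name rather than by index
comedies = [row["title"] for row in reader if "Comedy" in row["genres"].split("|")]

print(reader.fieldnames)
print(comedies)
```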
Reading and Writing CSV Files in Python Using the Pandas Library
pandas is one of the most powerful Python libraries for data science. It comes with many built-in methods and features, and it is widely used for data manipulation and analysis. Using this library, we can read and write data in different file formats, including CSV. In this Python tutorial, however, we will only discuss writing and reading CSV files using pandas. Unlike the Python csv module, pandas does not come pre-installed with Python. Therefore, before using the pandas library, make sure you have installed it. You can install pandas for your Python environment with the following pip command:
pip install pandas
Write a CSV File with the Pandas to_csv() Method
Creating or writing data in CSV files in Python using pandas takes one more step than with the Python csv module. That's because before creating a CSV file and writing data into it, we have to create a pandas DataFrame. A pandas DataFrame can be understood as a two-dimensional table with labeled rows and columns.
Example
import pandas as pd
#2d array of movies
movies_rows = [
['1', 'Toy Story (1995)', 'Adventure|Animation|Children|Comedy|Fantasy'],
['2', 'Jumanji (1995)', 'Adventure|Children|Fantasy'],
['3', 'Grumpier Old Men (1995)', 'Comedy|Romance'],
['4', 'Waiting to Exhale (1995)', 'Comedy|Drama|Romance'],
['5', 'Father of the Bride Part II (1995)', 'Comedy'],
['6', 'Heat (1995)', 'Action|Crime|Thriller'],
['7', 'Sabrina (1995)', 'Comedy|Romance'],
]
heading = ['movieId', 'title', 'genres']
#pandas dataframe
movies = pd.DataFrame(movies_rows, columns=heading)
#create the movies.csv file from dataframe
movies.to_csv("movies.csv")
This will create a movies.csv file in the current working directory. Because to_csv() also writes the DataFrame index by default, the file starts with an extra, unnamed first column:
,movieId,title,genres
0,1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
1,2,Jumanji (1995),Adventure|Children|Fantasy
2,3,Grumpier Old Men (1995),Comedy|Romance
3,4,Waiting to Exhale (1995),Comedy|Drama|Romance
4,5,Father of the Bride Part II (1995),Comedy
5,6,Heat (1995),Action|Crime|Thriller
6,7,Sabrina (1995),Comedy|Romance
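If the extra index column is not wanted, to_csv() accepts an index parameter. The sketch below passes index=False and omits the file path so that to_csv() returns the CSV text instead of writing a file, which keeps the example self-contained; the two shortened rows are sample data only.

```python
import pandas as pd

# Two shortened sample rows (illustration only)
movies = pd.DataFrame(
    [["1", "Toy Story (1995)", "Comedy"], ["2", "Jumanji (1995)", "Fantasy"]],
    columns=["movieId", "title", "genres"],
)

# With no path, to_csv() returns the CSV text instead of writing a file;
# index=False leaves out the row index column
csv_text = movies.to_csv(index=False)
print(csv_text)
```

To write the same content to disk, the call would be movies.to_csv("movies.csv", index=False).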
Read from a CSV File in Python Using the pandas read_csv() Method
To read a CSV file in Python using pandas, we use the pd.read_csv() function. read_csv() accepts the CSV file name (or path) as a parameter and returns a pandas DataFrame.
Example:
import pandas as pd
df = pd.read_csv("movies.csv")
print(df)
Output
Unnamed: 0 ... genres
0 0 ... Adventure|Animation|Children|Comedy|Fantasy
1 1 ... Adventure|Children|Fantasy
2 2 ... Comedy|Romance
3 3 ... Comedy|Drama|Romance
4 4 ... Comedy
5 5 ... Action|Crime|Thriller
6 6 ... Comedy|Romance
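The "Unnamed: 0" column in this output is the index that was saved into the file, read back as ordinary data. One way to avoid it, sketched below, is to tell read_csv() that the first column holds the index by passing index_col=0. The example reads from an in-memory copy of the first lines of the file so that it runs without movies.csv being present.

```python
import io

import pandas as pd

# In-memory copy of a CSV that was saved together with its index (illustration only)
saved = io.StringIO(
    ",movieId,title,genres\n"
    "0,1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy\n"
    "1,2,Jumanji (1995),Adventure|Children|Fantasy\n"
)

# index_col=0 uses the first column as the row index instead of as data
df = pd.read_csv(saved, index_col=0)
print(df.columns.tolist())
print(df)
```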
Conclusion
If you just want to parse CSV files for reading and writing data, then you should use the Python standard csv module, because pulling in pandas for simple read and write operations adds a heavy third-party dependency that the task does not need. To write data to a CSV file using the standard csv module, we create a writer object with csv.writer() and call its writerow() or writerows() method. To read data from a CSV file, we can use csv.reader() or csv.DictReader(). In pandas, we first create a DataFrame and then write its data to a CSV file with the DataFrame's to_csv() method, and to read data from a CSV file, we use the pandas read_csv() function.