We all love pickles. A pickle is made by preserving its main ingredient for a period of time, and preservation is the idea behind every type of pickle. Similarly, Python uses pickling to preserve, or store, Python objects. The process is also referred to as serialization, marshaling, or flattening.
What is Serialization in Python?
When we want to transmit data from one resource to another, or from one network to another, we need a data format that can easily be stored or sent. A Python object cannot be transmitted directly, so we convert it into a stream of bytes that can be stored or sent. This process is called serialization.
Python provides several modules to serialize data, such as json, marshal, and pickle. In this tutorial, we will learn about the Python pickle module and see how to serialize and deserialize Python objects with it.
What is de-serialization in Python?
De-serialization is the opposite of serialization. Data that has been converted into a transmittable format (using serialization) needs to be converted back into a Python object before a program can use it. This process of converting the serialized byte stream back into a Python object is known as de-serialization.
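As a minimal sketch of this round trip, the snippet below uses pickle.dumps() and pickle.loads(), which work on in-memory bytes instead of files. The variable names and sample data are illustrative only.

```python
import pickle

# An ordinary Python object (illustrative data)
original = {"name": "pickle", "versions": [1, 2, 3]}

# Serialization: Python object -> bytes
serialized = pickle.dumps(original)

# De-serialization: bytes -> a new, equal Python object
restored = pickle.loads(serialized)

print(type(serialized))      # <class 'bytes'>
print(restored == original)  # True
print(restored is original)  # False: it is a copy, not the same object
```

The restored object compares equal to the original but is a separate copy, which is what we would expect after sending data to another program.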
What is Pickle in Python?
Pickle is one of Python's standard library modules, used to serialize and de-serialize Python objects. Pickle can serialize most Python objects into a byte stream, including lists, strings, tuples, dictionaries, and instances of user-defined classes, and can convert them back into Python objects using de-serialization. Functions and classes are pickled by reference (their module and name), so they must be importable when the data is loaded.
Although many other programming languages support serialization, what makes pickling different is that it is specific to Python and can serialize a far wider range of Python objects than general-purpose formats, which support only a limited set of data types. Pickling is therefore a Python-specific form of serialization rather than a synonym for serialization in general.
JSON vs Pickle
JSON (JavaScript Object Notation) is one of the most widely used data formats for sending data over networks. Almost every programming language provides a library to serialize its data objects to JSON. Python also has a standard json module that can serialize Python objects to JSON and deserialize JSON back into Python objects. The differences between JSON and Pickle are:
- json is limited to a few Python data types: it can only serialize JSON-like data such as strings, numbers, booleans, None, lists, and dictionaries. pickle, in contrast, can serialize most Python objects.
- The json format does not depend on the Python version. With pickle, you might face errors when serializing and deserializing the same data with different Python versions, for example when an older version does not support the pickle protocol used to write the data.
- json provides a more suitable serialization format for sending data over the network, whereas pickle is more suitable for storing Python objects and sharing them between Python programs.
- With large data sets, pickle is often more efficient than json.
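The first difference can be seen in a short, hedged sketch. The sample dictionary is made up for illustration; json.dumps(), json.loads(), pickle.dumps(), and pickle.loads() are the standard library calls.

```python
import json
import pickle

# Illustrative data containing a tuple
data = {"point": (1, 2), "tags": ["a", "b"]}

# json has no tuple type, so the tuple comes back as a list
via_json = json.loads(json.dumps(data))
print(via_json["point"])    # [1, 2]

# pickle restores the original Python types
via_pickle = pickle.loads(pickle.dumps(data))
print(via_pickle["point"])  # (1, 2)

# json rejects types it does not know, such as set
try:
    json.dumps({"unique": {1, 2, 3}})
    json_error = None
except TypeError as exc:
    json_error = exc
    print("json failed:", exc)

# pickle handles the same set
set_round_trip = pickle.loads(pickle.dumps({"unique": {1, 2, 3}}))
print(set_round_trip == {"unique": {1, 2, 3}})  # True
```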
Python Objects that can be Pickled and Unpickled
Here is a list of Python data objects that can be serialized and deserialized using the Python Pickle library.
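- Basic Python data types (integer, float, string, Boolean, bytes, None)
- Python data containers (list, tuple, dictionary, set), as long as their contents can also be pickled
- Python functions and classes, provided they are defined at the top level of a module (lambda functions cannot be pickled)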
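The sketch below round-trips one sample from each category in the list above. To keep it runnable anywhere, it uses math.sqrt and collections.OrderedDict as the example function and class; these are stand-ins chosen for illustration, and any function or class importable from a module should behave the same way.

```python
import math
import pickle
from collections import OrderedDict

# One sample per category from the list above
samples = [
    42, 3.14, "text", True, b"raw", None,   # basic data types
    [1, 2], (1, 2), {"k": "v"}, {1, 2},     # containers
    math.sqrt, OrderedDict,                 # an importable function and class
]

round_trip_ok = all(pickle.loads(pickle.dumps(obj)) == obj for obj in samples)
print(round_trip_ok)  # True

# Functions and classes are stored by reference (module + name),
# so unpickling returns the very same object
same_function = pickle.loads(pickle.dumps(math.sqrt)) is math.sqrt
print(same_function)  # True

# A lambda has no importable name, so pickling it fails
try:
    pickle.dumps(lambda x: x * 2)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError) as exc:
    lambda_pickled = False
    print("cannot pickle lambda:", type(exc).__name__)
```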
Pickle Python Objects
Pickle Python List
Let's begin with pickling (serializing) a Python list into a file named pickle_file.pkl. Create a pickle_list.py Python script and code along.
#pickle_list.py
import pickle

# list object
fruits_list = ['Apples', 'Mangoes', 'Grapes', 'Peaches', 'Oranges']
filename = 'pickle_file.pkl'

# create pickle_file.pkl
# and serialize the Python object into it as a binary file
with open(filename, 'wb') as pickle_file:
    pickle.dump(fruits_list, pickle_file)

print(f"A file by name {filename} has been created with serialized data")
Break the code
In the first line, we import the pickle module using the import pickle statement. Pickle is part of the Python standard library, so we do not need to install it separately and can use it directly in our script. Then we define a list object named fruits_list, which is a list of fruits.
The filename identifier holds the file name for the pickled list. Using a Python context manager, we open the file named by filename in write binary mode 'wb' as the file object pickle_file.
It is important to open the file in write binary mode 'wb' when storing pickled (serialized) data. The pickle.dump(data_object, file_object) function accepts two arguments, data_object and file_object. It serializes data_object and writes the result to the file opened as file_object.
Now execute the program
python pickle_list.py
A file by name pickle_file.pkl has been created with serialized data
After executing the program, you will find a new file named pickle_file.pkl in the directory from which you ran pickle_list.py.
The pickle_file.pkl file contains the serialized Python list object in binary format. Opening the file directly shows only raw bytes that are not meant to be human-readable, so we need to read it in binary mode and deserialize it to get the fruits_list data back.
Unpickle Python List
Now let's create a new Python script named unpickle_list.py to deserialize and read the data that we stored in the above example as pickle_file.pkl.
#unpickle_list.py
import pickle

filename = 'pickle_file.pkl'

# load pickle_file.pkl
# and de-serialize the Python object
with open(filename, 'rb') as pickle_file:
    fruits_list = pickle.load(pickle_file)

print(fruits_list)
Output
['Apples', 'Mangoes', 'Grapes', 'Peaches', 'Oranges']
Break the code
To deserialize the 'pickle_file.pkl' file, we first open it in read binary mode using 'rb'. Then, using the pickle.load() function, we de-serialize the data stored in the pickle_file file object.
Wrapping Up
Now let's wrap up our article on the Python Pickle module. In this article, we learned what serialization & deserialization is, what the Pickle module in Python is, how to store serialized Python objects, and how to deserialize Python objects using the Pickle module.
To serialize and store a Python object, we first open a file in binary write mode "wb" and then serialize the object using the Pickle dump() function. The dump() function serializes the Python object and writes the serialized data into a .pkl file in a binary format that can be read and transmitted later. To deserialize and read data from the .pkl file, we first open it in binary read mode "rb" and then deserialize the data using the Pickle load() function.
Pickle's serialized output makes it easy to store Python objects or send them over a network, but do not confuse pickling with compression. Compression encodes data to reduce its size, whereas serialization only translates data into a form that can be stored or transmitted.
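The distinction can be illustrated with a short sketch that pickles some data and then compresses the result as a separate step. The use of the standard gzip module and the repetitive sample list are choices made for this example, not something the pickle module requires.

```python
import gzip
import pickle

# Repetitive illustrative data compresses well
data = ['Apples', 'Mangoes', 'Grapes'] * 1000

pickled = pickle.dumps(data)         # serialization only
compressed = gzip.compress(pickled)  # compression is a separate step

print(len(pickled), len(compressed))

# Reverse both steps in the opposite order
restored = pickle.loads(gzip.decompress(compressed))
print(restored == data)  # True
```

Pickling alone produces a byte stream; shrinking that stream is the job of the compression step applied afterwards.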