Let's say you have a PDF file that is locked or protected with a password, and you wish to crack that password with a Python program. So how to crack PDF file passwords in Python?
Well, there are two approaches you can follow to crack the password of a PDF using Python code. In the first approach, you can create a key for the corresponding encryption technique that encrypted your PDF.
Doing so, however, is hard because it is near to impossible to revert back an encrypting technique and retrieve a correct key because these days, to encrypt a file, the application can use more than a single encryption technique.
Another approach you can use to crack PDF files passwords in Python is the " Brute Force Approach, " in which you have a list of thousands or millions of vulnerable and strong passwords.
With the help of Python, you create a program that automates the process of passing the listed passwords to open the locked pdf file.
How to Crack PDF Files Passwords in Python?
In this Python program, we will be using the Brute Force approach to crack the password of a pdf file. The password for the file is 1a2b3c . Before discussing the Python program, let's install the required dependencies.
Installing and Setting Up Dependencies
1) The Python
pikepdf
Library
The
pikepdf
library is an open-source Python library that is used to handle and manipulate PDF files. For this tutorial, we will be using this Python library to open the locked pdf file that is protected with a password. To install the
pikepdf
library for your Python environment, run the following Python pip install command on your terminal or command prompt:
pip install pikepdf
2) The passwords_list.txt File
As we will be using the Brute Force Approach for this Python tutorial, we need to save the
passwords_list.txt
file that contains 10 million passwords that we will be applying to our pdf file. You download the
passwords_list.txt
file from the above link or copy and paste all the passwords from
here
.
We would suggest you save the
passwords_list.txt
file in the same directory where your Python PDF password-protected file is located, so you can easily load the text file with a relative path.
Python Program to Crack PDF Files Passwords in Python
First, let's import the
pikepdf
library to our Python script.
import pikepdf
Now, declare two variables,
passwords_filename
, and
locked_pdf_file
that hold the filename of the passwords file and the locked pdf file.
passwords_filename = "passwords_list.txt"
locked_pdf_file = "my_locked.pdf"
password_list.txt
file by the name
file
. Inside it, we will loop through every password and try to open the locked file. The
try
and
except
blocks will handle if the password is correct or not.
#load passwords file
with open(passwords_filename) as file:
passwords_list = file.readlines()
total_passwords = len(passwords_list)
for index,password in enumerate(passwords_list):
#try if password is correct
try:
with pikepdf.open(locked_pdf_file, password = password.strip()) as pdf_file:
print("\n++++++++++++++++++++++SUCCESS+++++++++++++++")
print("Success---------- File is Unlocked and the password is: ", password)
break
#if password fail
except:
print("\n=====================")
print(f"Trying Password {password} --- Fail!!!!")
scanning = (index/total_passwords)*100
print("Scanning passwords complete:", round(scanning, 2))
continue
Now put all the code together and execute.
# Python program to crack PDF password using Brute Force
import pikepdf
passwords_filename = "passwords_list.txt"
locked_pdf_file = "my_locked.pdf"
#load passwords file
with open(passwords_filename) as file:
passwords_list = file.readlines()
total_passwords = len(passwords_list)
for index,password in enumerate(passwords_list):
#try if password is correct
try:
with pikepdf.open(locked_pdf_file, password = password.strip()) as pdf_file:
print("\n++++++++++++++++++++++SUCCESS+++++++++++++++")
print("Success---------- File is Unlocked and the password is: ", password)
break
#if password fail
except:
print("\n=====================")
print(f"Trying Password {password} --- Fail!!!!")
scanning = (index/total_passwords)*100
print("Scanning passwords complete:", round(scanning, 2))
continue
Output
Trying Password blondie
--- Fail!!!!
Scanning passwords complete: 0.15
=====================
Trying Password bigs
--- Fail!!!!
Scanning passwords complete: 0.15
=====================
Trying Password 272727
--- Fail!!!!
Scanning passwords complete: 0.15
++++++++++++++++++++++SUCCESS+++++++++++++++
Success---------- File is Unlocked and the password is: 1a2b3c
When you execute the program it might take 50 to 55 minutes to completely scan the passwords_list.txt file passwords with the locked pdf file.
As you can see from the output that our pdf file password was 1a2b3c. The script is only able to find it out because the password is very weak and available in the list of passwords_list.txt file.
If you have a pdf file that has a unique password, then this program will not be able to crack the password of your PDF file.
Conclusion
In this Python tutorial, you learned " How to Crack a PDF file in Python? " The approach we followed in this tutorial was Brute Force, in which we tried to open a password-protected PDF file using a file having 10 million passwords.
If you have a locked pdf file and you have no idea about its password, then you can use this python program to crack that pdf file. However, it is not guaranteed that this program will crack your PDF file. Nonetheless, if the PDF file has a weak password, then this program might crack it.
People are also reading:
Leave a Comment on this Post