How to Crack PDF Files Passwords in Python?

Vinay Khatri
Last updated on November 26, 2024

    Let's say you have a PDF file that is locked with a password, and you want to recover that password with a Python program. So how do you crack a PDF file password in Python?

    Well, there are two approaches you can follow to crack the password of a PDF using Python code. In the first approach, you try to recover the encryption key directly by reversing the encryption technique that was used to protect your PDF.

    Doing so, however, is nearly impossible, because encryption is designed so that the key cannot be worked out from the encrypted file, and these days an application can use more than one encryption technique to protect a file.

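    Another approach you can use to crack PDF file passwords in Python is the "Brute Force Approach," in which you have a list of thousands or millions of candidate passwords, both weak and strong, and you try them one by one. Strictly speaking, trying passwords from a prepared list is known as a dictionary attack, but this tutorial uses the term Brute Force throughout.

    With the help of Python, you create a program that automates the process of passing each listed password to the locked PDF file until one of them opens it.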
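
    The idea can be sketched without any PDF library at all. In the minimal sketch below, find_password, check, and demo_check are hypothetical names used only for illustration; check stands in for whatever operation tells you that a candidate password opened the file.

```python
from typing import Callable, Iterable, Optional


def find_password(candidates: Iterable[str],
                  check: Callable[[str], bool]) -> Optional[str]:
    """Return the first candidate accepted by check, or None if none match."""
    for candidate in candidates:
        # Password lists usually hold one entry per line, so drop the newline.
        candidate = candidate.strip()
        if candidate and check(candidate):
            return candidate
    return None


# Stand-in check for demonstration only: pretend the secret is "1a2b3c".
def demo_check(guess: str) -> bool:
    return guess == "1a2b3c"


found = find_password(["123456\n", "password\n", "1a2b3c\n"], demo_check)
print(found)
```

    In the real program, the check is the attempt to open the locked PDF with the candidate password.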

    How to Crack PDF Files Passwords in Python?

    In this Python program, we will use the Brute Force approach to crack the password of a PDF file. The password for our example file is 1a2b3c. Before discussing the Python program, let's install the required dependencies.
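
    If you do not have a locked PDF at hand, you can create one to experiment with. The sketch below is one way this might be done with pikepdf (installed in the next section). The helper name make_locked_pdf and the owner password are illustrative choices, not part of the original program.

```python
def make_locked_pdf(path: str, user_password: str) -> None:
    """Write a one-page PDF that requires user_password to open."""
    # Imported inside the function so this snippet still loads
    # before pikepdf has been installed.
    import pikepdf

    pdf = pikepdf.new()
    pdf.add_blank_page()  # give the document a single empty page
    pdf.save(
        path,
        # The owner password is an arbitrary placeholder for this sketch.
        encryption=pikepdf.Encryption(user=user_password, owner="owner-secret"),
    )
    pdf.close()
```

    Calling make_locked_pdf("my_locked.pdf", "1a2b3c") should produce a test file protected with the password used in this tutorial.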

    Installing and Setting Up Dependencies

    1) The Python pikepdf Library

    The pikepdf library is an open-source Python library for reading and manipulating PDF files. In this tutorial, we will use it to open the locked PDF file that is protected with a password. To install pikepdf in your Python environment, run the following pip command in your terminal or command prompt:

    pip install pikepdf
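
    To confirm that the installation worked, you can ask Python whether the package is visible in the current environment. This is a small optional check using only the standard library; the variable name pikepdf_installed is just illustrative.

```python
import importlib.util

# find_spec returns None when the package cannot be found in this environment.
pikepdf_installed = importlib.util.find_spec("pikepdf") is not None

if pikepdf_installed:
    print("pikepdf is installed and ready to import.")
else:
    print("pikepdf is missing; run 'pip install pikepdf' first.")
```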

    2) The passwords_list.txt File

    As we will be using the Brute Force approach in this Python tutorial, we need a passwords_list.txt file that contains the 10 million passwords we will try against our PDF file. Download the passwords_list.txt file, or copy and paste the passwords into a text file with that name.

    We suggest saving the passwords_list.txt file in the same directory as your password-protected PDF file, so you can easily load the text file with a relative path.
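
    Keep in mind that a relative path is resolved against the directory you run the script from. As a minimal sketch, the hypothetical helper iter_passwords below streams passwords from a file one line at a time, so a very large list does not have to be held in memory all at once. The encoding and errors arguments are an assumption that the list may contain bytes that are not valid UTF-8; the small sample file exists only to demonstrate the helper.

```python
import tempfile
from pathlib import Path
from typing import Iterator


def iter_passwords(path: Path) -> Iterator[str]:
    """Yield non-empty passwords from a text file, one per line."""
    # errors="ignore" skips undecodable bytes instead of raising an error.
    with path.open(encoding="utf-8", errors="ignore") as handle:
        for line in handle:
            password = line.strip()
            if password:
                yield password


# Demonstration with a tiny throwaway list instead of the real 10 million.
with tempfile.TemporaryDirectory() as folder:
    sample = Path(folder) / "passwords_list.txt"
    sample.write_text("123456\n\npassword\n1a2b3c\n", encoding="utf-8")
    loaded = list(iter_passwords(sample))

print(loaded)
```

    With the real list in place, iter_passwords(Path("passwords_list.txt")) could be looped over directly.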

    Python Program to Crack PDF File Passwords

    First, let's import the pikepdf library to our Python script.

    import pikepdf
    Now, declare two variables, passwords_filename and locked_pdf_file, that hold the filenames of the passwords file and the locked PDF file.
    passwords_filename = "passwords_list.txt"
    locked_pdf_file = "my_locked.pdf"

    Next, we open the passwords_list.txt file under the name file. Inside the with block, we loop through every password and try to open the locked file with it. The try and except blocks tell us whether the password is correct: pikepdf.open() raises pikepdf.PasswordError when the password is wrong.
    # load the passwords file
    with open(passwords_filename) as file:

        passwords_list = file.readlines()
        total_passwords = len(passwords_list)

        for index, password in enumerate(passwords_list):

            # try to open the PDF with the current password
            try:
                with pikepdf.open(locked_pdf_file, password=password.strip()) as pdf_file:
                    print("\n++++++++++++++++++++++SUCCESS+++++++++++++++")
                    print("Success---------- File is Unlocked and the password is: ", password)
                    break
            # a wrong password raises PasswordError; other errors are not hidden
            except pikepdf.PasswordError:
                print("\n=====================")
                print(f"Trying Password {password} --- Fail!!!!")
                # percentage of the list tried so far
                scanning = (index / total_passwords) * 100
                print("Scanning passwords complete:", round(scanning, 2))

    Now put all the code together and execute.

    # Python program to crack PDF password using Brute Force

    import pikepdf

    passwords_filename = "passwords_list.txt"

    locked_pdf_file = "my_locked.pdf"

    # load the passwords file
    with open(passwords_filename) as file:

        passwords_list = file.readlines()
        total_passwords = len(passwords_list)

        for index, password in enumerate(passwords_list):

            # try to open the PDF with the current password
            try:
                with pikepdf.open(locked_pdf_file, password=password.strip()) as pdf_file:
                    print("\n++++++++++++++++++++++SUCCESS+++++++++++++++")
                    print("Success---------- File is Unlocked and the password is: ", password)
                    break
            # a wrong password raises PasswordError; other errors are not hidden
            except pikepdf.PasswordError:
                print("\n=====================")
                print(f"Trying Password {password} --- Fail!!!!")
                # percentage of the list tried so far
                scanning = (index / total_passwords) * 100
                print("Scanning passwords complete:", round(scanning, 2))

    Output

    Trying Password blondie
    --- Fail!!!!
    Scanning passwords complete: 0.15
    =====================
    Trying Password bigs
    --- Fail!!!!
    Scanning passwords complete: 0.15
    
    =====================
    Trying Password 272727
    --- Fail!!!!
    Scanning passwords complete: 0.15
    
    ++++++++++++++++++++++SUCCESS+++++++++++++++
    Success---------- File is Unlocked and the password is: 1a2b3c

    When you execute the program, it might take 50 to 55 minutes to try every password in the passwords_list.txt file against the locked PDF file.
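
    Those figures imply a rough throughput, which helps when estimating how long a different list would take. The calculation below is plain arithmetic on the numbers quoted above; the function name is illustrative, and the actual speed will vary with your hardware.

```python
TOTAL_PASSWORDS = 10_000_000  # size of the list used in this tutorial


def attempts_per_second(total: int, minutes: float) -> float:
    """Average number of passwords tried per second."""
    return total / (minutes * 60)


for minutes in (50, 55):
    rate = attempts_per_second(TOTAL_PASSWORDS, minutes)
    print(f"{minutes} minutes -> about {rate:,.0f} attempts per second")
```

    In other words, the run times quoted above correspond to roughly 3,000 to 3,300 password attempts per second.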

    As you can see from the output, our PDF file password was 1a2b3c. The script was only able to find it because the password is very weak and appears in the passwords_list.txt file.

    If your PDF file has a password that is not in the list, this program will not be able to crack it.
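
    Trying every possible password instead of a list is rarely a practical alternative, because the number of combinations grows very quickly. The sketch below counts all passwords made of lowercase letters and digits and estimates the worst-case time at an assumed rate of 3,300 attempts per second (an assumption based on 10 million passwords in roughly 50 minutes, not a measurement). The helper names are illustrative.

```python
import itertools
import string
from typing import Iterator

ALPHABET = string.ascii_lowercase + string.digits  # 36 symbols
ASSUMED_RATE = 3_300  # attempts per second; an assumption, not a measurement


def keyspace_size(length: int, alphabet: str = ALPHABET) -> int:
    """Number of distinct passwords of exactly this length."""
    return len(alphabet) ** length


def all_passwords(length: int, alphabet: str = ALPHABET) -> Iterator[str]:
    """Lazily generate every password of the given length."""
    for combo in itertools.product(alphabet, repeat=length):
        yield "".join(combo)


for length in (4, 6, 8):
    size = keyspace_size(length)
    days = size / ASSUMED_RATE / 86_400  # 86,400 seconds in a day
    print(f"length {length}: {size:,} candidates, about {days:,.1f} days")
```

    Even a six-character password such as 1a2b3c belongs to a space of more than two billion combinations, which is why the list-based approach only succeeds when the password happens to be in the list.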

    Conclusion

    In this Python tutorial, you learned how to crack a PDF file password in Python. The approach we followed was Brute Force, in which we tried to open a password-protected PDF file using a file containing 10 million passwords.

    If you have a locked PDF file and do not know its password, you can use this Python program to try to crack it. There is no guarantee that it will succeed, but if the PDF file has a weak password, this program might find it.
