Encrypt and Decrypt PDF Files using Python

    Introduction

    Here's what Adobe has to say about PDFs:

    PDFs run your world. You know you use PDFs to make your most important work happen. That's why we invented the Portable Document Format (better known by the abbreviation PDF), to present and exchange documents reliably — independent of software, hardware or operating system.
    The PDF is now an open standard, maintained by the International Organisation for Standardisation (ISO). PDF documents can contain links and buttons, form fields, audio, video, and business logic.

    You use PDFs almost every day. If you are a student, you probably submit scanned copies of assignments as PDFs, and your resume may be one too. Many other important documents arrive in the same format, and we routinely share them with others. Every time a PDF is shared, though, its contents can be leaked or stolen. That is why it helps to encrypt the file with a password, so that only authorized people can open it.

    Some examples of password-protected PDFs that we encounter in daily life are:

    • Account statements from banks
    • Important governmental documents
    • Documents sent from companies

    In this blog, we'll learn how we can set a password to protect a PDF file. We'll be using the PyPDF2 module to encrypt and decrypt our PDF files.

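    PyPDF2 is an external library and needs to be installed first. The code in this post uses the PdfFileReader and PdfFileWriter classes, which were removed in PyPDF2 3.0, so install a release below 3.0 using the command:

    pip install "PyPDF2<3"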

    Once installed, we are ready to work with it. For demo purposes, you can download this PDF file.
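
    Because PdfFileReader and PdfFileWriter are only available in PyPDF2 releases below 3.0, it can help to confirm which version pip actually installed. The following is a minimal sketch using only the standard library; the helper names uses_legacy_api and installed_pypdf2_version are hypothetical.

```python
from importlib import metadata


def uses_legacy_api(version: str) -> bool:
    """Return True if this PyPDF2 version still ships PdfFileReader/PdfFileWriter.

    Assumption: those classes are usable in every release below 3.0.
    """
    major = int(version.split(".")[0])
    return major < 3


def installed_pypdf2_version():
    """Return the installed PyPDF2 version string, or None if it is missing."""
    try:
        return metadata.version("PyPDF2")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_pypdf2_version()
    if version is None:
        print("PyPDF2 is not installed")
    elif uses_legacy_api(version):
        print(f"PyPDF2 {version}: the classes used in this post are available")
    else:
        print(f"PyPDF2 {version}: install a release below 3.0 to follow along")
```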
     

    Encrypt PDF File

    First of all, let's create a function that checks whether a file is already encrypted.

    from PyPDF2 import PdfFileReader
    
    def is_encrypted(filename: str) -> bool:
        with open(filename, 'rb') as f:
            pdf_reader = PdfFileReader(f, strict=False)
            return pdf_reader.isEncrypted
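
    Under the hood, an encrypted PDF is marked by an /Encrypt entry in its trailer, and that entry is what the isEncrypted property looks for. Purely to illustrate the idea, here is a rough heuristic that scans the raw bytes for that key using only the standard library. The function names looks_encrypted and file_looks_encrypted are hypothetical, and the check can give false positives (for example, if the text /Encrypt appears in uncompressed page content), so treat it as an illustration rather than a replacement for is_encrypted.

```python
def looks_encrypted(pdf_bytes: bytes) -> bool:
    """Rough heuristic: does the raw PDF data mention the /Encrypt key?"""
    return b"/Encrypt" in pdf_bytes


def file_looks_encrypted(filename: str) -> bool:
    """Apply the heuristic to a file on disk."""
    with open(filename, "rb") as f:
        return looks_encrypted(f.read())


# Two hand-written trailer fragments, only for demonstration.
plain = b"%PDF-1.4\ntrailer\n<< /Size 4 /Root 1 0 R >>\n%%EOF"
protected = b"%PDF-1.4\ntrailer\n<< /Size 5 /Root 1 0 R /Encrypt 4 0 R >>\n%%EOF"

print(looks_encrypted(plain))      # False
print(looks_encrypted(protected))  # True
```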

    Now that we can check whether a file is already encrypted, we can write the function that encrypts the file if it is not.

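    from PyPDF2 import PdfFileReader, PdfFileWriter
    from PyPDF2.errors import PdfReadError  # PyPDF2.utils.PdfReadError in older releases

    def encrypt_file(filename: str, password: str) -> str:
        # Bail out early if the file is already password-protected.
        if is_encrypted(filename):
            return "PDF File is already encrypted."

        pdf_writer = PdfFileWriter()
        # Keep the source file open until the output has been written,
        # because page content is read lazily.
        with open(filename, 'rb') as source:
            pdf_reader = PdfFileReader(source, strict=False)
            try:
                for page_number in range(pdf_reader.numPages):
                    pdf_writer.addPage(pdf_reader.getPage(page_number))
            except PdfReadError:
                return "Error while reading PDF file"

            # use_128bit=True selects 128-bit encryption instead of 40-bit.
            pdf_writer.encrypt(user_pwd=password, use_128bit=True)
            # Write to a new file so the original stays untouched.
            with open("encrypted_demo.pdf", "wb") as f:
                pdf_writer.write(f)

        return "PDF file encrypted successfully"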

    The encrypt_file function first checks whether the file is already encrypted and, if so, returns a message right away. Otherwise, it copies every page of the PDF into the pdf_writer object; the try/except block returns an error message if a page cannot be read. It then encrypts the writer's contents with the given password and writes them to a new file, leaving the original untouched. Finally, it returns a success message.
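
    The function above always writes to the same hard-coded output name. To keep the original file safe while handling more than one PDF, the output name can be derived from the input name instead. Here is a minimal sketch using pathlib; the helper name output_path and the _encrypted suffix are illustrative choices, not part of PyPDF2.

```python
from pathlib import Path


def output_path(filename: str, suffix: str = "_encrypted") -> Path:
    """Build a sibling path such as demo_encrypted.pdf for demo.pdf."""
    source = Path(filename)
    # with_name keeps the directory and swaps only the final component.
    return source.with_name(f"{source.stem}{suffix}{source.suffix}")


print(output_path("demo.pdf"))  # demo_encrypted.pdf
print(output_path("reports/demo.pdf", "_decrypted"))
```

    The returned path can be passed straight to open(..., "wb") in place of the fixed file name.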

    You can download the encrypted file from here.
     

    Decrypt PDF File

    Now that we have an encrypted file ready, let's try decrypting the same file. We can do that using the same library. 

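    from PyPDF2 import PdfFileReader, PdfFileWriter
    from PyPDF2.errors import PdfReadError  # PyPDF2.utils.PdfReadError in older releases

    def decrypt_file(filename: str, password: str) -> str:
        # Nothing to do if the file is not password-protected.
        if not is_encrypted(filename):
            return "PDF File is not encrypted."

        pdf_writer = PdfFileWriter()
        # Keep the source file open until the output has been written.
        with open(filename, 'rb') as source:
            pdf_reader = PdfFileReader(source, strict=False)
            # decrypt() returns 0 when the password does not match.
            if pdf_reader.decrypt(password) == 0:
                return "Incorrect password."
            try:
                for page_number in range(pdf_reader.numPages):
                    pdf_writer.addPage(pdf_reader.getPage(page_number))
            except PdfReadError:
                return "Error while reading PDF file"

            # Write to a new file so the original stays untouched.
            with open("decrypted_demo.pdf", "wb") as f:
                pdf_writer.write(f)

        return "PDF file decrypted successfully"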

    The decrypt_file function first checks whether the file is encrypted at all and, if it is not, returns a message right away. Otherwise, it decrypts the file with the given password and copies every page into the pdf_writer object; the try/except block returns an error message if a page cannot be read. It then writes the pages to a new file, leaving the original untouched, and returns a success message.
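
    The decrypt() method reports whether the password matched through its return value: 0 if the password failed, 1 if it matched the user password, and 2 if it matched the owner password. Checking that value makes it possible to tell a wrong password apart from a read error. The helper below is a hypothetical sketch of how the code could be mapped to a message; it does not call PyPDF2 itself, and the names DECRYPT_MESSAGES and describe_decrypt_result are made up for illustration.

```python
DECRYPT_MESSAGES = {
    0: "Incorrect password.",
    1: "Decrypted with the user password.",
    2: "Decrypted with the owner password.",
}


def describe_decrypt_result(code: int) -> str:
    """Translate the value returned by decrypt() into a readable message."""
    # int() also accepts enum-style integer results.
    return DECRYPT_MESSAGES.get(int(code), f"Unexpected decrypt result: {code}")


print(describe_decrypt_result(0))  # Incorrect password.
print(describe_decrypt_result(1))  # Decrypted with the user password.
```

    Inside decrypt_file, the value returned by pdf_reader.decrypt(password) could be passed to this helper before any pages are copied.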

    You can download the decrypted file from here.
     

    Conclusion

    In this blog, we wrote a basic script to encrypt and decrypt PDF files. You can build a GUI application with Tkinter, or a web application with Flask or Django, and reuse the script there. Hope you liked this post, thanks!
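
    Before building a full GUI or web front end, a small command-line wrapper is often the quickest way to reuse the two functions. The sketch below uses argparse and getpass from the standard library; the action names, the build_parser and run helpers, and the way the handlers are passed in are all assumptions made for illustration.

```python
import argparse
import getpass
from typing import Callable, Dict, List, Optional

# Each handler takes (filename, password) and returns a status message,
# matching the shape of encrypt_file and decrypt_file from this post.
Handler = Callable[[str, str], str]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Encrypt or decrypt a PDF file")
    parser.add_argument("action", choices=["encrypt", "decrypt"])
    parser.add_argument("filename", help="path to the PDF file")
    parser.add_argument(
        "--password",
        help="password to use; if omitted, you are prompted without echo",
    )
    return parser


def run(argv: List[str], handlers: Dict[str, Handler]) -> str:
    """Parse argv and dispatch to the handler registered for the chosen action."""
    args = build_parser().parse_args(argv)
    password: Optional[str] = args.password
    if password is None:
        # getpass reads the password without echoing it to the terminal.
        password = getpass.getpass("Password: ")
    return handlers[args.action](args.filename, password)


# Example wiring (assumes encrypt_file and decrypt_file are defined as in this post):
# import sys
# print(run(sys.argv[1:], {"encrypt": encrypt_file, "decrypt": decrypt_file}))
```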
