Encrypting Settings Files

When I was developing for Votizen, I always felt uneasy that our settings files, with many important passwords and auth keys, were stored unencrypted on a third-party service. Sure, SSL was used for fetching and committing code, and only a few SSH public keys were allowed access to our repository, but it was one more opportunity for the site to be hacked. I recently read John Resig’s post Keeping Passwords in Source Control and decided that, with such an easy-to-follow tutorial, it was time to encrypt the Django settings.py files for my projects.

Today’s article describes my process for encrypting files and outlines some of the gotchas that I have faced. We will be using Git and Python/Django, but the lessons can easily be applied to other languages and frameworks, as we are mostly going to wrap openssl, which is included on Linux and OS X.

How to do it…

To encrypt a file using openssl, use the command:

openssl <cipher> -e -in <file to encrypt> -out <encrypted file name>

To decrypt a file using openssl, use the command:

openssl <cipher> -d -in <encrypted file name> -out <decrypted file name>

So for the Django settings file, using Triple-DES cipher:

openssl des3 -e -in settings.py -out settings.py.bin
openssl des3 -d -in settings.py.bin -out settings.py

Make sure .gitignore excludes settings.py, by adding the line (this replaces the settings_local.py line, if you’re using it):

settings*.py

For those using fabric to simplify scripting, the following functions will be useful:

from fabric.api import local  # Fabric 1.x API


def decrypt_file(file, cipher='des3'):
    """
    Decrypt <file>.bin into <file> and restrict its permissions.
    """
    d = {'cipher': cipher, 'file': file}
    local("openssl %(cipher)s -d -in %(file)s.bin -out %(file)s" % d)
    # Only the owner may read or write the decrypted secrets.
    local("chmod 600 %(file)s" % d)


def encrypt_file(file, cipher='des3'):
    """
    Encrypt <file> into <file>.bin.
    """
    d = {'cipher': cipher, 'file': file}
    local("openssl %(cipher)s -e -in %(file)s -out %(file)s.bin" % d)

And to run from the command line:

fab encrypt_file:settings.py
fab decrypt_file:settings.py
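
If you are not using fabric, the same wrapping can be sketched with the standard library. This is only an illustration: the function names build_openssl_command and run_openssl are made up for this example, and it assumes openssl is on your PATH and that you want the same <file>.bin naming as above.

```python
import subprocess


def build_openssl_command(path, encrypt=True, cipher="des3"):
    """Return the openssl argument list for encrypting or decrypting `path`.

    The encrypted copy is always `<path>.bin`, matching the naming used above.
    """
    mode = "-e" if encrypt else "-d"
    source = path if encrypt else path + ".bin"
    target = path + ".bin" if encrypt else path
    return ["openssl", cipher, mode, "-in", source, "-out", target]


def run_openssl(path, encrypt=True, cipher="des3"):
    """Run openssl; it prompts for the password on the terminal."""
    subprocess.check_call(build_openssl_command(path, encrypt, cipher))


if __name__ == "__main__":
    # Show the commands without running them.
    print(" ".join(build_openssl_command("settings.py")))
    print(" ".join(build_openssl_command("settings.py", encrypt=False)))
```

Passing the arguments as a list avoids shell quoting problems if a file name contains spaces.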

How it works…

We are using openssl to encrypt a file with the -e flag and to decrypt it with the -d flag. There are a lot of ciphers to choose from (run man openssl for more options), but I like Triple-DES, because it is widely used and well tested. When encrypting a file, openssl will prompt for a password. This same password will be used when decrypting the file.

This password is the first change to your development flow. You will need to remember it and share it among your developers. Obviously, make it strong but also memorable, because writing it down defeats the purpose. We will be checking in the encrypted settings files instead of the plain ones. As a result, whenever the repository is first checked out, or the settings file changes in the remote repository, developers need to rerun the decryption command to get the latest settings. If this becomes a problem, set up a Git hook or some other post-processing command that automatically decrypts settings when there are changes.
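
As a rough sketch of the Git hook idea, the script below could be saved as .git/hooks/post-merge and made executable. It only prints a reminder rather than decrypting, on the assumption that openssl may not be able to prompt for a password from inside a hook. The function names are hypothetical, and it assumes the encrypted files follow the settings*.py.bin naming used in this article.

```python
import os
import subprocess


def encrypted_settings_changed(changed_paths):
    """Return the encrypted settings files among the changed paths."""
    hits = []
    for path in changed_paths:
        name = os.path.basename(path)
        if name.startswith("settings") and name.endswith(".py.bin"):
            hits.append(path)
    return hits


def changed_paths_from_last_merge():
    """List files changed by the last merge, or [] if git cannot tell us."""
    cmd = ["git", "diff-tree", "-r", "--name-only", "--no-commit-id",
           "ORIG_HEAD", "HEAD"]
    try:
        output = subprocess.check_output(cmd, stderr=subprocess.DEVNULL)
    except (OSError, subprocess.CalledProcessError):
        return []
    return output.decode("utf-8").splitlines()


if __name__ == "__main__":
    for path in encrypted_settings_changed(changed_paths_from_last_merge()):
        print("%s changed; rerun the decryption command" % path)
```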

The next change to your development flow comes when changes to the settings file need to be checked in. In this case, the developer makes their changes and reruns the encryption command. This generates a new settings.py.bin, replacing the old one, and that is the file actually checked into the repository. This is the first big gotcha: since you are ignoring settings files in Git, it won’t tell you that the settings files have changed. The developer has to remember to encrypt them. I have not found a good way to automate this, as checking how recent the timestamp on the settings files is returns false positives if you make a change and then check in several times, and does not work when I changed them days ago but am committing today.
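
One option that avoids timestamps is to compare content instead. The sketch below is an assumption on my part, not something from the workflow above: it records a SHA-256 digest of the plain settings file in a hypothetical settings.py.sha256 marker whenever you encrypt or decrypt, and later reports whether the file has changed since. The marker is derived from the plain file, so it should be added to .gitignore as well.

```python
import hashlib
import os


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def record_digest(path):
    """Remember the current contents; call after encrypting or decrypting."""
    with open(path + ".sha256", "w") as handle:
        handle.write(file_digest(path))


def needs_encrypting(path):
    """True if `path` differs from the contents last recorded."""
    marker = path + ".sha256"
    if not os.path.exists(marker):
        # Nothing recorded yet, so assume the file has not been encrypted.
        return True
    with open(marker) as handle:
        return handle.read().strip() != file_digest(path)
```

A pre-commit hook or a fab task could call needs_encrypting("settings.py") and refuse to continue until the file has been re-encrypted.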

The fabric functions above always take the Python filename as the file argument, automatically adding .bin to the input or output file, depending on the operation. The .gitignore line above (settings*.py) indicates that all settings files are to be ignored, as long as they follow a naming convention that starts with the word settings and ends with .py (such as settings_prod.py and settings_dev.py, or just settings.py).
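
To check which names the settings*.py pattern covers, here is a small sketch using Python's fnmatch module. It only approximates Git's matching (Git has extra rules for slashes and negation), but for a simple pattern like this one the results should agree. The is_ignored helper is made up for this example.

```python
import fnmatch

PATTERN = "settings*.py"


def is_ignored(filename, pattern=PATTERN):
    """Approximate whether a bare file name matches the ignore pattern."""
    return fnmatch.fnmatchcase(filename, pattern)


for name in ["settings.py", "settings_prod.py", "settings_dev.py",
             "settings.py.bin", "urls.py"]:
    print(name, is_ignored(name))
```

The plain settings files match, while settings.py.bin does not, so the encrypted copies stay visible to Git.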

If you are like me and have several settings files, a core file (settings.py) and server-specific files (such as settings_prod.py) that get imported on the correct systems, then you may need to encrypt multiple files. This technique can be applied to as many settings files as your system needs. For this blog, I use the following fab commands to simplify encrypting and decrypting:

def decrypt_settings():
    decrypt_file("settings.py")
    decrypt_file("settings_dev.py")
    decrypt_file("settings_prod.py")

def encrypt_settings():
    encrypt_file("settings.py")
    encrypt_file("settings_dev.py")
    encrypt_file("settings_prod.py")
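
If you would rather not list each file by hand, the names can be discovered from the naming convention. This is a sketch under the assumption that every file matching settings*.py in a directory should be encrypted; the helper names are hypothetical.

```python
import glob
import os


def plain_settings_names(directory="."):
    """Names of plain settings files, e.g. for encrypt_file."""
    pattern = os.path.join(directory, "settings*.py")
    return sorted(os.path.basename(p) for p in glob.glob(pattern))


def encrypted_settings_names(directory="."):
    """Names to pass to decrypt_file, with the .bin suffix removed."""
    pattern = os.path.join(directory, "settings*.py.bin")
    suffix = len(".bin")
    return sorted(os.path.basename(p)[:-suffix] for p in glob.glob(pattern))
```

encrypt_settings could then loop over plain_settings_names(), and decrypt_settings over encrypted_settings_names().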

I always name my encrypted files .bin, instead of using the cipher name, as John Resig did in his example. If you intend to use a single cipher, then this adds another level of security (through obscurity), as a hacker needs to determine what cipher was used. However, the ciphers are secure, and even if a hacker knows the cipher, it will not help them much, so do what works for you.

There’s more…

As you can see, there are pros (security) and cons (workflow changes) to encrypting your settings files. I feel that security far outweighs the changes to my workflow (if a hacker gets your settings file, they pretty much get complete access to your site), but you should decide for yourself. The two biggest problems I have faced are forgetting to check in changes to my settings files and dealing with merge conflicts on the encrypted files. The former is hopefully caught by your remote deployment process before going to the servers (if not, then you’re not testing enough). The latter is a bigger problem.

Since encrypted files cannot be merged, if there is a merge conflict, you have to handle it with care. My usual strategy is to throw away my changes to the encrypted file and accept the remote changes. I then back up my decrypted settings file to settings.py.bak and decrypt the remote settings file (overwriting my settings file). Afterwards, I copy my changes into the new settings file, re-encrypt it, and push my changes. If my settings change was a one-liner, I will just extract that line, instead of backing up the whole file.
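
Finding which lines to copy back from the backup can be sketched with difflib from the standard library. This is only an illustration with made-up settings values: local_changes is a hypothetical helper that returns the lines present in the backup but missing from the freshly decrypted remote copy.

```python
import difflib


def local_changes(backup_lines, remote_lines):
    """Return lines that are in the backup but not in the remote copy."""
    diff = difflib.ndiff(remote_lines, backup_lines)
    # ndiff marks lines unique to its second argument with "+ ".
    return [line[2:] for line in diff if line.startswith("+ ")]


if __name__ == "__main__":
    backup = ["DEBUG = False", "API_KEY = 'mine'", "DB = 'x'"]
    remote = ["DEBUG = False", "DB = 'x'", "CACHE = 'y'"]
    for line in local_changes(backup, remote):
        print(line)
```

In practice you would read the two lists from settings.py.bak and the newly decrypted settings.py, then review each reported line before copying it over.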

Django Template Tags for JavaScript Deferment

I have been slowly working to improve my Django Shared project, which I use as the basis for all my Django projects. Recently, I added several new templatetags for deferring content and scripts: async_script, defer_html, defer_script, and render_deferred_html. Today’s article will cover how to use these templatetags in your own projects.

Getting ready

You will need to include django-shared in your own project, or extract the parts from templatetags/common.py. If you use pip, ...

A Python JSON Client for the LinkedIn API

The LinkedIn API is fairly robust and well documented, but it lacks a good JSON-based Python client for interacting with it. I recently open-sourced LinkedIn-API-JSON-Client to fill this gap. It currently implements all the user profile related API calls, and is used in production by Votizen.com. This is a simple tutorial on how you can use it for your application as well.

Getting ready

You will need Python 2.x running in a ...

Using Sphinx to Easily Manage Engineering Documents

As an engineering organization grows, eventually there is a need to document more than just code comments. When this happens, there are many solutions for handling documentation, and most of them are equally bad. Previously, at Votizen.com we used a wiki, but it was difficult to organize and search, and its links and documents rotted. Recently, we chose to ditch the wiki, converting our documents to reStructuredText (.rst), and instead use the tool Sphinx to compile our documents ...

Django Foreign Key Object Patcher

This article may be helpful for optimizing query performance when fetching foreign key objects in Django on large tables. Most of the time using the select_related queryset function is enough to group the population of foreign key objects into the original query. However, if you are using a relational database and the tables used in select_related become sufficiently large (1m+ rows), then select_related will begin to perform very poorly. Today, we will discuss a generic ...

Deployment/Monitoring Strategies

This article finishes the series on building a continuous deployment environment using Python and Django.

If you have been following along, hopefully you're already on your way to building a continuous deployment environment. The final touch is to set up a deployment and monitoring ...

Using Celery to Handle Asynchronous Processes

This article continues the series on building a continuous deployment environment using Python and Django.

Those of you following along now have the tools to set up a Python/Django project, fully test it, and deploy it. Today we will be discussing the Celery package, ...

Using Fabric for Painless Scripting

This article continues the series on building a continuous deployment environment using Python and Django.

If you have been following along, you now have the tools to set up a Python/Django project and fully test it. Today we will be discussing the Fabric package, ...

Coverage and Mock

This article continues the series on building a continuous deployment environment using Python and Django.

So far we have covered the basics of setting up a Django project and testing it. Today we will discuss how to ensure your tests fully cover the ...

Testing and Django

This article continues last week's series on building a continuous deployment environment using Python and Django.

Before we discuss testing in Django, let's define what makes up good testing infrastructure. There are many schools of thought on testing, but that's another article. Ignoring ...