This is the second post in a new series. If you want to track it through the entire editing process, you can follow along and contribute on GitHub. You can read the first post here

Building an Encryption System

In a straightforward application we normally break out the components – such as the encryption engine in an application server, the data in a database, and key management in an external service or appliance.

Or, for a legacy application, we might instead enable Transparent Database Encryption (TDE) for the database, with the encryption engine and data both on the same server, but key management elsewhere.

All data encryption systems are defined by where these pieces are located – which, even assuming everything works perfectly, define the protection level of the data. We will go into the different layers of data encryption in the next section, but at a high level they are:

  • In the application where you collect the data.
  • In the database that holds the data.
  • In the files where the data is stored.
  • On the storage volume (typically a hard drive, tape, or virtual storage) where the files reside.

All data flows through that stack (sometimes skipping applications and databases for unstructured data). Encrypt at the top and the data is protected all the way down, but this adds complexity to the system and isn’t always possible. Once we start digging into the specifics of different encryption options you will see that defining your requirements almost always naturally leads you to select a particular layer, which then determines where to place the components.

The Three Laws of Data Encryption

Years ago we developed the Three Laws of Data Encryption as a tool to help guide the encryption decisions listed above. When push comes to shove, there are only three reasons to encrypt data:

  1. If the data moves, physically or virtually.
  2. To enforce separation of duties beyond what is possible with access controls. Usually this only means protecting against administrators because access controls can stop everyone else.
  3. Because someone says you have to. We call this “mandated encryption”.

Here is an example of how to use the rules. Let’s say someone tells you to “encrypt all the credit card numbers” in a particular application. Let’s further say the reason is to prevent loss of data if a database administrator account is compromised, which eliminates our third reason.

The data isn’t necessarily moving, but we want separation of duties to protect the database even if someone steals administrator credentials. Encrypting at the storage volume layer wouldn’t help because a compromised administrative account still has access within the database. Encrypting the database files alone wouldn’t help either.

Encrypting within the database is an option, depending on where the keys are stored (they must be outside the database) and some other details we will get to later. Encrypting in the application definitely helps, since that’s completely outside the database. But in either cases you still need to know when and where an administrator could potentially access decrypted data.

That’s how it all ties together. Know why you are encrypting, then where you can potentially encrypt, then how to position the encryption components to achieve your security objectives.

Tokenization and Data Masking

Two alternatives to encryption are sometimes offered in commercial encryption tools: tokenization and data masking. We will spend more time on them later, but simply define them for now:

  • Tokenization replaces a sensitive piece of data with a random piece of data that can fit the same format (such as by looking like a credit card number without actually being a valid credit card number). The sensitive data and token are then stored together in a highly secure database for retrieval under limited conditions.
  • Data masking replaces sensitive data with random data, but the two aren’t stored together for later retrieval. Masking can be a one-way operation, such as generating a test database, or a repeatable operation such as dynamically masking a specific field for an application user based on permissions.

For more information on tokenization vs. encryption you can read our paper.

That covers the basics of encryption systems. Our next section will go into details of the encryption layers above before delving into key management, platform features, use cases, and the decision tree to pick the right option.

Share: