
How GitHub converts previously encrypted and unencrypted columns to ActiveRecord encrypted columns

Post Syndicated from Kylie Stradley original https://github.blog/2022-11-03-how-github-converts-previously-encrypted-and-unencrypted-columns-to-activerecord-encrypted-columns/

Background

In the first post in this series, we detailed how we designed our easy‐to‐use column encryption paved path. We found during the rollout that the bulk of time and effort was spent in robustly supporting the reading and upgrading of previous encryption formats/plaintext and key rotation. In this post, we’ll explain the design decisions we made in our migration plan and describe a simplified migration pattern you can use to encrypt (or re-encrypt) existing records in your Rails application.

We have two cases for encrypted column data migration: upgrading plaintext or previously encrypted data to our new standard, and key rotation.

Upon consulting the Rails documentation to see if there was any prior art we could use, we found the previous encryptor strategy, but exactly how to migrate existing data is, as they say, an “exercise left for the reader.”

Dear reader, lace up your sneakers because we are about to exercise. 👟

To convert plaintext columns or columns encrypted with our deprecated internal encryption library, we used ActiveRecord::Encryption’s previous encryptor strategy, our existing feature flag mechanism and our own type of database migration called a transition. Transitions are used by GitHub to modify existing data, as opposed to migrations that are mainly used to add or change columns. To simplify things and save time, in the example migration strategy, we’ll rely on the Ruby gem, MaintenanceTasks.

Previous encryptor strategy

ActiveRecord::Encryption provides a config option, config.active_record.encryption.support_unencrypted_data, that allows plaintext values in an encrypted attribute to be read without error. This is enabled globally and could be a good strategy if you are migrating only plaintext columns and plan to migrate them all at once. We chose not to use this option because we want to migrate columns to ActiveRecord::Encryption without exposing the ciphertext of other columns if decryption fails. By using a previous encryptor, we can isolate this “plaintext mode” to a single model.

In addition to this, GitHub’s previous encryptor uses a schema validator and regex to make sure that the “plaintext” being returned does not have the same shape as Rails encrypted columns data.
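A shape check along these lines can be sketched in plain Ruby. This is a minimal illustration, not GitHub’s validator. It assumes the default ActiveRecord::Encryption message format, a JSON object with a "p" payload and an "h" headers hash that holds "iv" and "at". The method names looks_like_rails_ciphertext? and read_as_plaintext are hypothetical.

```ruby
require "json"

# Hypothetical helper: returns true when a stored value has the shape of an
# ActiveRecord::Encryption payload (assumed here to be a JSON object with a
# "p" payload and an "h" headers hash holding "iv" and "at").
def looks_like_rails_ciphertext?(value)
  return false unless value.is_a?(String) && value.start_with?("{")

  parsed = JSON.parse(value)
  headers = parsed["h"]
  parsed["p"].is_a?(String) && headers.is_a?(Hash) &&
    headers["iv"].is_a?(String) && headers["at"].is_a?(String)
rescue JSON::ParserError
  false
end

# Hypothetical plaintext passthrough: refuse anything that is ciphertext-shaped
# so encrypted data is never handed back to the application as "plaintext".
def read_as_plaintext(value)
  if looks_like_rails_ciphertext?(value)
    raise ArgumentError, "value looks encrypted, refusing to return it as plaintext"
  end

  value
end
```

A guard like this could sit inside the decrypt method of a previous encryptor, so that a failed decryption of a genuinely encrypted value surfaces as an error instead of leaking ciphertext.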

Feature flag strategy

We wanted to have fine-grained control to safely roll out our new encryption strategy, as well as the ability to completely disable it in case something went wrong, so we created our own custom type using the ActiveModel::Type API, which would only perform encryption when the feature flag for our new column encryption strategy was disabled.

A common feature flag strategy would be to start a feature flag at 0% and gradually ramp it up to 100% while you observe and verify the effects on your application. Once a flag is verified at 100%, you would remove the feature flag logic and delete the flag. To gradually increase a flag on column encryption, we would need to have an encryption strategy that could handle plaintext and encrypted records both back and forth because there would be no way to know if a column was encrypted without attempting to read it first. This seemed like unnecessary additional and confusing work, so we knew we’d want to use flagging as an on/off switch.

While a feature flag should generally not be long running, we needed the feature flag logic to be long running because we want it to be available for GitHub developers who will want to upgrade existing columns to use ActiveRecord::Encryption.

This is why we chose to invert the usual feature flag default, giving us the flexibility to upgrade columns incrementally without introducing unnecessary long‐running feature flags. This means we set the flag at 100% to prevent records from being encrypted with the new standard and set it to 0% to cause them to be encrypted with our new standard. If for some reason we are unable to prioritize upgrading a column, other columns do not need to be flagged at 100% to continue to be encrypted on our new standard.

We added this logic to our monkeypatch of the ActiveRecord::Base.encrypts method to ensure our feature flag serializer is used:

Code sample 1

self.attribute(attribute) do |cast_type|
    GitHub::Encryption::FeatureFlagEncryptedType.new(cast_type: cast_type, attribute_name: attribute, model_name: self.name)
end

Which instantiates our new ActiveRecord Type that checks for the flag in its serialize method:

Code sample 2

# frozen_string_literal: true

module GitHub
  module Encryption
    class FeatureFlagEncryptedType < ::ActiveRecord::Type::Text
      attr_accessor :cast_type, :attribute_name, :model_name


      # delegate: a method to make a call to `this_object.foo.bar` into `this_object.bar` for convenience
      # deserialize: Take a value from the database, and make it suitable for Rails
      # changed_in_place?: determine if the value has changed and needs to be rewritten to the database
      delegate :deserialize, :changed_in_place?, to: :cast_type

      def initialize(cast_type:, attribute_name:, model_name:)
        raise RuntimeError, "Not an EncryptedAttributeType" unless cast_type.is_a?(ActiveRecord::Encryption::EncryptedAttributeType)

        @cast_type = cast_type
        @attribute_name = attribute_name
        @model_name = model_name
      end


      # Take a value from Rails and make it suitable for the database
      def serialize(value)
        if feature_flag_enabled?("encrypt_as_plaintext_#{model_name.downcase}_#{attribute_name.downcase}")
          # Fall back to plaintext (ignore the encryption serializer)
          cast_type.cast_type.serialize(value)
        else
          # Perform encryption via active record encryption serializer
          cast_type.serialize(value)
        end
      end
    end
  end
end

A caveat to this implementation is that we extended from ActiveRecord::Type::Text, which extends from ActiveModel::Type::String, which implements changed_in_place? by checking whether new_value is a string and, if it is, doing a string comparison to determine whether the value was changed.

We ran into this caveat during the rollout of our new encrypted columns. When migrating a column previously encrypted with our internal encryption library, we found that changed_in_place? would compare the decrypted plaintext value to the encrypted value stored in the database, always marking the record as changed in place because the two were never equal. When we migrated one of our fields related to 2FA recovery codes, this had the unexpected side effect of causing them all to appear changed in our audit log logic and created false alerts in customer-facing security logs. Fortunately, though, there was no impact to data, and our authentication team annotated the false alerts to indicate this to affected customers.

To address the cause, we delegated the changed_in_place? to the cast_type, which in this case will always be ActiveRecord::Encryption::EncryptedAttributeType that attempts to deserialize the previous value before comparing it to the new value.
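The difference between the two comparisons can be reproduced in plain Ruby without Rails. The sketch below is illustrative only: ToyEncryptedType and NaiveStringType are hypothetical classes, and Base64 stands in for encryption purely so that the stored value differs from the in-memory value.

```ruby
require "base64"

# Toy stand-in for an encrypted attribute type. Base64 is NOT encryption; it
# only makes the stored value differ from the in-memory value.
class ToyEncryptedType
  def serialize(value)
    Base64.strict_encode64(value)
  end

  def deserialize(raw_value)
    Base64.strict_decode64(raw_value)
  end

  # Decode the stored value first, then compare like with like.
  def changed_in_place?(raw_old_value, new_value)
    deserialize(raw_old_value) != new_value
  end
end

# Mirrors the string comparison described above: the raw database value is
# compared directly against the in-memory value.
class NaiveStringType
  def changed_in_place?(raw_old_value, new_value)
    raw_old_value != new_value if new_value.is_a?(String)
  end
end

stored = ToyEncryptedType.new.serialize("recovery-code-1234")
in_memory = "recovery-code-1234"

puts "naive comparison reports a change:     #{NaiveStringType.new.changed_in_place?(stored, in_memory)}"
puts "delegated comparison reports a change: #{ToyEncryptedType.new.changed_in_place?(stored, in_memory)}"
```

The naive comparison flags the untouched value as changed because it compares the stored form against the plaintext, while the comparison that deserializes first does not.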

Key rotation

ActiveRecord::Encryption accommodates a list of keys so that the most recent one is used to encrypt records, while all entries in the list are tried until a decryption succeeds or a decryption error (ActiveRecord::Encryption::Errors::Decryption) is raised. On its own, this ensures that once you add a new key, records updated afterward are automatically re-encrypted with the new key.

This functionality allows us to reuse our migration strategy (see code sample 5) to re-encrypt all records on a model with the new encryption key. We do this simply by adding a new key and running the migration to re-encrypt.
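The key list behavior can be illustrated with Ruby’s OpenSSL bindings alone. This is a sketch of the general technique, not the ActiveRecord::Encryption internals: the helper names are hypothetical, and it assumes the newest key is the last entry in the list.

```ruby
require "openssl"

# Encrypt with AES-256-GCM, returning everything needed to decrypt later.
def encrypt_with(key, plaintext)
  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.encrypt
  cipher.key = key
  iv = cipher.random_iv
  cipher.auth_data = ""
  data = cipher.update(plaintext) + cipher.final
  { iv: iv, tag: cipher.auth_tag, data: data }
end

# Decrypt with a single key; raises OpenSSL::Cipher::CipherError on mismatch.
def decrypt_with(key, payload)
  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.decrypt
  cipher.key = key
  cipher.iv = payload[:iv]
  cipher.auth_tag = payload[:tag]
  cipher.auth_data = ""
  cipher.update(payload[:data]) + cipher.final
end

# Try every key in the list until one authenticates the payload.
def decrypt_with_any(keys, payload)
  keys.each do |key|
    begin
      return decrypt_with(key, payload)
    rescue OpenSSL::Cipher::CipherError
      next
    end
  end
  raise ArgumentError, "no key could decrypt the payload"
end

# Re-encrypt under the newest key (assumed here to be the last in the list).
def rotate(keys, payload)
  encrypt_with(keys.last, decrypt_with_any(keys, payload))
end
```

Running rotate over every record is the plain Ruby analogue of adding a key and then running the backfill task: once it finishes, the old key is no longer needed to read any record.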

Example migration strategy

This section will describe a simplified version of our migration process you can replicate in your application. We use a previous encryptor to implement safe plaintext support and the maintenance_tasks gem to backfill the existing records.

Set up ActiveRecord::Encryption and create a previous encryptor

Because this is a simplified example of our own migration strategy, we recommend using a previous encryptor to restrict the “plaintext mode” of ActiveRecord::Encryption to the specific model(s) being migrated.

Set up ActiveRecord::Encryption by generating a random key set:

bin/rails db:encryption:init

And adding it to the encrypted Rails.application.credentials using:

bin/rails credentials:edit

If you do not have a master.key, this command will generate one for you. Remember never to commit your master key!

Create a previous encryptor. Remember, when you provide a previous strategy, ActiveRecord::Encryption will use the previous to decrypt and the current (in this case ActiveRecord’s default encryptor) to encrypt the records.

Code sample 3

app/lib/encryption/previous_encryptor.rb

# frozen_string_literal: true

module Encryption
  class PreviousEncryptor
    def encrypt(clear_text, key_provider: nil, cipher_options: {})
      raise NotImplementedError.new("This method should not be called")
    end

    def decrypt(previous_data, key_provider: nil, cipher_options: {})
      # JSON schema validation
      previous_data
    end
  end
end

Add the previous encryptor to the encrypted column

Code sample 4

app/models/secret.rb
class Secret < ApplicationRecord
  encrypts :code, previous: { encryptor: Encryption::PreviousEncryptor.new }
end

The PreviousEncryptor will allow plaintext records to be read as plaintext but will encrypt all new records up until and while the task is running.
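The read and write paths this produces can be modeled in a few lines of plain Ruby. The sketch below is a toy, not the Rails implementation: every class and method name is hypothetical, Base64 with a prefix stands in for real encryption, and the order shown (try the current scheme, then fall back to the previous one) is our assumption about the behavior.

```ruby
require "base64"

class ToyDecryptionError < StandardError; end

# Stand-in for the current encryptor. Base64 with a prefix is NOT encryption;
# it only gives "encrypted" values a recognizable shape for this sketch.
class ToyCurrentEncryptor
  PREFIX = "enc:"

  def encrypt(clear_text)
    PREFIX + Base64.strict_encode64(clear_text)
  end

  def decrypt(stored)
    raise ToyDecryptionError, "not in the current format" unless stored.start_with?(PREFIX)

    Base64.strict_decode64(stored.delete_prefix(PREFIX))
  end
end

# Stand-in for the previous encryptor: returns the stored value untouched.
class ToyPlaintextEncryptor
  def decrypt(stored)
    stored
  end
end

# Read path: try the current scheme first, fall back to the previous one.
def read_value(stored, current:, previous:)
  current.decrypt(stored)
rescue ToyDecryptionError
  previous.decrypt(stored)
end

# Write path: always use the current scheme.
def write_value(value, current:)
  current.encrypt(value)
end
```

Legacy rows stay readable through the fallback, and every write lands in the new format, which is why the backfill only has to touch rows that have not been written since the change.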

Install the Maintenance Tasks gem and create a task

Install the Maintenance Tasks gem per the instructions and you will be ready to create the maintenance task.

Create the task.

bin/rails generate maintenance_tasks:task encrypt_plaintext_secrets

In day‐to‐day use, you shouldn’t ever need to call secret.encrypt because ActiveRecord handles the encryption before inserting into the database, but we can use this API in our task:

Code sample 5

app/tasks/maintenance/encrypt_plaintext_secrets_task.rb

# frozen_string_literal: true

module Maintenance
  class EncryptPlaintextSecretsTask < MaintenanceTasks::Task
    def collection
      Secret.all
    end

    def process(element)
      element.encrypt
    end
      …
  end
end

Run the Maintenance Task

Maintenance Tasks provides several options to run the task, but we use the web UI in this example:

Screenshot of the Maintenance Tasks web UI.

Verify your encryption and cleanup

You can verify encryption in Rails console, if you like:

Screenshot of the Rails console

And now you should be able to safely remove your previous encryptor leaving the model of your newly encrypted column looking like this:

Code sample 6

app/models/secret.rb

class Secret < ApplicationRecord
  encrypts :code
end

And so can you!

Encrypting database columns is a valuable extra layer of security that can protect sensitive data during exploits, but it’s not always easy to migrate data in an existing application. We wrote this series in the hope that more organizations will be able to plot a clear path forward to using ActiveRecord::Encryption to start encrypting existing sensitive values.

Why and how GitHub encrypts sensitive database columns using ActiveRecord::Encryption

Post Syndicated from Kylie Stradley original https://github.blog/2022-10-26-why-and-how-github-encrypts-sensitive-database-columns-using-activerecordencryption/

You may know that GitHub encrypts your source code at rest, but you may not have known that we also encrypt sensitive database columns in our Ruby on Rails monolith. We do this to provide an additional layer of defense in depth to mitigate concerns, such as:

  • Reading or tampering with sensitive fields if a database is inappropriately accessed
  • Accidentally exposing sensitive data in logs

Motivation

Until recently, we used an internal library called Encrypted Attributes. GitHub developers would declare a column should be encrypted using an API that might look familiar if you have used ActiveRecord::Encryption:

class PersonalAccessToken
  encrypted_attribute :encrypted_token, :plaintext_token
end

Given that we had an existing implementation, you may be wondering why we chose to take on the work of converting our columns to ActiveRecord::Encryption. Our main motivation was to ensure that developers did not have to learn a GitHub-specific pattern to encrypt their sensitive data.

We believe strongly that using familiar, intuitive patterns results in better adoption of security tools and, by extension, better security for our users.

In addition to exposing some of the implementation details of the underlying encryption, this API did not provide an easy way for developers to encrypt existing columns. Our internal library required a separate encryption key to be generated and stored in our secure environment variable configuration—for each new database column. This created a bottleneck, as most developers don’t work with encryption every day and needed support from the security team to make changes.

When assessing ActiveRecord::Encryption, we were particularly interested in its ease of use for developers. We wanted a developer to be able to write one line of code, and no matter if their column was previously plaintext or used our previous solution, their column would magically start using ActiveRecord::Encryption. The final API looks something like this:

class PersonalAccessToken
  encrypts :token
end

This API is exactly the same as the one used by traditional ActiveRecord::Encryption, while hiding all the complexity of making it work at GitHub scale.

How we implemented this

As part of implementing ActiveRecord::Encryption into our monolith, we worked with our architecture and infrastructure teams to make sure the solution met GitHub’s scalability and security requirements. Below is a brief list of some of the customizations we made to fit the implementation to our infrastructure.

As always, there are specific nuances that must be considered when modifying existing encryption implementations, and it is always a good practice to review any new cryptography code with a security team.

Diagram 1: Key access and derivation flow for GitHub’s `ActiveRecord::Encryption` implementation

Secure primary key storage

By default, Rails uses its built-in credentials.yml.enc file to securely store the primary key and static salt used for deriving the column encryption key in ActiveRecord::Encryption.

GitHub’s key management strategy for ActiveRecord::Encryption differs from the Rails default in two key ways: deriving a separate key per column and storing the key in our centralized secret management system.

Deriving per-column keys from a single primary key

As explained above, one of the goals of this transition was to no longer bottleneck teams by managing keys manually. We did, however, want to maintain the security properties of separate keys. Thankfully, cryptography experts have created a primitive known as a Key Derivation Function (KDF) for this purpose. These functions take (roughly) three important parameters: the primary key, a unique salt, and a string termed “info” by the spec.

Our salt is simply the table name, an underscore, and the attribute name. So for “PersonalAccessTokens#token” the salt would be “personal_access_tokens_token”. This ensures the key is different per column.

Due to the specifics of the ActiveRecord::Encryption algorithm (AES-256-GCM), we need to be careful not to encrypt too many values using the same key (to avoid nonce reuse). We use the “info” string parameter to ensure the key for each column changes automatically at least once per year: we populate the info input with the current year as a nonce during key derivation.
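A per-column derivation of this kind can be sketched with Ruby’s OpenSSL bindings. The sketch assumes HKDF with SHA-256 as the KDF, which the post does not confirm. The helper name derive_column_key is hypothetical, and the random primary key is a stand-in for one fetched from secret storage.

```ruby
require "openssl"

# Hypothetical helper: derive a 32-byte column key from one primary key.
# HKDF-SHA256 is an assumption; the post does not name the KDF it uses.
def derive_column_key(primary_key, table:, attribute:, year:)
  OpenSSL::KDF.hkdf(
    primary_key,
    salt: "#{table}_#{attribute}", # e.g. "personal_access_tokens_token"
    info: year.to_s,               # changes the derived key once per year
    length: 32,                    # AES-256 needs a 32-byte key
    hash: "SHA256"
  )
end

# Stand-in for the primary key that would come from secret storage.
primary_key = OpenSSL::Random.random_bytes(32)

token_key = derive_column_key(primary_key, table: "personal_access_tokens", attribute: "token", year: 2022)
puts "derived #{token_key.bytesize}-byte key for personal_access_tokens_token"
```

Because the derivation is deterministic, nothing per-column has to be stored: the same primary key, table, attribute, and year always yield the same column key, while changing any one of them yields a different key.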

The applications that make up GitHub store secrets in HashiCorp Vault. To conform with this pre-existing pattern, we wanted to pull our primary key from Vault instead of the credentials.yml.enc file. To accommodate this, we wrote a custom key provider that behaves similarly to the default DerivedSecretKeyProvider, retrieving the key from Vault and deriving the key with our KDF (see Diagram 1).

Making new behavior the default

One of our team’s key principles is that solutions we develop should be intuitive and not require implementation knowledge on the part of the product developer. ActiveRecord::Encryption includes functionality to customize the Encryptor used to encrypt data for a given column. This functionality would allow developers to optionally use the strategies described above, but to make it the default for our monolith we needed to override the encrypts model helper to automatically select an appropriate GitHub-specific key provider for the user.

def self.encrypts(*attributes, key_provider: nil, previous: nil, **options)
  # snip: ensure only one attribute is passed
  # ...

  # pull out the sole attribute
  attribute = attributes.sole

  # snip: ensure if a key provider is passed, that it is a GitHubKeyProvider
  # ...

  # If no key provider is set, instantiate one
  kp = key_provider || GitHub::Encryption::GitHubKeyProvider.new(table: table_name.to_sym, attribute: attribute)

  # snip: logic to ensure previous encryption formats and plaintext are supported for smooth transition (see part 2)
  # github_previous = ...

  # call to rails encryption
  super(attribute, key_provider: kp, previous: github_previous, **options)
end

Currently, we only provide this API to developers working on our internal github.com codebase. As we work with the library, we are experimenting with upstreaming this strategy to ActiveRecord::Encryption by replacing the per-class encryption scheme with a per-column encryption scheme.

Turn off compression by default

Compressing values prior to encryption can reveal some information about the content of the value. For example, a value with more repeated bytes, such as “abcabcabc”, will compress better than a string of the same length with no repetition, such as “abcdefghi”. In addition to the common encryption property that ciphertext generally exposes the length of the plaintext, this exposes additional information about the entropy (randomness) of the underlying plaintext.
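The effect is easy to observe with Ruby’s standard zlib bindings. This is a small illustration rather than a measurement of ActiveRecord::Encryption itself. It uses longer strings than the nine-character examples so that the difference is unmistakable, and the helper name compressed_size is hypothetical.

```ruby
require "zlib"

# Size in bytes of the value after zlib compression.
def compressed_size(value)
  Zlib::Deflate.deflate(value).bytesize
end

# Two 90-byte strings: one highly repetitive, one with 90 distinct characters.
repetitive = "abc" * 30
varied = (0...90).map { |i| (33 + (i * 37) % 90).chr }.join

puts "repetitive: #{repetitive.bytesize} bytes -> #{compressed_size(repetitive)} bytes compressed"
puts "varied:     #{varied.bytesize} bytes -> #{compressed_size(varied)} bytes compressed"
```

Both inputs have the same length, yet their compressed sizes differ, so an observer who sees only ciphertext lengths could still tell the repetitive value from the varied one.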

ActiveRecord::Encryption compresses data by default for storage efficiency, but since the values we are encrypting are relatively small, we did not feel this tradeoff was worth it for our use case. This is why we replaced the default of compressing values before encryption with a flag that makes compression optional.

Migrating to a new encryption standard: the hard parts

This post illustrates some of the design decisions and tradeoffs we encountered when choosing ActiveRecord::Encryption, but it’s not quite enough information to guide developers of existing applications to start encrypting columns. In the next post in this series we’ll show you how we handled the hard parts—how to upgrade existing columns in your application from plaintext or possibly another encryption standard.