Data Residency Compliance using Baffle and BYOK

By Billy VanCannon, Director of Product Management | November 1, 2023

Baffle provides strong encryption and data key management, while also allowing our customers and their tenants to “bring your own key” (BYOK) or “hold your own key” (HYOK) for maximum control over their sensitive data. Baffle’s powerful and flexible architecture drops into your current infrastructure without application code changes and is easily adapted to provide decrypted or masked data only at the intended destination. Baffle enables encryption for logical data isolation, which in turn enables highly scalable multi-tenant designs. These capabilities are essential for meeting today’s compliance and security demands around data residency and data sovereignty.

The European Union’s General Data Protection Regulation (GDPR) is a top priority for any company that wants to do business with EU residents. GDPR was introduced in 2016 and has been enforced since 2018. GDPR does not require that personal data be kept in the EU, but it does demand that the data be protected with the same safeguards and redress rights that GDPR affords.

GDPR has three mechanisms for allowing data outside of the EU. The first mechanism is literally called “Adequacy,” and it means that the EU has determined that the country in question provides data rights similar to those of the EU. Unfortunately, the US and EU have gone back and forth for over 20 years now (yes, even before GDPR was introduced) on whether the US provides enough data rights for EU residents. The most recent round includes President Biden’s executive order of October 2022 outlining sensitive data protections, which the EU approved. However, litigation arguing that the order was not adequate began in Europe almost immediately, and it is still playing out.

The second mechanism is a set of possibilities outlined in Article 46(1) that comes down to commercial contracts in which the data controllers and processors are legally bound to uphold the data rights of EU residents. The primary problem with both mechanisms in the US (as far as the EU is concerned) is that US intelligence and law enforcement agencies can often subpoena the organizations that hold personal data and compel them to provide it, overriding the commercial contracts. However, there is a third possibility.

GDPR allows “supplementary controls” on personal data that provide EU levels of protection. The European Data Protection Board has published recommendations for providing such protection, and the bottom line is strong encryption with proper key management. The recommendations include sample use cases that cover storing the data and protecting it from “public authorities.”

How can this be applied to the real world? Continue reading to understand Baffle’s approach to key management and then how Baffle’s flexible architecture can be adapted to almost any system.

Envelope Encryption

Figure 1. Envelope Encryption

As shown in the diagram, Baffle uses two tiers of keys to encrypt data, a technique known as envelope encryption. In this approach, the keys used to encrypt data are themselves encrypted, or “wrapped,” using a symmetric key encryption key (KEK) that never leaves the key store. The wrapping algorithm uses AES encryption with a 256-bit key length to prevent compromise of the data encryption key (DEK). Because the DEK is encrypted, it can optionally be stored in unsecured or less trusted locations for ease of storage, management, and recovery. Using key encryption keys (wrapping keys) in this way is a well-accepted approach to key management and is described in the National Institute of Standards and Technology (NIST) Special Publication 800-57 Part 1, “Recommendation for Key Management.” PCI-DSS covers envelope encryption and proper key storage in PCI-DSS v4.0 requirement 3.6.1. For ISO key management, see ISO 11568-1 Banking – Key management Part 1.

Envelope encryption is not only a secure approach to key management; it also reconciles two seemingly contradictory requirements. Customers use their own KEKs, so Baffle software never has access to the actual key values, while Baffle still maps DEKs to data and manages them so that the applications (i.e., our customers) don’t have to.
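To make the two tiers concrete, here is a minimal Python sketch of envelope encryption. It is an illustration under stated assumptions, not Baffle’s implementation: the `KekStore` class, the function names, and the record layout are all hypothetical, and a toy HMAC-SHA256 keystream cipher stands in for AES-256 purely so the example runs with the standard library. The toy cipher must not be used to protect real data.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (HMAC-SHA256 in counter mode); a stand-in for AES-256."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        block_input = nonce + counter.to_bytes(8, "big")
        stream.extend(hmac.new(key, block_input, hashlib.sha256).digest())
        counter += 1
    # zip() stops at len(data), so any surplus keystream bytes are ignored.
    return bytes(a ^ b for a, b in zip(data, stream))


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)  # fresh nonce per message, stored in front
    return nonce + _keystream_xor(key, nonce, plaintext)


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, body = blob[:16], blob[16:]
    return _keystream_xor(key, nonce, body)


class KekStore:
    """Hypothetical customer-held key store: the KEK never leaves this object."""

    def __init__(self) -> None:
        self._kek = secrets.token_bytes(32)  # 256-bit key encryption key

    def wrap(self, dek: bytes) -> bytes:
        return toy_encrypt(self._kek, dek)

    def unwrap(self, wrapped_dek: bytes) -> bytes:
        return toy_decrypt(self._kek, wrapped_dek)


def encrypt_record(kek_store: KekStore, plaintext: bytes) -> Tuple[bytes, bytes]:
    """Encrypt data with a new DEK, then wrap the DEK with the KEK."""
    dek = secrets.token_bytes(32)  # 256-bit data encryption key
    ciphertext = toy_encrypt(dek, plaintext)
    return kek_store.wrap(dek), ciphertext


def decrypt_record(kek_store: KekStore, wrapped_dek: bytes, ciphertext: bytes) -> bytes:
    """Ask the key store to unwrap the DEK, then decrypt the data with it."""
    dek = kek_store.unwrap(wrapped_dek)
    return toy_decrypt(dek, ciphertext)
```

Only the wrapped DEK and the ciphertext are persisted, so both can sit in less trusted storage. Without access to the key store’s `unwrap` call, neither is of any use.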

Global key management

“Global” means that the same DEK is used to encrypt all sensitive data of the same type. In a relational database, this translates to an entire column, say of social security numbers or national IDs, where every record is encrypted with the same DEK. Global key management is ideal for single-tenant applications.

Multitenant key management

Unique to Baffle is the ability to cryptographically isolate the data of different tenants within a system by encrypting each tenant’s data with a different key. This capability is particularly useful for enterprises that store and process data on behalf of other organizations.

In keeping with Baffle’s goal of making it easy to adopt encryption, Baffle Data Protection supports the most common ways tenant data is organized including:

  • Commingled records in one or more tables with a column to identify the owner of each data record
  • Independent logical databases for each data owner

Users can configure a key store, KEK, and DEK to use for each tenant, and Baffle Data Protection will automatically locate and use the appropriate keys when encrypting or decrypting each tenant’s data. Under no circumstances would a tenant be able to decrypt the data of another tenant.

In the case where tenant data are commingled records in a common set of tables, Baffle can use the value in the tenant identifier column to identify the correct key to use when encrypting or decrypting the records associated with the query. In cases where the tenant identifier is unavailable within the context of the query, Baffle can also use the database session variables, roles, or end-user identity that the application provides in the query to automatically choose the right record-level key (RLK) to use.

Figure 2a. Encryption Using Record-level Keys

In the case where tenant data is in logical databases, Baffle can use the logical database name to identify the correct logical-database key (LDK) to use to encrypt or decrypt the data for the query.

Figure 2b. Encryption Using Logical-database Keys
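The key-selection logic described for record-level and logical-database keys can be sketched as a simple lookup with fallbacks. This is a hedged illustration only: `resolve_key_id`, the `tenant_id` column name, the `tenant` session variable, and the key identifiers are hypothetical names chosen for the example, not Baffle configuration.

```python
from typing import Mapping, Optional


def resolve_key_id(
    key_map: Mapping[str, str],
    row: Optional[Mapping[str, str]] = None,
    session: Optional[Mapping[str, str]] = None,
    database: Optional[str] = None,
    tenant_column: str = "tenant_id",  # hypothetical column name
) -> str:
    """Pick the key for a query from whatever tenant context is available."""
    if row and tenant_column in row:
        # Record-level key: the tenant identifier is in the record itself.
        tenant = row[tenant_column]
    elif session and "tenant" in session:
        # Fallback: a session variable supplied by the application.
        tenant = session["tenant"]
    elif database:
        # Logical-database key: the database name identifies the tenant.
        tenant = database
    else:
        raise LookupError("no tenant context available for this query")

    if tenant not in key_map:
        # Fail closed: never fall back to another tenant's key.
        raise LookupError(f"no key configured for tenant {tenant!r}")
    return key_map[tenant]
```

Failing closed when no key is configured matters here: it keeps one tenant’s data from ever being encrypted or decrypted under another tenant’s key.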

Baffle Architecture

Baffle’s architecture is ideal for organizations that need to isolate data sets from different geographic regions to comply with data sovereignty requirements (see Figure 3). At the highest level, a Baffle implementation has two components: Baffle Manager and Baffle Shield.

Baffle Manager is an API and GUI-based controller for configuration and auditing of Baffle Shields.

Baffle Shields are reverse proxies that intercept communications at the SQL session layer, meaning that they can be implemented between your current applications and corresponding databases without significant impact to either. They encrypt on “WRITE” commands and decrypt or mask on “READ” commands.

Figure 3 shows envelope encryption in practice. Baffle Shield connects to the DEK store to pull encrypted DEKs and then sends them to the KEK store for decryption. The DEK can then be used to encrypt or decrypt the data. After a user-programmable amount of time, the DEK is deleted from memory. Because the user (the application owner or its tenants) sets the path to their own HSM or KMS, they can cut off access at any time and effectively shred the data.

Figure 3. Baffle High-Level Architecture
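The effect of a time-limited DEK in memory, combined with a key store the customer controls, can be sketched as a small cache. This is an assumption-laden illustration, not Baffle Shield’s code: `DekCache`, its parameters, and the `unwrap` callable (standing in for a call to the customer’s HSM or KMS) are hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class DekCache:
    """Keeps unwrapped DEKs in memory only until a configurable TTL expires."""

    def __init__(
        self,
        unwrap: Callable[[bytes], bytes],  # hypothetical call to the KEK store
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._unwrap = unwrap
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[bytes, Tuple[bytes, float]] = {}

    def get(self, wrapped_dek: bytes) -> bytes:
        now = self._clock()
        entry = self._entries.get(wrapped_dek)
        if entry is not None and entry[1] > now:
            return entry[0]  # still fresh: no round trip to the KEK store

        # Expired or never seen: drop any stale copy and ask the KEK store again.
        # Note: this only drops the reference; it does not zero the memory.
        self._entries.pop(wrapped_dek, None)
        dek = self._unwrap(wrapped_dek)  # raises if the key owner revoked access
        self._entries[wrapped_dek] = (dek, now + self._ttl)
        return dek
```

Once the key owner revokes access to the KEK, the next unwrap attempt after expiry fails, so the data can be read for at most one TTL window after revocation.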

Baffle Shield can also provide masking based on role-based access control (RBAC). Depending on the application or its users, Shield determines whether READs return the data in the clear, fully masked, or partially masked. In Figure 4, the hypothetical CCN is 1111-1111-1111-1111 and the SSN is 111-11-1111; the figure illustrates how different applications are given different access based on need-to-know or least privilege. The figure shows one proxy serving many applications, but multiple Shields may be deployed to give each application its own proxy and/or to front multiple database instances for business continuity.

Figure 4. Baffle Masking via RBAC
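As a rough sketch of role-based masking on READ, the following Python shows clear, partial, and full masking applied to the same stored value. The role names, the `ROLE_POLICY` table, and the choice to reveal only the last four digits are assumptions made for illustration; they are not taken from Figure 4 or from Baffle’s configuration format.

```python
from typing import Dict

# Hypothetical policy: which masking mode each application role receives.
ROLE_POLICY: Dict[str, str] = {
    "billing": "clear",
    "support": "partial",
    "analytics": "full",
}


def mask_value(value: str, mode: str, visible: int = 4, mask_char: str = "X") -> str:
    """Mask digits in value; separators such as '-' are left in place."""
    if mode == "clear":
        return value
    if mode not in ("partial", "full"):
        raise ValueError(f"unknown masking mode: {mode!r}")

    total_digits = sum(ch.isdigit() for ch in value)
    # In partial mode the last `visible` digits stay readable; in full mode none do.
    first_visible = total_digits - visible if mode == "partial" else total_digits

    seen = 0
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(ch if seen >= first_visible else mask_char)
            seen += 1
        else:
            out.append(ch)
    return "".join(out)


def read_for_role(role: str, value: str) -> str:
    """Return the value as a given role would see it; unknown roles get full masking."""
    return mask_value(value, ROLE_POLICY.get(role, "full"))
```

Defaulting unknown roles to full masking follows the least-privilege principle: an application sees clear data only when the policy explicitly grants it.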

Baffle Solution for Data Residency Compliance

With Baffle, the database, and therefore the encrypted data, can be located anywhere. This opens up scale and cost/benefit trade-offs that may not have been possible before. Multiple Shields can be deployed in the same geographies as the applications. The connections to the DEK and KEK stores would also be in the same location as the Shields, making the keys inaccessible anywhere else. Figure 5 is a hypothetical illustration with the encrypted database in North America connected to applications in different parts of the world. The sensitive data intended for the other locations is never decrypted in North America, yet the organization still gets the scale and simpler infrastructure management of a single location.

If the applications are multi-tenant, the database could be set up with tables using RLKs, or it could be an entire database instance set up with an LDK. Either way, every tenant has their own KEK in their own location and can revoke it at any time should the need arise. Shields may also provide RBAC-based masking for every individual application.

Figure 5. Baffle Shields Deployed with Keys and Applications in Different Geographies
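The residency property in this deployment comes down to a reachability rule: a Shield can decrypt a tenant’s data only if that tenant’s key store sits in the Shield’s own region. The sketch below models that rule in Python. It is a simplified model with hypothetical names (`Shield`, `KeyStoreEndpoint`, the region labels, and the `.invalid` URLs), not a description of how Baffle is configured.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class KeyStoreEndpoint:
    tenant: str
    region: str
    url: str  # hypothetical endpoint of the tenant's own HSM or KMS


@dataclass(frozen=True)
class Shield:
    name: str
    region: str


def reachable_key_stores(
    shield: Shield, endpoints: Sequence[KeyStoreEndpoint]
) -> List[KeyStoreEndpoint]:
    """Model assumption: a Shield only has network paths to key stores in its region."""
    return [e for e in endpoints if e.region == shield.region]


def can_decrypt(shield: Shield, tenant: str, endpoints: Sequence[KeyStoreEndpoint]) -> bool:
    """A Shield can decrypt a tenant's data only if it can reach that tenant's key store."""
    return any(e.tenant == tenant for e in reachable_key_stores(shield, endpoints))


# Example deployment: one encrypted database, Shields and key stores per region.
ENDPOINTS = [
    KeyStoreEndpoint("eu-tenant", "eu", "https://kms.eu.example.invalid"),
    KeyStoreEndpoint("apac-tenant", "apac", "https://kms.apac.example.invalid"),
]
EU_SHIELD = Shield("shield-eu", "eu")
NA_SHIELD = Shield("shield-na", "na")
```

In this model the North American Shield sits next to the database but has no path to any European key store, so the ciphertext it can read stays ciphertext.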

Conclusion

Baffle provides strong encryption and key management for sensitive data. The powerful and efficient architecture requires no code changes, which allows fast deployment, and its flexibility enables scale and easy management while meeting security and privacy compliance needs. Every Baffle customer or their tenants may BYOK for maximum control of their sensitive data.
