Does Real Queryable Encryption mean there is a Fake Queryable Encryption?

Billy VanCannon, April 30, 2024

TLDR

There is no “fake” queryable encryption, but a certain NoSQL database provider has coined the term “Queryable Encryption”, and Baffle has something to say about that.

Background

To understand real queryable encryption, we must first understand the encryption offerings that already exist.

The first encryption offerings to examine are database-side operations. As the name implies, the data is encrypted in the database. The two most common are transparent data encryption (TDE), where the database software encrypts the data before it is saved to disk, and full disk encryption (FDE), where the underlying operating system encrypts the data before it is saved to disk. As far as anyone who can access the database is concerned, both are “transparent” because the encryption happens automatically and without the user’s knowledge. These approaches were introduced decades ago, when physical theft of hard drives was the primary concern: if somebody steals the encrypted drives and inserts them into their own computer, the data stays protected. While physical theft is still a concern today, it is responsible for only 8% of data leaks. Hackers gaining remote access are responsible for 80% of data leaks, with malicious insiders accounting for 6%.¹ Despite its convenience, database-side encryption clearly falls short on security, because anyone who gains access to the running database sees the decrypted data.

The next encryption offerings to investigate are client-side operations. Here, the application encrypts the data before writing it to the database, which ensures the data stays protected against direct attacks on the database. However, there are several drawbacks. The first is that the application must be written to either do the encryption itself or call an external service (API), and either case requires code changes. This may be very difficult for older applications and impossible for third-party applications. The second issue is that the application is responsible for key management as well as the encryption itself, which can be an entire additional layer of effort to implement and manage going forward. Yet another issue with client-side encryption is the lack of centralized control. Imagine managing many applications across an enterprise, each separately determining what data to encrypt, with which algorithms, and how keys map to data. The final issue with client-side encryption is that only equality matches (e.g. find or select a social security number) are possible when querying the database. Any operations that the database used to be able to do, such as sort, search, or math, are only possible if the application pulls all the relevant data and then performs these operations itself. This potentially means much more coding, is very inefficient from a computing perspective, and is a security risk while the application handles all this data.
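
To make the equality-only limitation concrete, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not any vendor's implementation: a keyed HMAC tag stands in for deterministic encryption (a real deployment would use an actual cipher so values can be decrypted), and the key, table, and function names are all hypothetical.

```python
import hashlib
import hmac
import sqlite3

# Hypothetical application-held key; in practice it would come from a key manager.
KEY = b"demo-key-not-for-production"


def equality_token(value: str) -> str:
    """Deterministic keyed tag: the same input always yields the same token."""
    return hmac.new(KEY, value.encode(), hashlib.sha256).hexdigest()


# The database only ever sees the token, never the plaintext SSN.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (name TEXT, ssn_token TEXT)")
for name, ssn in [("Ana", "111-22-3333"), ("Bo", "444-55-6666"), ("Cy", "777-88-9999")]:
    db.execute("INSERT INTO customers VALUES (?, ?)", (name, equality_token(ssn)))


def find_by_ssn(ssn: str) -> list[str]:
    """Equality match works: the application tokenizes the search value the same way."""
    rows = db.execute(
        "SELECT name FROM customers WHERE ssn_token = ?", (equality_token(ssn),)
    )
    return [row[0] for row in rows]


def names_ordered_by_token() -> list[str]:
    """The database can only sort by token, which says nothing about plaintext order."""
    rows = db.execute("SELECT name FROM customers ORDER BY ssn_token")
    return [row[0] for row in rows]
```

Calling `find_by_ssn("444-55-6666")` finds the matching row, but `names_ordered_by_token()` returns the names in an order unrelated to the underlying SSNs. To sort, search, or do math on the real values, the application would have to fetch and decrypt every relevant row itself.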

To restore database-side operations on encrypted data, an entire area of research has developed around this “privacy enhanced computation” problem. There are several approaches, including homomorphic encryption, secure multi-party computation, secure hardware enclaves, and now, Queryable Encryption. However, all of these approaches have considerable performance or scaling issues.

Queryable Encryption

MongoDB has long offered client-side encryption, but queries only worked on deterministically encrypted fields. For Queryable Encryption (QE), additional modifications (aka “Structured Encryption”) are made when writing records to the database. In a very simplified explanation, cryptographic metadata is added to the documents (MongoDB being a NoSQL document database) that enables additional search functionality on randomly encrypted data. However, queries are still equality match only in the first release of QE (general availability in MongoDB release 7.X).

Future QE releases (8.X+) promise database-side operations that include:

Range – return entries with dates after X and before Y, or transactions greater than $X
Prefix – return entries where the first X characters match
Suffix – return entries where the last X characters match
Substring – return entries that contain a given substring

This is a fantastic step forward for NoSQL database security, but that is the entire list of planned operations as of today. MongoDB also notes that QE potentially increases storage needs by up to 5X and that large writes can be 5-10X slower.
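
For reference, the four planned query types listed above have the following meaning on plaintext data. This is only a sketch of the semantics in plain Python; it says nothing about how QE would evaluate them over encrypted fields, and the sample records and function names are hypothetical.

```python
from datetime import date

# Hypothetical sample records.
RECORDS = [
    {"id": 1, "name": "alexander", "joined": date(2023, 1, 15), "amount": 40},
    {"id": 2, "name": "alexandra", "joined": date(2023, 6, 1), "amount": 250},
    {"id": 3, "name": "sandra", "joined": date(2024, 2, 10), "amount": 900},
]


def range_query(rows, field, low=None, high=None):
    """Ids of rows whose field is strictly above `low` and strictly below `high`."""
    return [
        r["id"]
        for r in rows
        if (low is None or r[field] > low) and (high is None or r[field] < high)
    ]


def prefix_query(rows, field, prefix):
    """Ids of rows whose field starts with `prefix`."""
    return [r["id"] for r in rows if r[field].startswith(prefix)]


def suffix_query(rows, field, suffix):
    """Ids of rows whose field ends with `suffix`."""
    return [r["id"] for r in rows if r[field].endswith(suffix)]


def substring_query(rows, field, fragment):
    """Ids of rows whose field contains `fragment` anywhere."""
    return [r["id"] for r in rows if fragment in r[field]]
```

For example, `range_query(RECORDS, "amount", low=100)` covers the "all transactions greater than $X" case, and `prefix_query(RECORDS, "name", "alex")` covers the prefix case. Anything outside these four shapes, such as sorting or arithmetic, is not on the list.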

Real Queryable Encryption

The most important thing to point out is that Baffle’s Real Queryable Encryption applies to SQL databases. SQL and NoSQL databases suit different applications, data, and use cases. But if a SQL database is the correct solution, the entire point of SQL is to provide the standardized set of queries and functions that make a structured database so powerful.

Baffle implements a reverse proxy that operates at the SQL session layer between the application and database. At the highest level, the proxy encrypts when data is written to the database and decrypts when data is read. The proxy implements two-tier key encryption and manages all the key-data mapping. If Baffle stopped there, we would provide client-side security without the drawbacks of application code changes and key management. But we didn’t stop there.
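
As a rough illustration of two-tier key encryption, the sketch below assumes it means the common envelope pattern: a data encryption key (DEK) protects the data, and a key encryption key (KEK) protects the DEK. That reading is an assumption, and none of this is Baffle's actual code. To stay within the Python standard library, a toy HMAC-based stream cipher stands in for a real authenticated cipher such as AES-GCM; it must not be used in production, and all names are hypothetical.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 over nonce plus a block counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Randomized toy encryption: a fresh nonce is prepended to the XORed body."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    """Inverse of toy_encrypt: split off the nonce, regenerate the keystream, XOR."""
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


# Tier 1: the KEK, which would normally stay inside an external key manager.
kek = secrets.token_bytes(32)

# Tier 2: a DEK (for example one per column), stored only in wrapped form.
dek = secrets.token_bytes(32)
wrapped_dek = toy_encrypt(kek, dek)

# Write path: the data is encrypted under the DEK.
ciphertext = toy_encrypt(dek, b"123-45-6789")

# Read path: unwrap the DEK with the KEK, then decrypt the data.
recovered_dek = toy_decrypt(kek, wrapped_dek)
plaintext = toy_decrypt(recovered_dek, ciphertext)
```

One benefit of this layout is that rotating the KEK only requires re-wrapping the small DEKs, not re-encrypting every row of data.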

Baffle additionally adds user-defined functions (UDFs) to the database that can communicate with the reverse proxy. When the proxy gets a SQL query, it breaks that query down. Operations on unencrypted data are simply passed on. Exact matches on sensitive data are encrypted/decrypted as required. However, database-side operations are pulled out, modified, and sent to the UDF. The UDF decrypts the necessary data, performs the operations, and sends the results back to the proxy.
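
The three-way decision described above can be sketched as a small routing function. This is a simplified illustration and not Baffle's implementation: it assumes the query has already been parsed into individual predicates, and the column policy, class, and route names are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical policy: which columns the proxy keeps encrypted.
ENCRYPTED_COLUMNS = {"ssn", "salary"}


@dataclass(frozen=True)
class Predicate:
    """One already-parsed condition from a SQL query, e.g. salary > 100000."""
    column: str
    operator: str  # "=", "<", ">", "LIKE", ...
    literal: str


def route(pred: Predicate) -> str:
    """Decide how the proxy handles a single predicate."""
    if pred.column not in ENCRYPTED_COLUMNS:
        # Unencrypted data: forward to the database unchanged.
        return "passthrough"
    if pred.operator == "=":
        # Exact match on sensitive data: encrypt the literal, compare ciphertexts.
        return "encrypt_literal"
    # Any other database-side operation on sensitive data: hand off to the UDF.
    return "udf"
```

With this policy, a predicate on `city` is passed through, `ssn = ...` is handled by encrypting the search value, and `salary > ...` is routed to the UDF.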

This approach enables a complete set of database-side operations including indexing, sorting, searching, and mathematical operations. Direct attacks on the database are neutralized. Even the DBA can’t access the data, and this includes any cloud infrastructure admins if the database is in the cloud.

Conclusion

Baffle is the easiest way to encrypt SQL databases. The Baffle architecture provides the security of client-side encryption with the convenience and power of database-side encryption. There are no code changes or key management to worry about, and operations remain highly performant, all while protecting against remote and physical attacks on the database.

¹ IBM, Cost of a Data Breach Report 2023