Is Tokenization a Good Way to Protect Your Data?

By Harold Byun, VP Products | April 22, 2021

This blog looks at vault-based data tokenization methods and some key challenges of using such approaches in modern compute environments.

Data tokenization is a method that transforms a data value into a token value and is often used for data security and compliance. There are three main approaches to data tokenization:

  1. Vault-based Tokenization
  2. Vaultless Tokenization
  3. Format Preserving Encryption (FPE)

Vault-based tokenization operates by creating a dictionary that maps sensitive data values to token values, which replace the original values in a database or data store. The original sensitive data and the token values are then stored in a hardened vault. When an application or user needs access to the original value, a lookup is performed against the dictionary and the token is reversed.
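To make the mapping concrete, here is a minimal sketch of the idea in Python. It is not any vendor's implementation: the `TokenVault` class, the `tok_` prefix, and the in-memory dictionaries are all assumptions made for illustration, and a real vault would be a separate, hardened, persistent service.

```python
import secrets


class TokenVault:
    """Toy in-memory vault: a two-way dictionary between values and tokens.

    Hypothetical illustration only; a real vault is a hardened external store.
    """

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so the same value always maps to one token.
        if value in self._token_by_value:
            return self._token_by_value[value]
        token = "tok_" + secrets.token_hex(8)
        while token in self._value_by_token:  # regenerate on the rare collision
            token = "tok_" + secrets.token_hex(8)
        self._token_by_value[value] = token
        self._value_by_token[token] = value
        return token

    def detokenize(self, token: str) -> str:
        # Reversal is a dictionary lookup, not a decryption.
        return self._value_by_token[token]


vault = TokenVault()
ssn = "078-05-1120"  # sample value for the demo
token = vault.tokenize(ssn)

# The application data store holds only the token, never the original value.
record = {"name": "Alice", "ssn": token}
original = vault.detokenize(record["ssn"])
```

Note that the vault itself still holds every original value alongside its token, which is the point the next section picks up.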

The benefit of using tokenization, as with other data-centric security measures, is that the original data values are not stored in clear text in a data store. While this method provides a viable means of protecting data and addressing compliance, there are some key issues to consider related to the actual security provided and the operationalization of the solution.

Threat Mitigation and Risk Transfer

One key challenge with vault-based tokenization is that the solution simply moves the sensitive data from one data store to another. This may protect against a compromise of the primary data repository, but in reality it creates a mapping of your sensitive data to token values that lives in another location. This may help transfer some risk or address a specific data storage environment, but it ultimately replicates the sensitive data problem in another place. Further, if high availability needs to be implemented, the token vault will need to be replicated as well, which creates yet another copy of the sensitive data.

Application Changes Required

To use a token vault, you will also need to alter the applications and queries that access tokenized data. This requires application changes across the environment, which can be costly and time-consuming. Furthermore, in cloud-native, distributed data, and microservices environments, making these changes can slow down application deployments because of the number of touch points involved.

Performance and Scalability Challenges

Think for a moment about how much data your organization has.  Then, consider what it would mean if every time you needed to ask for a piece of information, you needed to walk down the hall and talk to your work buddy to get an answer.  While this may exaggerate the architecture of a token vault call, this is effectively what’s happening every time an application or user is asking for a piece of sensitive data.

The application needs to call out to the vault and wait for an answer. In other words, there are two extra hops every time this lookup needs to occur. In today’s world, very few organizations are collecting less information or data, so the number of lookups only grows. What this means is that vault-based tokenization is an inherently unscalable approach to data security in the modern world.
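A rough back-of-the-envelope calculation shows how those hops add up. The sketch below assumes each lookup is one sequential round trip to the vault (the request hop plus the response hop). The function name and the numbers are hypothetical, chosen only to show the arithmetic, and batching or parallel calls would lower the total.

```python
def added_latency_seconds(lookups: int, round_trip_ms: float) -> float:
    """Total extra wait when every lookup makes one sequential vault round trip."""
    return lookups * round_trip_ms / 1000.0


# Hypothetical workload: a report that detokenizes 1,000,000 values,
# with an assumed 2 ms round trip to the vault for each one.
extra = added_latency_seconds(1_000_000, 2.0)
print(extra)  # 2000.0 seconds, a little over 33 minutes of added wait
```

Even with a fast network, the added time grows linearly with the number of values that need to be reversed.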

Data Security Methodology

Another challenge with data tokenization is that many solutions use a proprietary encryption or data transform method. In general, proprietary, non-vetted methods of protecting data are frowned upon in the cryptography world. The principle that many cryptographers and security analysts operate under is that the attacker should be assumed to know the cryptographic method and, in spite of knowing it, still be unable to break the encryption or transform mechanism to obtain secret values. With proprietary techniques, there is no such guarantee and no peer review of the method.
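This idea is commonly known as Kerckhoffs’s principle. As a small illustration, the sketch below uses HMAC-SHA256 from the Python standard library, a published and widely reviewed construction, so the only secret is the key. The `keyed_transform` helper is a hypothetical name, and the transform is one-way, so it is not a drop-in replacement for reversible tokenization. It only shows what it means for the security to rest on the key rather than on a secret method.

```python
import hashlib
import hmac
import secrets


def keyed_transform(key: bytes, value: str) -> str:
    """Derive a one-way stand-in for a value using the public HMAC-SHA256 algorithm."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


key_a = secrets.token_bytes(32)  # the only secret in this scheme
key_b = secrets.token_bytes(32)  # a different, independent key

same_1 = keyed_transform(key_a, "4111-1111-1111-1111")
same_2 = keyed_transform(key_a, "4111-1111-1111-1111")
other = keyed_transform(key_b, "4111-1111-1111-1111")

# Same key and value give the same output; a different key gives a different one.
print(same_1 == same_2, same_1 == other)
```

Anyone can read how HMAC-SHA256 works, yet without the key the output cannot practically be reproduced or forged.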

Summary

There are several methods available to protect your data, and there is no panacea for data security.  However, there are some distinct advantages and disadvantages with various tokenization, de-identification and encryption techniques.  When evaluating solutions, you should definitely consider performance and scaling impact, openness and transparency of the security method, and ease of integration and operationalization.
