Pseudonymization of unstructured data for GDPR compliance

Data anonymization, also known as obfuscation, masking, redaction or pseudonymization, is a data protection strategy that can help organizations struggling to comply with the EU's General Data Protection Regulation (GDPR).

As GDPR goes into effect on May 25, 2018, companies with European customers will be liable for steep penalties if they fail to comply with its data privacy rules.

A recent survey found that a quarter of U.S. companies don't feel prepared to meet GDPR requirements. To comply with a robust data privacy regulation such as GDPR, businesses need to implement multiple technologies and internal procedures that apply to the collection, management, retention and deletion of the personal data they hold about customers and employees.

While a wide array of technologies can contribute to GDPR compliance by providing data discovery, mapping, provisioning, assessment, access control and governance, the key to compliance is pseudonymization, or masking.

What is data anonymization?

Anonymization desensitizes data such as PII (personally identifiable information) by replacing original values with fictitious yet contextually accurate values. This allows businesses to perform key tasks such as software testing, development and training without risking the exposure of personal data in an internal or external data breach.
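To make the idea of "fictitious yet contextually accurate" replacement concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the secret key, the pool of fake names and the function names are all hypothetical, and a keyed hash (HMAC) is just one possible way to keep replacements consistent, not the method of any particular product.

```python
import hashlib
import hmac

# Hypothetical secret key; it would need to be stored apart from the masked data.
SECRET_KEY = b"example-key-kept-apart-from-the-data"

# Hypothetical pool of fictitious replacement names.
FAKE_NAMES = ["Alex Morgan", "Jamie Lee", "Sam Carter", "Robin Diaz", "Casey Novak"]


def _digest(value: str) -> bytes:
    """Keyed hash of the original value, so the mapping is repeatable for a given key."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).digest()


def pseudonymize_name(name: str) -> str:
    """Replace a real name with a fictitious one; the same input always gives the same output."""
    index = int.from_bytes(_digest(name)[:4], "big") % len(FAKE_NAMES)
    return FAKE_NAMES[index]


def pseudonymize_digits(value: str) -> str:
    """Replace each ASCII digit with a derived digit, keeping length and separators intact."""
    digest = _digest(value)
    out = []
    used = 0
    for ch in value:
        if "0" <= ch <= "9":
            out.append(str(digest[used % len(digest)] % 10))
            used += 1
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(pseudonymize_name("Maria Silva"))
    print(pseudonymize_digits("+351 912-345-678"))
```

Because the output keeps the shape of the input (a plausible name, a phone number with the same separators), test and training systems that validate formats can keep working on the masked data.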

If your information isn't masked or pseudonymized, you could be non-compliant, exposing your business to serious consequences if audited by GDPR regulators.

Why anonymize unstructured data for GDPR?

Data masking, or anonymization, has been widely adopted over the last decade, mainly for test data privacy. As a result, some organizations have grown accustomed to masking data in non-production environments to protect sensitive information from data breaches.

Data masking has commonly been applied to structured data in business databases, and many technology vendors already cater to this need with solutions such as IBM Optim, Informatica TDM, CA Test Data Manager, Delphix and others. But there is a meaningful gap that most of these solutions fail to bridge: the masking of unstructured data, that is, obfuscating sensitive records in file formats classified as unstructured, such as PDFs, MS Office documents, images, BLOBs/CLOBs, X-rays, HL7 messages, emails, etc.
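As a rough illustration of what masking unstructured content involves, the Python sketch below redacts two kinds of personal data from free text, such as text already extracted from a PDF or an email. Everything in it is an assumption made for the example: the patterns are deliberately simplified, the placeholder labels are arbitrary, and real documents would also need text extraction (or OCR for images) before this step.

```python
import re

# Simplified email pattern; real-world addresses can be more varied.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Simplified IBAN-like pattern: two letters, two digits, then 11 to 30 alphanumerics.
IBAN_RE = re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b")


def redact_text(text: str) -> str:
    """Replace emails and IBAN-like account numbers with fixed placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = IBAN_RE.sub("[IBAN]", text)
    return text


if __name__ == "__main__":
    sample = "Contact maria.silva@example.com, IBAN PT50000201231234567890154."
    print(redact_text(sample))
    # prints: Contact [EMAIL], IBAN [IBAN].
```

Pattern matching of this kind only catches data with a recognizable shape; names and addresses embedded in prose generally require additional discovery techniques.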

GDPR covers the personal data stored in unstructured documents across the enterprise, and these documents can be the most vulnerable target for cybercriminals given their lack of built-in security and access management.

Common use cases:

  • Masking unstructured data in non-production environments such as test, development and training.
  • Dynamic masking of unstructured data: redacting personal or sensitive data in unstructured files on the fly in production environments, and desensitizing documents depending on the requester's access level, device, location, etc.
  • Extending the data masking investments of organizations that have implemented IBM Optim/StoredIQ, Informatica, CA, etc., by integrating UDM (unstructured data masking) to broaden the types of data they can discover and mask.
  • Verticals that commonly have UDM needs include banking (masking bank statements and checks in PDFs), healthcare (masking X-rays, MRIs, HL7 messages, etc.) and insurance (masking scanned healthcare claims in PDFs or images).
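The dynamic masking use case above can be sketched in a few lines of Python. This is a hypothetical example rather than a description of any vendor's implementation: the role names, the policy table and the patterns are all assumptions, and a real deployment would also take device and location into account.

```python
import re

# Simplified, illustrative patterns for two categories of personal data.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}

# Hypothetical policy: the categories each role is allowed to see unmasked.
VISIBLE_BY_ROLE = {
    "compliance_officer": {"email", "phone"},
    "support_agent": {"email"},
    "developer": set(),
}


def render_for(text: str, role: str) -> str:
    """Return the text with every category the role may not see redacted."""
    visible = VISIBLE_BY_ROLE.get(role, set())  # unknown roles see nothing unmasked
    for category, pattern in PATTERNS.items():
        if category not in visible:
            text = pattern.sub(f"[{category.upper()} REDACTED]", text)
    return text


if __name__ == "__main__":
    doc = "Reach Ana at ana@example.com or +351 912 345 678."
    for role in ("compliance_officer", "support_agent", "developer"):
        print(role, "->", render_for(doc, role))
```

The stored document is never modified; the redaction is applied at read time, so the same file can serve requesters with different access levels.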

Are you interested in learning how UDM can be leveraged to protect unstructured data across your entire organization, mitigating the risk of data theft and reputational damage, and avoiding steep sanctions under data privacy regulations such as GDPR, HIPAA and PCI DSS? Talk to one of our data privacy experts or request a live demo of Lisbontech UDM.
