SEC-009: Handling PII in Salesforce: Anonymisation, Masking, and Deletion

What you will learn in this tutorial

Design a compliant framework for identifying and categorising Personally Identifiable Information (PII) within Salesforce.
Compare the technical trade-offs between data anonymisation, data masking, and permanent deletion.
Master the use of Salesforce Shield Platform Encryption to protect PII at rest without sacrificing system usability.
Build automated Apex batch processes to securely anonymise or delete PII upon customer request.
Implement secure sandbox masking using Salesforce Data Mask to protect production data in non-production environments.
Establish robust audit trails to track PII access, modifications, and erasure operations for regulatory compliance.

1. Personally Identifiable Information and Global Regulatory Mandates

In the digital economy, personal data has become both an invaluable asset and a massive compliance liability. Privacy regulations such as the General Data Protection Regulation (GDPR) in the European Union, the California Consumer Privacy Act (CCPA/CPRA) in the US state of California, and the Personal Information Protection and Electronic Documents Act (PIPEDA) in Canada have transformed how organisations must handle customer information. At the centre of these frameworks is Personally Identifiable Information (PII): any data that can be used to identify, contact, or locate a specific individual. Salesforce, being the primary CRM platform for thousands of global enterprises, routinely stores vast repositories of PII, making it a primary focus for compliance officers and security auditors.

Under regulations like GDPR, individuals are granted fundamental rights regarding their personal data. These rights include the Right of Access (requesting a copy of all stored personal data), the Right to Rectification (updating incorrect records), and, most critically, the Right to Erasure, also known as the Right to Be Forgotten. If a customer exercises their right to erasure and no statutory exemption applies (for example, a legal obligation to retain certain records), the organisation must remove their PII from its active systems and ensure that backups do not reintroduce it. GDPR requires the organisation to act without undue delay and in any event within one month of the request, extendable by two further months for complex cases. Failure to comply can result in fines of up to €20 million or 4% of global annual turnover, whichever is greater.

To build a compliant architecture, tech leaders must first map out where PII resides within their Salesforce database schema. PII is rarely confined to a single field; it is highly distributed across standard objects (such as Lead, Contact, Account, User, and Individual) and custom objects. Common PII fields include names, home addresses, email addresses, phone numbers, passport details, social security numbers, credit card details, and even dynamic IP addresses. Architects must implement a formal data classification process, categorising every field in the system by sensitivity and compliance tags. Salesforce supports native Data Classification metadata fields (such as Data Owner, Field Usage, Data Sensitivity Level, and Compliance Group), which must be systematically applied to ensure visibility and facilitate automated compliance handling.
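As a minimal sketch of how the classification metadata can be read back once it has been applied, the anonymous Apex below queries `FieldDefinition` for the Contact object and prints every field that carries a sensitivity level or a compliance tag. The choice of Contact and the debug output format are illustrative assumptions; `SecurityClassification` and `ComplianceGroup` are the API field names behind Data Sensitivity Level and Compliance Group, and their availability should be checked against the org's API version.

```apex
// Anonymous Apex sketch: list the classified fields on Contact.
// FieldDefinition queries must be filtered by entity, so the object is named explicitly.
List<FieldDefinition> definitions = [
    SELECT QualifiedApiName, SecurityClassification, ComplianceGroup
    FROM FieldDefinition
    WHERE EntityDefinition.QualifiedApiName = 'Contact'
];

for (FieldDefinition fd : definitions) {
    // Unclassified fields return null for both attributes; skip them.
    Boolean isClassified = fd.SecurityClassification != null || fd.ComplianceGroup != null;
    if (isClassified) {
        System.debug(fd.QualifiedApiName
            + ' | sensitivity=' + fd.SecurityClassification
            + ' | compliance=' + fd.ComplianceGroup);
    }
}
```

The same query can be repeated per object to build an inventory that later drives which fields an anonymisation job overwrites.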

2. Securing PII at Rest with Salesforce Shield Platform Encryption

Once PII is identified and classified, the next architectural priority is securing that data at rest. While Salesforce encrypts data in transit using standard HTTPS/TLS protocols, protecting stored data against unauthorised access to the underlying database files, or against database extraction, requires encryption at rest. The primary tool for this in the Salesforce ecosystem is Shield Platform Encryption. Unlike Classic Encryption, which applies only to a dedicated custom Encrypted Text field type, masks the value in the user interface, and has significant functional limitations, Shield Platform Encryption allows organisations to encrypt a broad range of standard and custom fields while maintaining critical platform functionality.

When architecting Shield Platform Encryption, a fundamental decision is choosing between Probabilistic and Deterministic encryption. Each method has distinct cryptographic behaviours and significant architectural trade-offs:

Probabilistic Encryption: The most secure form of encryption. It uses a unique random initialization vector for every field value, meaning that the same plaintext input (e.g., "John") will generate completely different ciphertext values each time it is stored in the database. While highly secure, it strictly prevents any database filtering or index matching. If a field is probabilistically encrypted, users cannot use it in SOQL WHERE clauses, report filters, list view search criteria, or duplicate management rules. It is best suited for highly sensitive, non-searchable fields like credit card numbers or passport IDs.
Deterministic Encryption: Addresses the search limitations of probabilistic encryption by using a static initialization vector. With deterministic encryption, the same plaintext input (e.g., "John") always generates the identical ciphertext value in the database. This allows Salesforce to perform index matches, enabling users to filter records in report criteria, list views, and SOQL queries using exact match operators (e.g., Contact.Email = 'john@example.com'). However, wildcards and partial matches are still restricted, and whether a match is case-sensitive depends on which deterministic scheme (case-sensitive or case-insensitive) is selected for the field. Deterministic encryption is the recommended standard for searchable PII fields like email addresses, phone numbers, and names.
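To see which fields are affected by these trade-offs in a given org, the describe API can be inspected. The anonymous Apex below is a minimal sketch, assuming Contact is the object of interest: it lists every Contact field that `isEncrypted()` reports as encrypted, alongside `isFilterable()`. Probabilistically encrypted fields are expected to report as not filterable, but that should be verified in the target org rather than relied on.

```apex
// Anonymous Apex sketch: report encrypted Contact fields and whether they can be filtered.
Map<String, Schema.SObjectField> contactFields = Schema.SObjectType.Contact.fields.getMap();

for (String fieldName : contactFields.keySet()) {
    Schema.DescribeFieldResult dfr = contactFields.get(fieldName).getDescribe();
    if (dfr.isEncrypted()) {
        // A non-filterable encrypted field cannot appear in a SOQL WHERE clause.
        System.debug(dfr.getName() + ' | encrypted=true | filterable=' + dfr.isFilterable());
    }
}
```

Running this before and after changing an encryption policy gives a quick check that searchable PII fields such as Email still support the exact-match filters that reports and integrations depend on.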

Architects must carefully evaluate these encryption types and manage the underlying key lifecycle. Shield Platform Encryption operates on a tenant-specific key model, allowing organisations to generate, rotate, and revoke encryption keys on-demand, or even bring their own key (BYOK) generated via external hardware security modules (HSMs). However, encryption adds computational overhead and restricts certain advanced Salesforce platform features, such as criteria-based sharing rules, formula fields referencing encrypted fields, and standard list view sorting. Therefore, encryption should be applied selectively, targeting only true PII fields identified during the classification phase.

3. Erasure vs. Anonymisation vs. UI Masking: Architectural Paradigms

When addressing a customer's request for data removal, architects must choose the appropriate data sanitisation paradigm. A common mistake is assuming that compliance requires the complete deletion of the physical record. In reality, privacy regulations like GDPR permit multiple compliance pathways, each with different technical trade-offs. The three primary paradigms are permanent Deletion, Anonymisation, and UI Masking.

Permanent Deletion (Erasure): In Salesforce, executing a standard DML delete operation (e.g., deleting a Contact record) is a soft-delete. The record is not immediately purged from the database; instead, it is moved to the Recycle Bin, where it remains for 15 days, retrievable by administrators or API clients. To satisfy strict privacy mandates, a soft-delete is insufficient. The record must be permanently expunged. This requires a two-step process: deleting the record and then executing an emptyRecycleBin call to bypass the recovery window. Furthermore, physical deletion can break downstream analytical systems. If a customer contact record is completely deleted, historical financial reports, sales metrics, and activity pipelines lose their referential integrity, leading to distorted business intelligence. Thus, absolute deletion should be reserved for cases where preservation of history is completely unnecessary.
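The two-step erasure described above can be sketched in Apex as follows. The class name `ContactHardDeleteSketch` and its method are hypothetical; `Database.emptyRecycleBin` is the standard Apex call that purges records already in the Recycle Bin. This is an illustration under those assumptions, not a complete erasure service: it does not handle related records or governor limits for very large sets.

```apex
public with sharing class ContactHardDeleteSketch {
    // Soft-deletes the given Contacts, then purges them from the Recycle Bin.
    public static void hardDelete(Set<Id> contactIds) {
        List<Contact> doomed = [SELECT Id FROM Contact WHERE Id IN :contactIds];
        if (doomed.isEmpty()) {
            return;
        }

        // Step 1: standard DML delete. The records move to the Recycle Bin.
        delete doomed;

        // Step 2: purge the same records so they can no longer be undeleted.
        Database.EmptyRecycleBinResult[] results = Database.emptyRecycleBin(doomed);
        for (Database.EmptyRecycleBinResult r : results) {
            if (!r.isSuccess()) {
                for (Database.Error err : r.getErrors()) {
                    System.debug(LoggingLevel.ERROR,
                        'Purge failed for ' + r.getId() + ': ' + err.getMessage());
                }
            }
        }
    }
}
```

A caller would pass the Ids gathered from the erasure request, for example `ContactHardDeleteSketch.hardDelete(new Set<Id>{ someContactId });`, where `someContactId` is a placeholder.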

Anonymisation: The preferred architectural pattern for enterprise CRMs. Anonymisation involves overwriting all identifiable PII fields with generic, non-reversible, or randomized placeholder values (e.g., changing FirstName to "Anonymised" and LastName to "Individual_10398"). This can satisfy GDPR erasure requirements, provided the natural person can no longer be identified from the record, either directly or in combination with other data such as free-text fields, related records, or logs. Critically, anonymisation preserves the structural integrity of the database. The Contact record remains linked to past Opportunities, Cases, and Tasks, allowing financial reports and activity metrics to remain statistically accurate without exposing personal information. The following list contrasts the core differences between erasure, anonymisation, and masking:

Erasure (Hard Delete): Permanent database purge. Destroys relational links, breaks historical analytics, but leaves zero trace of the record.
Anonymisation: Overwrites PII with randomized values. Preserves historical reporting, maintains database referential integrity, and satisfies legal erasure standards.
UI Masking: Obfuscates fields visually on the user interface (e.g., displaying `***-**-6789`) while leaving the underlying database values fully intact in plaintext. Masks protect data from internal user exposure but do not satisfy erasure mandates for external subjects.

By mapping business reporting requirements to these paradigms, tech leaders can implement a compliant privacy strategy that protects both customer confidentiality and critical operational metrics.

4. Building Automated Erasure and Anonymisation Engines via Apex

Executing anonymisation or permanent erasure manually is highly inefficient and prone to operational error. In an enterprise environment receiving hundreds of privacy requests monthly, the compliance pipeline must be fully automated. Architects should design a programmatic engine using Apex batch classes to search for, sanitise, and hard-delete or anonymise customer data systematically.

When designing an automated Apex anonymisation engine, developers must account for cascading relationships. Overwriting PII on the Contact record is useless if the customer's name, email, and phone number remain active in related records, such as custom objects, task comments, or audit tables. The engine must navigate the relational graph, identifying child objects that also contain PII. The following Apex batch class demonstrates a bulk-safe pattern for anonymising customer Contact data whose privacy status (held in a custom field, Privacy_Status__c) is set to 'Erasure Requested'. It overwrites standard PII fields on the Contact itself and marks the record as anonymised; child objects that hold PII need equivalent handling:

global class ContactPiiAnonymisationBatch implements Database.Batchable<SObject>, Database.Stateful {
    // Database.Stateful preserves this counter across execute() chunks
    private Integer processedCount = 0;

    global Database.QueryLocator start(Database.BatchableContext bc) {
        // Locate Contacts flagged for erasure; already-anonymised records carry a different status
        return Database.getQueryLocator([
            SELECT Id, FirstName, LastName, Email, Phone, MobilePhone, MailingStreet, MailingCity, Privacy_Status__c
            FROM Contact
            WHERE Privacy_Status__c = 'Erasure Requested'
        ]);
    }

    global void execute(Database.BatchableContext bc, List<Contact> scope) {
        List<Contact> contactsToUpdate = new List<Contact>();

        for (Contact c : scope) {
            // Overwrite PII with generic, non-reversible placeholders.
            // Extend this list with any other PII fields in scope (e.g. MailingPostalCode, custom fields).
            c.FirstName = 'Anonymised';
            c.LastName = 'Individual_' + c.Id;
            c.Email = 'anonymised_' + c.Id + '@invalid-domain.com';
            c.Phone = '0000000000';
            c.MobilePhone = '0000000000';
            c.MailingStreet = 'Anonymised Street';
            c.MailingCity = 'Anonymised City';
            c.Privacy_Status__c = 'Anonymised'; // Transition status

            contactsToUpdate.add(c);
        }

        if (!contactsToUpdate.isEmpty()) {
            try {
                // allOrNone = false: one failing record does not roll back the rest of the chunk.
                // Note that triggers and automation on Contact still fire for this update.
                Database.SaveResult[] srList = Database.update(contactsToUpdate, false);
                for (Database.SaveResult sr : srList) {
                    if (sr.isSuccess()) {
                        processedCount++; // Count only records that were actually anonymised
                    } else {
                        for (Database.Error err : sr.getErrors()) {
                            System.debug(LoggingLevel.ERROR, 'Contact Anonymisation Error: ' + err.getMessage());
                        }
                    }
                }
            } catch (Exception e) {
                System.debug(LoggingLevel.ERROR, 'Batch Execution failed: ' + e.getMessage());
            }
        }
    }

    global void finish(Database.BatchableContext bc) {
        System.debug('PII Anonymisation process completed. Total records anonymised: ' + processedCount);
        // Execute supplementary tasks, such as triggering an audit log entry or alerting compliance teams
    }
}

Architects must ensure that this batch class runs within a secure governance context. When anonymising fields, ensure that any history tables tracking changes (like ContactHistory) do not store the original values in plaintext indefinitely. If Field History Tracking is active on encrypted or anonymised fields, work with Salesforce Support or implement Field Audit Trail to define short retention periods for historical tables, ensuring complete data sanitisation across all system storage tiers.
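For the batch to run unattended, it needs a scheduled entry point. The sketch below is one possible arrangement: the class name `ContactPiiAnonymisationScheduler`, the job name, the 02:00 daily schedule, and the chunk size of 200 are all illustrative assumptions, while `Schedulable`, `Database.executeBatch`, and `System.schedule` are standard Apex.

```apex
global class ContactPiiAnonymisationScheduler implements Schedulable {
    global void execute(SchedulableContext sc) {
        // Process pending erasure requests in chunks of 200 records.
        Database.executeBatch(new ContactPiiAnonymisationBatch(), 200);
    }
}

// To register the job, run the following once from Anonymous Apex.
// Cron fields: seconds minutes hours day-of-month month day-of-week.
// System.schedule('Nightly PII anonymisation', '0 0 2 * * ?', new ContactPiiAnonymisationScheduler());
```

A smaller chunk size may be appropriate when Contact triggers or flows are heavy, since each chunk runs in its own transaction with its own governor limits.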

5. Enterprise Sandbox Governance: Hardening Environments with Data Mask

A critical security vulnerability that compliance officers and security auditors routinely uncover is data leakage through non-production environments. Sandboxes (Full, Partial Copy, Developer Pro, and Developer) are essential for application development, testing, and training. However, when a Full sandbox is refreshed, it copies all production data, including active customer PII, to a less-secure non-production environment, and a Partial Copy sandbox copies a sample of that data. Developers, external contractors, and testing teams who lack authorisation to view production customer PII are suddenly granted access to live emails, phone numbers, and addresses in the sandbox.

To eliminate this compliance vulnerability, tech leaders must establish a strict sandbox governance policy, using Salesforce Data Mask to sanitise non-production environments automatically. Salesforce Data Mask is an administrative security tool that runs directly inside the newly refreshed sandbox and irreversibly obfuscates the copied production data. By applying Data Mask, organisations can replace raw production PII with realistic, mock datasets. The tool supports three primary masking methodologies, which should be configured according to the sensitivity of each field:

Pseudonymisation (Library Substitution): Replaces the production value with a randomly chosen but realistic mock value that is unrelated to the original. For example, a real customer email (e.g., `alice.smith@gmail.com`) is replaced with a syntactically correct fake email (e.g., `johndoe123@example.org`). This is the ideal approach for developer sandboxes, as it allows developers to test validation rules, integration triggers, and email formats with realistic data structures without exposing actual customers.
Anonymisation (Character Masking): Replaces the value with random or pattern-based characters that are no longer readable, optionally preserving some formatting. For example, a real phone number `+44 7700 900077` can be masked to `+44 **** ******`, hiding the identity while preserving the country prefix. This is best for training environments where formatting and data structures are important.
Deletion: Completely purges the field contents, replacing them with a null value. This should be applied to highly sensitive fields that have no utility in development, such as credit card credentials, passport numbers, and bank details.
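To make the pattern-masking behaviour concrete, the helper below reproduces the phone number example in plain Apex. It illustrates the transformation only and is not how Data Mask is implemented or configured; the class name `PatternMaskSketch` and the `keep` parameter are hypothetical.

```apex
public class PatternMaskSketch {
    // Keeps the first `keep` characters and all spaces; replaces everything else with '*'.
    public static String mask(String value, Integer keep) {
        if (String.isBlank(value)) {
            return value;
        }
        String result = '';
        for (Integer i = 0; i < value.length(); i++) {
            String ch = value.substring(i, i + 1);
            result += (i < keep || ch == ' ') ? ch : '*';
        }
        return result;
    }
}
```

Calling `PatternMaskSketch.mask('+44 7700 900077', 3)` returns `+44 **** ******`: the three-character country prefix and the spacing survive, while every digit of the subscriber number is replaced.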

By integrating Salesforce Data Mask into the standard sandbox refresh checklist, architects can guarantee that developers and offshore testing partners work in fully compliant environments. This mitigates the risk of external data breach and maintains an airtight boundary between production operations and software delivery pipelines.

Key Takeaways

Global privacy mandates like GDPR and CCPA grant individuals clear legal rights to request complete erasure of their PII from active databases.
Conduct a thorough data mapping and classification process using Salesforce's native Data Classification metadata tags to catalog PII.
Selective application of deterministic encryption is required for searchable PII fields, while probabilistic encryption secures non-searchable values.
Prefer Anonymisation over physical database Deletion to preserve database relational integrity and statistical reporting pipelines.
Build robust, bulk-safe Apex batch classes to automate complex data sanitisation and anonymisation workflows for compliance.
Mandate the use of Salesforce Data Mask on newly refreshed sandboxes to obfuscate production PII before granting developer or tester access.

Checkpoint: Test Your Understanding

Question 1: Which encryption type in Salesforce Shield Platform Encryption allows exact-match SOQL queries on encrypted fields?

A. Probabilistic Encryption

B. Deterministic Encryption

C. Symmetric Encryption

D. Asymmetric Encryption

Question 2: Why is data anonymisation generally preferred over permanent deletion (hard delete) for active CRM contact records?

A. Anonymisation does not require any Apex code to execute.

B. It satisfies privacy regulations while maintaining database referential integrity and the statistical accuracy of historical reporting.

C. It automatically re-indexes the search results in the system.

D. It completely disables the Recycle Bin for all system objects.

Question 3: What tool should be used to protect production customer PII from leaking into developer sandboxes during a refresh?

A. Org-Wide Trusted IP Ranges

B. Shield Setup Audit Trail

C. Salesforce Data Mask

D. Visualforce Encryption Mask
