SEC-019: Data Classification in Salesforce: Tagging, Handling, and Governance

What you will learn in this tutorial

Understand the compliance and governance imperatives for data classification.
Configure native Salesforce classification metadata fields at the schema layer.
Establish a tactical, automated roadmap for categorising custom and standard fields.
Develop active Data Loss Prevention (DLP) and Shield encryption policies mapped to classification tags.
Implement a long-term metadata governance model and continuous compliance audits.
Query and analyze field-level classification metadata using Tooling API and Apex.

1. The Imperative of Data Classification in Enterprise Environments

In the contemporary, heavily regulated global economy, data is both an organisation's most valuable asset and its greatest liability. The proliferation of privacy regulations, including GDPR, HIPAA, and CCPA, has transformed data governance from a passive administrative exercise into an active operational requirement. For enterprise Salesforce architectures, the foundation of a robust security posture is the implementation of a comprehensive Data Classification strategy. Without knowing exactly what data resides within the system, who owns it, and how sensitive it is, security officers cannot formulate intelligent, cost-effective data protection controls.

Historically, organisations attempted to address data security through broad, brute-force policies: encrypting all custom fields, applying restrictive field-level security across the board, or enforcing highly constrained sharing models. However, this non-selective approach introduces massive operational friction. Over-encryption degrades system performance, breaks index-based search functionality, and restricts standard reporting. Overly restrictive access controls disrupt employee productivity, leading to workarounds and shadow IT. Data classification enables a surgical, risk-based approach to security. By categorising data based on its regulatory impact and business sensitivity, architects can deploy expensive security tools—such as Shield Platform Encryption, Transaction Security Policies, and real-time monitoring—only where they are truly required, maintaining platform agility while satisfying compliance audits.

Furthermore, data classification is a prerequisite for effective data lifecycle management. Enterprise environments frequently suffer from "metadata bloat," accumulation of deprecated custom fields, and redundant customer databases. A structured classification programme provides visibility into the usage and purpose of every metadata component, enabling systematic data archiving and disposal policies. This reduction in data footprint not only minimizes storage costs but also significantly decreases the organisation's threat surface in the event of an account compromise or integration breach.

2. Salesforce Native Data Classification Metadata: Fields and Architecture

To support enterprise data governance, Salesforce provides native Data Classification metadata fields directly integrated into the schema architecture of standard and custom objects. This feature allows administrators and architects to record critical compliance and ownership data directly on the field definition, embedding governance metadata directly within the platform's metadata catalog.

There are four core classification attributes configured at the individual field level:

Data Owner: Specifies the user or public group responsible for the lifecycle and business rules of the data in this field. Identifying the data owner is critical for change management; for example, if the Sales Ops team wants to deprecate a custom field, they must consult the designated data owner before executing the deletion.
Field Usage: Captures the operational status of the field. The default values are Active, DeprecateCandidate (a field flagged for deprecation), and Hidden. This status helps development teams identify obsolete metadata during refactoring or sandbox synchronization.
Data Sensitivity Level: Defines the risk category of the field. Salesforce offers standard sensitivity levels including Public, Internal, Confidential, and Restricted. Public data represents non-sensitive information that can be exposed externally, while Restricted data represents highly sensitive personal or proprietary information requiring maximum protection.
Compliance Categorisation (ComplianceGroup): Links the field to specific regulatory frameworks. Values include PII (Personally Identifiable Information), HIPAA, GDPR, PCI, and COPPA. A single field can be assigned to multiple compliance groups, enabling compliance officers to run targeted reports to identify all PII or HIPAA-governed fields across the entire org.

Unlike traditional on-premises systems where data classification is stored in separate external spreadsheets or databases, Salesforce's native metadata architecture embeds these classification parameters directly into the XML definition of each field. This structural embedding means that classification tags are fully integrated with the platform's Metadata API, sandbox replication engines, and version control systems. When a custom field is created or updated in a developer sandbox, its sensitivity level and compliance categorization are carried forward through the continuous integration and deployment (CI/CD) pipeline into production. This unified architecture reduces metadata drift, because data protection attributes are verified and tested in lower environments before being promoted to the production org. Furthermore, compliance officers can query these schema properties directly using Tooling API queries or custom Apex metadata inspections, turning the platform's schema catalog into a dynamic, audit-ready compliance ledger.
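As an illustration of how the classification attributes sit inside a field's XML definition, the following Python sketch parses a source-format field file and extracts them. The element names (businessStatus, complianceGroup, securityClassification) are the Metadata API names for these attributes; the sample field, its values, and the semicolon-separated form of complianceGroup are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Namespace used by Salesforce metadata XML files.
NS = {"sf": "http://soap.sforce.com/2006/04/metadata"}

# Illustrative field definition; the field name and values are assumptions.
SAMPLE_FIELD_XML = """<?xml version="1.0" encoding="UTF-8"?>
<CustomField xmlns="http://soap.sforce.com/2006/04/metadata">
    <fullName>Social_Security_Number__c</fullName>
    <label>Social Security Number</label>
    <type>Text</type>
    <length>11</length>
    <businessStatus>Active</businessStatus>
    <complianceGroup>PII;GDPR</complianceGroup>
    <securityClassification>Restricted</securityClassification>
</CustomField>
"""


def read_classification(field_xml: str) -> dict:
    """Extract the classification attributes from one field definition."""
    root = ET.fromstring(field_xml.encode("utf-8"))

    def text_of(tag):
        element = root.find(f"sf:{tag}", NS)
        return element.text if element is not None else None

    groups = text_of("complianceGroup")
    return {
        "field": text_of("fullName"),
        "usage": text_of("businessStatus"),
        "sensitivity": text_of("securityClassification"),
        # Assumed here to be a semicolon-separated list when several groups apply.
        "compliance": groups.split(";") if groups else [],
    }
```

Calling read_classification(SAMPLE_FIELD_XML) returns a dictionary with the field name, usage, sensitivity level, and a list of compliance groups; attributes that are absent from the XML come back as None or an empty list.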

Architects must recognize that these native classification fields are stored as metadata. They do not, by themselves, encrypt data or restrict access to the field. Rather, they act as a foundational metadata catalog that feeds downstream automated security tools. This configuration metadata is accessible via the Metadata API, the Tooling API, and SOQL queries against the FieldDefinition object in Apex, allowing for automated compliance reporting and dynamic security enforcement.


// Apex sketch: read a field's data classification attributes by querying FieldDefinition.
// The original describe-based version only printed the field name and type.
// Social_Security_Number__c is an illustrative custom field on Contact.
public static void queryFieldClassification() {
    for (FieldDefinition fd : [
        SELECT QualifiedApiName, BusinessStatus, SecurityClassification, ComplianceGroup
        FROM FieldDefinition
        WHERE EntityDefinition.QualifiedApiName = 'Contact'
        AND QualifiedApiName = 'Social_Security_Number__c'
    ]) {
        System.debug('Field: ' + fd.QualifiedApiName);
        System.debug('Field usage: ' + fd.BusinessStatus);
        System.debug('Sensitivity level: ' + fd.SecurityClassification);
        System.debug('Compliance groups: ' + fd.ComplianceGroup);
    }
}

3. Standardising Field-Level Metadata Classification: A Tactical Roadmap

Implementing data classification across an established Salesforce Enterprise org with thousands of custom fields is a monumental task that cannot be accomplished overnight. It requires a structured, phased tactical roadmap that balances business engagement, metadata auditing, and automation.

The roadmap begins with the **Taxonomy Definition** phase. Before touching a single field configuration in Salesforce, the compliance team, business stakeholders, and architects must establish a unified data sensitivity taxonomy with clear, unambiguous criteria for each sensitivity level. For instance, "Confidential" might be defined as any customer data that is not publicly available but carries low risk if exposed internally, while "Restricted" is defined as any data that, if exposed, would trigger legal notification requirements, such as credit card numbers or biometric records. The resulting sensitivity matrix must be approved by the corporate Data Protection Officer (DPO), legal counsel, and business unit leaders, ensuring that the defined levels (Public, Internal, Confidential, Restricted) are consistently interpreted and legally defensible before metadata parameters are modified in production.

Once the taxonomy is finalised, the **Audit and Discovery** phase begins. Architects should use the Tooling API or third-party metadata extraction tools to generate a complete inventory of all fields across all major objects. Rather than manually clicking through the Setup UI for thousands of fields, administrators can run scripts to identify fields that are highly likely to contain sensitive data based on their names (e.g., fields containing "SSN", "Birth", "Salary", or "Phone"). Business data owners are then assigned to these fields to validate their sensitivity levels.

Crucially, the audit phase must address the reality of technical debt. Over years of rapid deployment, many enterprise orgs accumulate thousands of "zombie" fields: fields that were created for a single campaign or legacy business process and are no longer actively used. Rather than attempting to classify these obsolete components, architects should use Field Trip or native optimizer reports to identify fields with low or zero data population rates. These unused fields should be flagged for deprecation under Field Usage (the default value is DeprecateCandidate) and scheduled for safe deletion.

The final phase is **Bulk Application**, where the classification metadata is deployed into production. This is best accomplished using Salesforce DX and the Metadata API. By pulling the custom object metadata files locally, developers can write scripts to inject the classification tags in bulk, and then deploy the updated metadata package back to the org, bypassing months of manual UI clicks.
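The bulk injection step can be sketched in Python as follows. This is a minimal illustration, assuming the field definitions have already been retrieved as source-format XML and that the validated tags arrive as a simple dictionary; the function name apply_classification is hypothetical, and element ordering in the output is not normalised.

```python
import xml.etree.ElementTree as ET

SF_NS = "http://soap.sforce.com/2006/04/metadata"
# Keep the Salesforce namespace as the default namespace when serialising.
ET.register_namespace("", SF_NS)


def apply_classification(field_xml: str, tags: dict) -> str:
    """Return the field XML with the given classification elements set.

    `tags` maps Metadata API element names (for example
    securityClassification or complianceGroup) to the values
    approved by the data owner.
    """
    root = ET.fromstring(field_xml.encode("utf-8"))
    for name, value in tags.items():
        element = root.find(f"{{{SF_NS}}}{name}")
        if element is None:
            # The attribute was never set on this field: append it.
            element = ET.SubElement(root, f"{{{SF_NS}}}{name}")
        element.text = value
    return ET.tostring(root, encoding="unicode")
```

A wrapper script would read each field file, call apply_classification with the approved tags for that field, write the result back, and leave the deployment itself to the usual Salesforce DX commands.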


-- Conceptual Tooling API SOQL query to inspect data classification attributes on Contact fields
-- (the classification attributes are exposed as columns on FieldDefinition)
SELECT QualifiedApiName, BusinessStatus, SecurityClassification, ComplianceGroup
FROM FieldDefinition
WHERE EntityDefinition.QualifiedApiName = 'Contact'
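The name-based screening from the Audit and Discovery phase can be sketched in Python. This assumes the field inventory has already been exported as a list of API names; the keyword list comes from the examples in the text, and the function name is hypothetical. Matches are only candidates for review by the data owner, not classifications.

```python
import re

# Keywords taken from the text; extend them to match local naming conventions.
SENSITIVE_KEYWORDS = ("ssn", "birth", "salary", "phone")
_PATTERN = re.compile("|".join(SENSITIVE_KEYWORDS), re.IGNORECASE)


def flag_sensitive_candidates(field_names):
    """Return the field names that contain any sensitive keyword."""
    return [name for name in field_names if _PATTERN.search(name)]
```

Substring matching is deliberately broad: it will produce false positives, which is acceptable here because every flagged field is still validated by a business data owner.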

4. Building Data Loss Prevention (DLP) Policies Around Classification

The true power of native Data Classification metadata is realized when it is leveraged to drive automated Data Loss Prevention (DLP) and active security enforcement policies. By linking classification tags with Salesforce's security engines, architects can build dynamic, intelligence-driven defenses that adapt to the sensitivity of the data being accessed.

A primary enforcement mechanism is the integration of classification metadata with Event Monitoring's Transaction Security Policies. Transaction Security allows architects to write Apex classes that intercept user activities, such as exporting a report, running an API query, or viewing a record. Within the policy's Apex logic, developers can programmatically inspect the classification of the fields involved in the transaction. For example, if a user attempts to export a report containing more than five fields tagged with a "Restricted" sensitivity level, or any field categorized under the "HIPAA" compliance group, the policy can block the export and notify the Security Operations Center (SOC). Follow-up responses, such as freezing the user's account pending investigation, can be added through separate automation. This dynamic control protects high-risk fields without restricting reports that contain only public or low-sensitivity data.


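// Conceptual Transaction Security policy (enhanced framework) for restricted data exports.
// countRestrictedColumns is a hypothetical placeholder for the classification lookup.
global class RestrictedDataExportPolicy implements TxnSecurity.EventCondition {
    private static final Integer RESTRICTED_FIELD_THRESHOLD = 5;

    public boolean evaluate(SObject event) {
        // Only report events are evaluated; every other event type passes through
        if (event instanceof ReportEvent) {
            ReportEvent reportEvent = (ReportEvent) event;
            // Returning true triggers the action configured on the policy (for example, block)
            return countRestrictedColumns(reportEvent) > RESTRICTED_FIELD_THRESHOLD;
        }
        return false;
    }

    // Placeholder: map the report's columns to fields and count those classified as Restricted
    private Integer countRestrictedColumns(ReportEvent reportEvent) {
        return 0;
    }
}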
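The decision rule described above (block on more than five Restricted fields, or on any HIPAA field) can be modelled outside the platform to test thresholds before they are written into Apex. The following Python sketch is only a model of that rule, not deployable policy code; the shape of the classification lookup and the function name are assumptions.

```python
# "More than five" Restricted fields, per the example in the text.
RESTRICTED_THRESHOLD = 5


def should_block_export(columns, classification):
    """Decide whether an export should be blocked.

    columns: field API names included in the export.
    classification: maps a field name to a dict with a "sensitivity"
        string and a "compliance" set (an assumed lookup structure).
    """
    restricted_count = 0
    for column in columns:
        tags = classification.get(column)
        if tags is None:
            continue  # Unclassified fields are ignored in this sketch.
        if "HIPAA" in tags["compliance"]:
            return True  # Any HIPAA field blocks the export outright.
        if tags["sensitivity"] == "Restricted":
            restricted_count += 1
    return restricted_count > RESTRICTED_THRESHOLD
```

Ignoring unclassified fields keeps the sketch short, but a stricter policy might treat an unclassified field as Restricted until a data owner says otherwise.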

In addition to real-time blocking, classification metadata can drive dynamic field masking at the UI layer. By combining data classification tags with custom permission sets or profile rules, architects can configure dynamic rendering engines. If a field is classified as 'Restricted' and belongs to the 'PII' compliance group, the front-end LWC or page layout can dynamically mask the value (e.g., exposing only the last four digits of a social security number) for standard users, while displaying the full unmasked data only to users who have an active, authenticated High Assurance session. This granular control ensures that sensitive data is exposed strictly on a need-to-know basis, satisfying strict compliance frameworks while maintaining standard operational processes for everyday users.
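The masking rule can be sketched in Python to make the conditions explicit. This is an illustrative model only: the function names are hypothetical, and the high_assurance flag stands in for whatever session-level check the rendering layer performs.

```python
def mask_value(value: str, visible: int = 4, mask_char: str = "*") -> str:
    """Hide every character except the last `visible` ones."""
    if len(value) <= visible:
        return mask_char * len(value)
    return mask_char * (len(value) - visible) + value[-visible:]


def render_field(value: str, sensitivity: str, compliance, high_assurance: bool) -> str:
    """Return the value as it should be displayed to the current user."""
    needs_mask = sensitivity == "Restricted" and "PII" in compliance
    if needs_mask and not high_assurance:
        return mask_value(value)
    return value
```

For example, render_field("123-45-6789", "Restricted", {"PII"}, False) returns "*******6789", while the same call with high_assurance set to True returns the full value. Masking applied only in the front end does not protect the underlying value from API access, so it complements field-level security rather than replacing it.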

Furthermore, classification directly guides the deployment of Salesforce Shield Platform Encryption. Rather than making arbitrary decisions, architects can define a strict rule: any custom field assigned the "Restricted" sensitivity level and categorized under "PCI" or "PII" compliance must be encrypted using Shield. If a field's sensitivity is downgraded to "Internal," encryption can be turned off for that field, restoring full search and filter functionality once its existing values have been decrypted. This tight alignment between compliance classification and technical encryption ensures that Shield is deployed where it is needed, minimising performance overhead while maintaining audit-ready regulatory compliance.
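The encryption rule above is simple enough to express as a predicate, which makes it easy to run over an exported classification catalog. The following Python sketch assumes the catalog is available as a mapping of field name to tags; both function names are hypothetical.

```python
# Compliance groups that trigger encryption for Restricted fields, per the rule in the text.
ENCRYPTION_GROUPS = {"PCI", "PII"}


def requires_shield_encryption(sensitivity: str, compliance_groups) -> bool:
    """True when a field is Restricted and tagged PCI or PII."""
    return sensitivity == "Restricted" and bool(ENCRYPTION_GROUPS & set(compliance_groups))


def fields_to_encrypt(catalog: dict) -> list:
    """Return the sorted names of fields that the rule says must be encrypted.

    catalog maps a field name to a (sensitivity, compliance_groups) pair.
    """
    return sorted(
        name
        for name, (sensitivity, groups) in catalog.items()
        if requires_shield_encryption(sensitivity, groups)
    )
```

Comparing the output of fields_to_encrypt against the list of fields that currently have encryption enabled shows both gaps (fields that should be encrypted but are not) and candidates for turning encryption off.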

5. Implementing Long-Term Governance and Compliance Audits

Data classification is not a one-time project; it is an ongoing operational commitment. As business models evolve, new custom fields are introduced, and integration architectures change, the classification catalog will decay if not supported by a robust, long-term governance framework.

To prevent metadata decay, organisations must establish strict change management controls. Any developer or administrator requesting the creation of a new custom field in production must be required to specify the Data Owner, Field Usage, Sensitivity Level, and Compliance Group as mandatory items in their user story or ticket. The deployment pipeline should enforce this requirement: static metadata analysis tools can scan XML deployment packages and automatically reject any pull request containing a new custom field that lacks data classification attributes.
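A pipeline check of this kind can be sketched in Python. The example assumes source-format field files ending in .field-meta.xml and treats all four attributes as mandatory, as the text requires; the function names are hypothetical, and the data owner is accepted as either a businessOwnerUser or a businessOwnerGroup element.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

SF_NS = "http://soap.sforce.com/2006/04/metadata"
REQUIRED_ELEMENTS = ("businessStatus", "securityClassification", "complianceGroup")
OWNER_ELEMENTS = ("businessOwnerUser", "businessOwnerGroup")


def missing_attributes(field_xml: str) -> list:
    """List the classification attributes absent from one field definition."""
    root = ET.fromstring(field_xml.encode("utf-8"))

    def has_value(tag):
        element = root.find(f"{{{SF_NS}}}{tag}")
        return element is not None and bool((element.text or "").strip())

    missing = [tag for tag in REQUIRED_ELEMENTS if not has_value(tag)]
    if not any(has_value(tag) for tag in OWNER_ELEMENTS):
        missing.append("businessOwnerUser|businessOwnerGroup")
    return missing


def scan_package(root_dir) -> dict:
    """Map each unclassified field file under root_dir to its missing attributes."""
    failures = {}
    for path in sorted(Path(root_dir).rglob("*.field-meta.xml")):
        missing = missing_attributes(path.read_text(encoding="utf-8"))
        if missing:
            failures[str(path)] = missing
    return failures
```

A CI job would call scan_package on the changed package directory and fail the build when the returned dictionary is non-empty, printing the missing attributes for each file so the author knows what to add.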

Additionally, compliance officers must schedule regular, automated compliance audits. Quarterly reviews should report on the classification catalog to confirm that no sensitive fields have slipped into production without classification. These reviews should also track deprecated fields: any field flagged for deprecation under Field Usage should be archived and deleted within a defined timeframe (e.g., 90 days after deprecation) to reduce the org's schema footprint. By establishing a rigorous governance structure, automating the intake process, and leveraging classification metadata to drive active platform defense, enterprise architectures can confidently navigate complex regulatory landscapes while optimising system performance and developer agility.
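The 90-day clean-up window can be checked with a few lines of Python. Note the assumption: the classification attributes themselves do not record when a field was flagged, so this sketch expects the deprecation dates to come from elsewhere (for example, version control history or a tracking sheet). The function name is hypothetical.

```python
from datetime import date, timedelta

# Clean-up window from the example in the text.
RETENTION_WINDOW = timedelta(days=90)


def overdue_for_deletion(deprecated_on: dict, today: date) -> list:
    """Return deprecated fields whose clean-up window has elapsed.

    deprecated_on maps a field name to the date it was flagged for
    deprecation (an externally tracked value in this sketch).
    """
    return sorted(
        name
        for name, flagged in deprecated_on.items()
        if today - flagged > RETENTION_WINDOW
    )
```

Passing today in as a parameter, rather than reading the system clock inside the function, keeps the check reproducible when it is rerun for a past quarter.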

Key Takeaways

Data classification enables a risk-based approach to security, preventing the performance overhead of non-selective database encryption.
Salesforce provides native classification metadata fields (Data Owner, Field Usage, Sensitivity, Compliance) directly at the schema layer.
Classification tags are purely metadata attributes and do not encrypt or restrict access without downstream enforcement tools.
Transaction Security Policies can use classification metadata to dynamically block report exports containing fields classified as 'Restricted'.
A successful classification rollout requires establishing a clear, corporate sensitivity taxonomy before modifying metadata in production.
Long-term governance must include automated metadata validation to ensure all new custom fields are classified before deployment.

Checkpoint: Test Your Understanding

Question 1: Which of the following is a native Salesforce Data Classification metadata attribute configured directly on standard or custom fields?

A. Encryption Algorithm Strength

B. Compliance Categorisation (ComplianceGroup)

C. Dynamic Filtering Priority

D. API Endpoint Authorization Method

Question 2: What is a key benefit of aligning a mature Data Classification strategy with Salesforce Shield Platform Encryption?

A. It automatically encrypts the entire Salesforce database with a single click.

B. It ensures encryption is targeted only at highly sensitive, high-risk fields, avoiding unnecessary functional limits and search restrictions on low-risk fields.

C. It allows unauthenticated guest users to bypass field-level security constraints.

D. It completely eliminates the need for managing customer tenant secrets.

Question 3: How can classification metadata be used to actively block unauthorised data exfiltration via standard reporting?

A. By deleting the custom fields from the report builder interface completely.

B. By writing a Transaction Security Policy in Event Monitoring that evaluates the classification metadata tags of exported columns and blocks the transaction if restricted fields are included.

C. By encrypting all fields with probabilistic encryption schemes.

D. By assigning the 'View Encrypted Data' permission to all standard users.
