Salesforce Data Model Design: Objects, Relationships, and the Choices That Haunt You

What you will learn...

The three relationship types in Salesforce and the architectural consequences of choosing each
When to use standard objects vs custom objects — and the traps of over-customising standard objects
Polymorphic lookups and when they create or solve data model problems
Large object design: the performance and query implications of high-volume objects
The "fat object" anti-pattern and how it manifests in organisations with poor data governance
Data model evolution strategies — how to refactor without breaking existing integrations

Relationship Types: The Architectural Consequences

Salesforce has three primary relationship types. The choice between them is not merely a modelling preference — it has direct consequences for data ownership, cascade deletion, roll-up summary fields, and the sharing model. Getting this wrong early creates structural debt that cannot easily be refactored.

Lookup Relationship: A loosely coupled reference between two objects. The child record can exist independently — deleting the parent does not cascade-delete the child (the lookup field is simply nullified, or the delete can be blocked by a cascade setting). Lookup fields do not support roll-up summaries. The child object has its own OwnerId and participates in sharing independently. Use lookup when the child can meaningfully exist without the parent and when you need the flexibility of independent ownership.

Master-Detail Relationship: A tightly coupled ownership relationship. The child's OwnerId is controlled by the parent — the child inherits the parent's ownership and sharing. Deleting the parent cascade-deletes all children. Roll-up summary fields (count, sum, min, max) are available on the parent. An object can have a maximum of two master-detail parents. Use master-detail when the child is genuinely a component of the parent that has no meaning without it, and when roll-up summaries are needed.

Many-to-Many (Junction Object): Not a direct relationship type but a pattern: a junction object with two master-detail relationships to the two objects being related. Salesforce provides no native many-to-many relationship field — this pattern is the standard implementation. The junction object can carry its own fields to describe the relationship itself (start date, role, quantity).

🔑

Master-Detail is a one-way door: You can convert a master-detail relationship to a lookup relationship (provided all existing records meet the lookup validation criteria). You cannot directly convert a lookup to master-detail on an object with data. Converting relationship types on live objects requires a migration strategy — this is one of the choices that haunts orgs for years.

// Querying roll-up data through Master-Detail relationship
SELECT Id, Name, Total_Order_Value__c,
       (SELECT Id, Product_Name__c, Line_Total__c
        FROM Order_Line_Items__r)
FROM Order__c
WHERE Account__c = :accountId

// Roll-up summary on Account from Master-Detail Order:
// SUM(Order_Line_Items__c) is available as a formula-free field
// on the Order parent — impossible with Lookup relationships

Standard Objects vs Custom Objects: The Strategy

Salesforce provides a rich set of standard objects designed to cover common CRM scenarios: Account, Contact, Lead, Opportunity, Case, Contract, Order, Product, Asset, Campaign. The architectural question is: when should you use and extend a standard object vs create a custom object?

Use standard objects when: The business concept maps naturally to the Salesforce CRM model. Accounts are organisations you do business with. Contacts are people at those organisations. Opportunities are revenue-generating deals in progress. Using standard objects for these concepts gives you built-in Salesforce features — the Account hierarchy, the Contact to Opportunity relationship, the standard sales forecasting on Opportunity — that you would have to build manually with custom objects.

Use custom objects when: The concept is genuinely new — a type of entity not represented in the standard model. Manufacturing orders, inspection reports, project milestones, insurance policies — these are business concepts without standard Salesforce equivalents that merit their own objects.

The trap — forcing a misfit into a standard object: The most common data model mistake is using a standard object for a concept it doesn't represent, because it feels close enough. Using the Asset object to represent both customer-owned equipment (correct) and internal IT assets (forced) creates a model where the object serves two different purposes with incompatible field requirements and ownership models. The object accumulates fields for both purposes, validation rules that conflict across purposes, and reports that must filter by type to get meaningful results.

⚠️

The RecordType workaround trap: Adding RecordTypes to a standard object to serve two different business concepts is a warning sign. It is sometimes appropriate (different sales processes on Opportunity), but frequently it signals that two distinct entities are being forced into one object. Ask: "Would someone expect to see these two types in the same list view, report, or search?" If no, they probably belong in separate objects.

Polymorphic Lookups and the WhoId/WhatId Pattern

Salesforce supports polymorphic lookups — lookup fields that can reference different object types depending on the record. The Activity object (Task and Event) uses this: the WhoId field can reference a Contact or a Lead, and the WhatId field can reference an Account, Opportunity, Case, or several other objects.

Custom objects can also have polymorphic lookups. A "Communication Log" custom object might have a "Related To" polymorphic field that can reference Account, Opportunity, or Case depending on the context. This is powerful but introduces query complexity — SOQL queries against polymorphic fields require TYPEOF clauses that add complexity and have performance implications.

The architectural question is: does the polymorphism genuinely reflect business reality (one communication log that relates to different types of records), or is it a shortcut that avoids creating specific relationship fields? Specific lookup fields (Account__c, Opportunity__c, Case__c as separate nullable fields) are simpler to query and validate than polymorphic lookups, at the cost of some model elegance. Choose based on query frequency and the importance of type-agnostic access patterns.

The Fat Object Anti-Pattern

The "fat object" anti-pattern occurs when a single Salesforce object accumulates hundreds of fields over time, often covering distinct business concepts that should have been modelled as separate objects or at least as structured related data.

Fat objects emerge gradually. Phase 1 of a project adds the essential fields. Phase 2 adds "a few more fields" for a new process. Over years, the object accumulates 150, 200, 300+ fields. Page layouts become overwhelming. The object's definition includes fields for 12 different processes, half of which are irrelevant for any given record type. SOQL queries selecting "SELECT *" equivalents become slow because the database is reading wide rows.

The signals of a fat object are visible: objects approaching or exceeding 200 custom fields, page layouts with "Field Sets" just to manage the number of fields shown, reports that require complex filters to produce meaningful results, and developers who need to spend 10 minutes understanding which fields matter for a specific use case.

💡

Refactoring fat objects: The treatment for a fat object is to identify field clusters that belong to distinct processes or concepts, extract them into related child objects, and migrate historical data. This is expensive but sometimes unavoidable. The better intervention is architectural governance that reviews new field requests against data model principles before creation — preventing fat objects before they form.

Large Object Design and Query Performance

Objects that accumulate millions or tens of millions of records require specific design attention. Query performance on large objects degrades without proper indexing and query selectivity — SOQL that works fine on 100,000 records becomes unacceptably slow or hits timeout limits on 50 million records.

Salesforce automatically indexes the Id field, all external ID fields, all custom fields marked as unique, and a small set of standard fields. For frequently-queried fields that are not automatically indexed, a custom index can be requested from Salesforce support. Custom indexes dramatically improve query performance for selective queries (queries where the filter returns less than ~5-10% of total records).

Non-selective queries — queries that return a large proportion of an object's records — cannot use indexes effectively regardless of indexing configuration. A query like "SELECT Id FROM Order__c WHERE Status__c != 'Closed'" on an object with 20 million records where 95% have Status = 'Open' is non-selective and will be slow or fail. Design queries to be selective by construction, not by adding post-hoc indexes.

// Query optimization — selective filter first
// Non-selective (bad): returns most records on a large table
SELECT Id, Account__c, Total_Amount__c
FROM Order__c
WHERE Status__c != 'Closed'

// Selective (better): date constraint limits result set
SELECT Id, Account__c, Total_Amount__c
FROM Order__c
WHERE CreatedDate = LAST_N_DAYS:30
AND Status__c != 'Closed'
// Use LIMIT if you don't need all results:
LIMIT 10000

💡

Skinny tables: For objects with very high query volume on a specific subset of fields (e.g., a list view that always queries the same 5 fields), Salesforce can create a "skinny table" — a narrow database table containing only the frequently-accessed fields, dramatically reducing the I/O needed for those queries. Skinny tables are requested via Salesforce support and are available for qualifying high-volume scenarios.

Data Model Evolution and Refactoring Strategy

Data models evolve. Business processes change, acquisitions bring new data requirements, and early architectural decisions prove insufficient at scale. The challenge is evolving the model without breaking integrations, workflows, and the accumulated tooling built on top of the original design.

Additive changes are safe: Adding new fields, adding new objects, adding new relationships to existing objects — these are additive and do not break anything that was already working. This is why data model changes should default to addition where possible.

Renaming and retiring are risky: Renaming a field changes its API name if done incorrectly, breaking any hardcoded references in code, integrations, and reports. In Salesforce, the API name and the label are separate — you can change the label without changing the API name. Retiring a field (deleting it) requires auditing all references first: SOQL queries, Apex code, flows, reports, and external systems that reference the field via API.

Relationship type changes are the hardest: Converting between relationship types on objects with data requires a migration approach: new field creation, data migration via Apex/Data Loader, cutover, old field retirement. Each step requires coordination with integration systems that may reference the old field. The timeline for major relationship type changes is measured in weeks or months, not hours.

Key Takeaways

Master-Detail creates tight ownership coupling (shared OwnerId, cascade delete, roll-up summaries); Lookup is loosely coupled (independent ownership, no cascade). This choice is hard to reverse with data — make it deliberately.
Use standard objects when the concept maps naturally to the CRM model; use custom objects when the concept is genuinely new. Forcing unrelated concepts into one standard object (via RecordTypes) is a common source of long-term data model pain.
Polymorphic lookups enable type-agnostic relationships but introduce SOQL complexity. Evaluate whether specific named relationship fields are simpler and more queryable before defaulting to polymorphism.
The fat object anti-pattern — a single object accumulating hundreds of fields across unrelated processes — creates usability, performance, and maintainability problems. Prevent it with architectural governance; treat it with field extraction and data migration.
Large objects (millions of records) require selective queries and custom indexes for performance. Non-selective queries that return most of an object's records are slow regardless of indexing — design queries to be selective by construction.
Data model evolution should default to additive changes. Renaming fields risks breaking API consumers; retiring fields requires full reference auditing; changing relationship types requires migration planning measured in weeks.

Test Your Understanding

1. A developer creates an Order Line Item object with a Master-Detail relationship to Order. Later, the business requires that line items can exist without a parent order (as drafts). What is the problem?

No problem — Master-Detail allows null parent values for records in draft status

Master-Detail fields cannot be null — every child must have a parent. The relationship type must be converted to Lookup to allow parent-less records, which requires a migration.

The developer can simply mark the Master-Detail field as optional in the field definition to allow blank values

2. A SOQL query on a 30-million-record object runs: SELECT Id FROM Transaction__c WHERE Processing_Date__c > LAST_N_DAYS:30. Processing_Date__c is indexed. The query returns records from the last 30 days only (about 2% of total). What determines whether the index is used?

Indexes are never used on date/time fields in Salesforce — only text and ID fields support indexing

This query is selective (2% of records) — the index will be used. Salesforce uses indexes when the result set is approximately 5-10% or less of the total records in the object.

The query must use ORDER BY to benefit from an index in Salesforce SOQL

3. An architect sees an Account object with 280 custom fields across 15 different RecordTypes, where each RecordType uses approximately 20 of those fields. What anti-pattern does this represent, and what is the recommended treatment?

The sharing model anti-pattern — 15 RecordTypes require 15 different sharing rules that will conflict under certain org-wide defaults

The fat object anti-pattern — multiple distinct business concepts have been forced into one object. Treatment is to identify field clusters belonging to each concept and extract them into related child objects or separate custom objects.

No anti-pattern — Salesforce is designed to support many RecordTypes on a single object; 280 fields is within platform limits

Salesforce Data Model Design: Objects, Relationships, and the Choices That Haunt You

Relationship Types: The Architectural Consequences

Standard Objects vs Custom Objects: The Strategy

Polymorphic Lookups and the WhoId/WhatId Pattern

The Fat Object Anti-Pattern

Large Object Design and Query Performance

Data Model Evolution and Refactoring Strategy

Key Takeaways

Test Your Understanding

Continue Reading

Flow vs Apex: The Architectural Decision Framework

Experience Cloud Architecture: What's Actually Happening Under the Hood

Large Data Volumes: How Salesforce Handles 100M+ Records

Discussion & Feedback