In this article by Andrew Fawcett, author of the book Force.com Enterprise Architecture - Second Edition, we will discuss why it is important to consider your customers' storage needs and use cases around their data creation and consumption patterns early in the application design phase. This ensures that your object schema is optimal with respect to large data volumes, data migration processes (inbound and outbound), and storage cost. In this article, we will extend the Custom Objects in the FormulaForce application as we explore how the platform stores and manages data. We will also explore the difference between your application's operational data and configuration data, and the benefits of using Custom Metadata Types for configuration management and deployment.
You will obtain a good understanding of the types of storage provided and how the costs associated with each are calculated. It is also important to understand the options that are available when it comes to reusing or attempting to mirror Standard Objects such as Account, Opportunity, or Product, which extends the discussion further into license cost considerations. You will also become aware of the options for standard and custom indexes over your application data. Finally, we will gain some insight into new platform features for consuming external data storage from within the platform.
In this article, we will cover the following topics:
- Mapping out end user storage requirements
- Understanding the different storage types
- Reusing existing Standard Objects
- Importing and exporting application data
- Options for replicating and archiving data
- External data sources
Mapping out end user storage requirements
During the initial requirements and design phase of your application, the best practice is to create user categorizations known as personas. Personas consider the users' typical skills, needs, and objectives. From this information, you should also start to extrapolate their data requirements, such as the data they are responsible for creating (either directly or indirectly, by running processes) and what data they need to consume (reporting). Once you have done this, try to provide an estimate of the number of records that they will create and/or consume per month.
Share these personas and their data requirements with your executive sponsors, your market researchers, early adopters, and finally the whole development team so that they can keep them in mind and test against them as the application is developed.
For example, in our FormulaForce application, it is likely that managers will create and consume data, whereas race strategists will mostly consume a lot of data. Administrators will also want to manage your application's configuration data. Finally, there will likely be a background process in the application that generates a lot of data, such as the process that records Race Data from the cars and drivers during the qualification stages and the race itself, for example, sector times (a sector is a designated portion of the track).
You may want to capture your conclusions regarding personas and data requirements in a spreadsheet along with some formulas that help predict data storage requirements. This will help in the future as you discuss your application with Salesforce during the AppExchange Listing process and will be a useful tool during the sales cycle as prospective customers wish to know how to budget their storage costs with your application installed.
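The kind of spreadsheet formula described above might be sketched as follows. The persona names come from the FormulaForce example, but the user counts and monthly record counts are purely illustrative assumptions, as is the use of a flat 2 KB per record for the projection; replace them with figures from your own research.

```python
from dataclasses import dataclass

# Assumed flat size per record in KB (Salesforce typically counts 2 KB per record).
KB_PER_RECORD = 2


@dataclass
class Persona:
    name: str
    users: int                      # hypothetical number of users of this persona
    records_created_per_month: int  # hypothetical records created per user per month


def monthly_records(personas):
    """Total records created per month across all personas."""
    return sum(p.users * p.records_created_per_month for p in personas)


def projected_storage_mb(personas, months):
    """Projected data storage in MB after the given number of months."""
    total_kb = monthly_records(personas) * months * KB_PER_RECORD
    return total_kb / 1024


# Illustrative figures only, not measurements from a real org.
personas = [
    Persona("Manager", users=10, records_created_per_month=50),
    Persona("Race Strategist", users=5, records_created_per_month=20),
    Persona("Race Data process", users=1, records_created_per_month=100_000),
]

print(monthly_records(personas))            # records created per month
print(projected_storage_mb(personas, 12))   # projected MB after one year
```

Keeping background processes as their own "persona" makes it obvious when an automated process, rather than a human user, dominates the storage forecast.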
Understanding the different storage types
The storage used by your application's records makes up the most important part of the overall data storage allocation on the platform. There is also another type of storage, used by the files uploaded or created on the platform. From the Storage Usage page under the Setup menu, you can see a summary of the records used, including those that reside in the Salesforce Standard Objects.
Later in this article, we will create a Custom Metadata Type object to store configuration data. Storage consumed by this type of object is not reflected on the Storage Usage page and is managed and limited in a different way.

The preceding page also shows which users are using the most storage. In addition, on an individual's User details page, you can locate the Used Data Space and Used File Space fields; next to these are links to view that user's data and file storage usage.
The limit shown for each storage type is the greater of two values: the minimum storage allocated for the type of organization, or the number of users multiplied by a per-user allowance in MB, which also depends on the organization type. For full details of this, click on the Help for this Page link shown on the page.
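The "whichever is greater" rule can be expressed as a one-line calculation. The sketch below is only an illustration: the function name is hypothetical, and the minimum allocation and per-user figures passed in are assumed values, since the real numbers depend on the organization type and are given in the Salesforce help page mentioned above.

```python
def storage_limit_mb(minimum_mb, users, mb_per_user):
    """Return the larger of the org-wide minimum and the per-user allocation."""
    return max(minimum_mb, users * mb_per_user)


# Assumed figures for illustration: a 1024 MB minimum and 20 MB per user.
print(storage_limit_mb(1024, users=10, mb_per_user=20))   # the minimum applies
print(storage_limit_mb(1024, users=100, mb_per_user=20))  # per-user total applies
```

With these assumed figures, the per-user allocation only starts to raise the limit once the org has more than 51 users.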
Data storage
Unlike other database platforms, Salesforce typically uses a fixed size of 2 KB per record in its storage usage calculations, regardless of the actual number of fields or the size of the data within them. There are some exceptions to this rule: for example, Campaigns take up 8 KB each, and stored Email Messages use up the size of the contained e-mail. All Custom Object records, however, take up 2 KB. Note that this record size applies even if the Custom Object uses large text area fields.
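To make the per-record rule concrete, the following sketch totals data storage from record counts alone, ignoring field contents entirely, which is the point of the fixed-size model. The object API names and record counts are hypothetical; only the 2 KB default and the 8 KB Campaign exception come from the text above.

```python
DEFAULT_KB_PER_RECORD = 2
# Exceptions to the flat rate mentioned above; Email Messages are omitted
# because their size depends on the e-mail content.
SIZE_EXCEPTIONS_KB = {"Campaign": 8}


def data_storage_kb(record_counts):
    """Sum storage in KB from a mapping of object name to record count."""
    return sum(
        count * SIZE_EXCEPTIONS_KB.get(object_name, DEFAULT_KB_PER_RECORD)
        for object_name, count in record_counts.items()
    )


# Hypothetical object names and counts for illustration.
usage = {"Race__c": 20, "RaceData__c": 50_000, "Campaign": 10}
print(data_storage_kb(usage))  # total KB across all three objects
```

Because field sizes never enter the calculation, a design that splits data across many small child records costs more storage than one that holds the same data in fewer, wider records.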
File storage
Salesforce has a growing number of ways to store file-based data, ranging from the historic Document tab, to the more sophisticated Content tab, to the Files tab, as well as Attachments, which can be applied to your Custom Object records if enabled. Each has its own pros and cons for end users, and file size limits that are well defined in the Salesforce documentation.
From the perspective of application development, as with data storage, be aware of how much file storage your application is generating on behalf of the user and give them a means to control and delete that information. In some cases, consider whether the end user would be happy to have the option to recreate the file on demand (perhaps as a PDF) rather than always having the application store it.
Reusing the existing Standard Objects
When designing your object model, a good knowledge of the existing Standard Objects and their features is the key to knowing when and when not to reference them. Keep in mind the following points when considering the use of Standard Objects:
- From a data storage perspective: Ignoring Standard Objects creates a potential data duplication and integration effort for your end users if they are already using similar Standard Objects as pre-existing Salesforce customers. Remember that adding additional custom fields to the Standard Objects via your package will not increase the data storage consumption for those objects.
- From a license cost perspective: Conversely, referencing some Standard Objects might cause additional license costs for your users, since not all are available to the users without additional licenses from Salesforce. Make sure that you understand the differences between Salesforce (CRM) and Salesforce Platform licenses with respect to the Standard Objects available. Currently, the Salesforce Platform license provides Accounts and Contacts; however, to use the Opportunity or Product objects, a Salesforce (CRM) license is needed by the user. Refer to the Salesforce documentation for the latest details on these.
Use your user personas to define which Standard Objects your users use and reference them via lookups, Apex code, and Visualforce accordingly. You may wish to use extension packages and/or dynamic Apex and SOQL to make these kinds of references optional. Since Developer Edition orgs have all these licenses and objects available (although in a limited quantity), make sure that you review your Package dependencies each time before clicking on the Upload button to check for unintentional references.
Importing and exporting data
Salesforce provides a number of its own tools for importing and exporting data as well as a number of third-party options based on the Salesforce APIs; these are listed on AppExchange. When importing records with other record relationships, it is not possible to predict and include the IDs of related records, such as the Season record ID when importing Race records; in this section, we will present a solution to this.
Salesforce provides the Data Import Wizard, which is available under the Setup menu.

This tool only supports Custom Objects and Custom Settings. Custom Metadata Type records are essentially considered metadata by the platform, and as such, you can use packages, developer tools, and Change Sets to migrate these records between orgs. There is an open source CSV data loader for Custom Metadata Types at https://github.com/haripriyamurthy/CustomMetadataLoader.
It is straightforward to import a CSV file with a list of Season records, since this is a top-level object and has no other object dependencies. However, to import the Race information (which is a child object related to Season), the Season and Fastest Lap By record IDs are required, which will typically not be present in a Race import CSV file by default. Note that IDs are unique across the platform and cannot be shared between orgs.
External ID fields help address this problem by allowing Salesforce to use the existing values of such fields as a secondary means to associate records being imported that need to reference parent or related records. All that is required is that the related record Name or, ideally, a unique external ID be included in the import data file.
This CSV file includes three columns: Year, Name, and Fastest Lap By (the driver who set the fastest lap of that race, identified by their Twitter handle). You may remember that a Driver record can also be identified by this handle, since the field has been defined as an External ID field.
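Conceptually, the matching that an External ID field enables can be sketched as a lookup from a human-readable value to a record ID. The following is a simulation in plain Python, not a call to any Salesforce API: the record IDs, the race names, the Twitter handle, and the assumption that the Year column matches the Season record's Name are all illustrative.

```python
import csv
import io

# Hypothetical records already in the org, keyed by the values the import matches on.
season_ids_by_name = {"2014": "a01000000000001"}
driver_ids_by_twitter_handle = {"@LewisHamilton": "a02000000000001"}

# Hypothetical import file with the three columns described above.
csv_text = (
    "Year,Name,Fastest Lap By\n"
    "2014,Race A,@LewisHamilton\n"
    "2014,Race B,@LewisHamilton\n"
)


def resolve_rows(text):
    """Replace the lookup values in each CSV row with the matching record IDs."""
    resolved = []
    for row in csv.DictReader(io.StringIO(text)):
        resolved.append({
            "Name": row["Name"],
            "Season": season_ids_by_name[row["Year"]],
            "FastestLapBy": driver_ids_by_twitter_handle[row["Fastest Lap By"]],
        })
    return resolved


for race in resolve_rows(csv_text):
    print(race)
```

The import file never contains an org-specific ID, which is what allows the same file to be loaded into different orgs.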
    

Both the 2014 Season record and the Lewis Hamilton Driver record should already be present in your packaging org. Now, run Data Import Wizard and complete the settings as shown in the following screenshot:

Next, complete the field mappings as shown in the following screenshot:

Click on Start Import and then on OK to review the results once the data import has completed. You should find that four new Race records have been created under the 2014 Season, with the Fastest Lap By field correctly associated with the Lewis Hamilton Driver record.
Note that these tools will also stress your Apex Trigger code for volumes, as they typically have the bulk mode enabled and insert records in chunks of 200 records. Thus, it is recommended that you test your triggers to at least this level of record volumes.
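As a rough illustration of what chunking means for your triggers, the sketch below splits an import into chunks of at most 200 records. It is plain Python rather than platform code, and the record contents are hypothetical; the point is only that a 450-record import reaches the trigger as three separate invocations, two of them at the full chunk size.

```python
def chunk(records, size=200):
    """Yield successive lists of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


# Hypothetical import of 450 records.
records = [{"Name": f"Race {i}"} for i in range(450)]
batch_sizes = [len(batch) for batch in chunk(records)]
print(batch_sizes)  # [200, 200, 50]
```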
Options for replicating and archiving data
Enterprise customers often have legacy and/or external systems that are still being used or that they wish to phase out in the future. As such, they may have requirements to replicate aspects of the data stored in the Salesforce platform to another system. Likewise, in order to move unwanted data off the platform and manage their data storage costs, there is a need to archive data.
The following lists some platform and API facilities that can help you and/or your customers build solutions to replicate or archive data. There are, of course, a number of AppExchange solutions listed that provide applications that use these APIs already:
- Replication API: This API exists in both web service (SOAP) and Apex form. It allows you to develop a scheduled process to query the platform for any new, updated, or deleted records within a given time period for a specific object. The getUpdated and getDeleted API methods return only the IDs of the records, requiring you to use the conventional Salesforce APIs to query the remaining data for the replication. The frequency with which this API is called is important to avoid gaps. Refer to the Salesforce documentation for more details.
- Outbound Messaging: This feature offers a more real-time alternative to the Replication API. An outbound message event can be configured using the standard workflow feature of the platform. This event, once configured against a given object, provides a Web Service Definition Language (WSDL) file that describes a web service endpoint to be called when records are created and updated. It is the responsibility of a web service developer to create the endpoint based on this definition. Note that there is no provision for deletion with this option.
- Bulk API: This API provides a means to move up to 5000 chunks of Salesforce data (up to 10 MB or 10,000 records per chunk) per rolling 24-hour period. Salesforce and third-party data loader tools, including the Salesforce Data Loader tool, offer this as an option. It can also be used to delete records without them going into the recycle bin. This API is ideal for building solutions to archive data.
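The point about call frequency and gaps in the Replication API item above can be illustrated with a small scheduling sketch. It is a simulation under stated assumptions: the change log, the record IDs, and the `get_updated_ids` function are hypothetical stand-ins for a getUpdated-style call, and the real API's boundary handling and retention limits should be checked in the Salesforce documentation. The idea shown is simply that each polling window starts exactly where the previous one ended.

```python
from datetime import datetime, timedelta


def polling_windows(start, end, interval):
    """Split [start, end) into back-to-back windows with no gaps or overlaps."""
    windows = []
    window_start = start
    while window_start < end:
        window_end = min(window_start + interval, end)
        windows.append((window_start, window_end))
        window_start = window_end  # next window begins where this one ended
    return windows


# Hypothetical change log standing in for the platform: (record ID, modified time).
changes = [
    ("a01000000000001", datetime(2024, 1, 1, 0, 5)),
    ("a01000000000002", datetime(2024, 1, 1, 1, 0)),
    ("a01000000000003", datetime(2024, 1, 1, 2, 59)),
]


def get_updated_ids(start, end):
    """Stand-in for a getUpdated-style call: IDs modified in [start, end)."""
    return [record_id for record_id, modified in changes if start <= modified < end]


windows = polling_windows(
    datetime(2024, 1, 1, 0, 0), datetime(2024, 1, 1, 3, 0), timedelta(hours=1)
)
replicated = []
for window_start, window_end in windows:
    replicated.extend(get_updated_ids(window_start, window_end))

print(replicated)  # every changed record ID, each picked up once
```

In a real scheduled job, the end of the last successfully processed window would be persisted so that a failed or delayed run resumes from that point instead of skipping a period.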
Heroku Connect is a seamless data synchronization solution between Salesforce and Heroku Postgres. For further information, refer to https://www.heroku.com/connect.
External data sources
One of the downsides of moving data off the platform in an archive use case, or of not being able to replicate data onto the platform, is that end users have to move between applications and logins to view data; this causes an overhead, as the processes and data are not connected.
Salesforce Connect (previously known as Lightning Connect) is a chargeable add-on feature of the platform that provides the ability to surface external data within the Salesforce user interface via the so-called External Objects and External Data Sources configurations under Setup. External Objects offer functionality similar to Custom Objects, such as List views, Layouts, and Custom Buttons. Currently, Reports and Dashboards are not supported, though it is possible to build custom report solutions via Apex, Visualforce, or Lightning Components.
External Data Sources can be connected to existing OData-based endpoints and secured through OAuth or Basic Authentication. Alternatively, Apex provides a Connector API whereby developers can implement adapters to connect to other HTTP-based APIs. Depending on the capabilities of the associated External Data Source, users accessing External Objects using the data source can read and even update records through the standard Salesforce UIs, such as Salesforce Mobile and the desktop interfaces.
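For readers unfamiliar with OData, the sketch below shows the general shape of a query URL that an OData endpoint answers, using the standard `$filter` and `$top` query options. The service root, the entity set name, and the helper function are hypothetical, and this is not a description of the exact requests Salesforce Connect issues.

```python
from urllib.parse import quote, urlencode


def odata_query_url(service_root, entity_set, filter_expr=None, top=None):
    """Build an OData query URL from the standard $filter and $top options."""
    options = {}
    if filter_expr is not None:
        options["$filter"] = filter_expr
    if top is not None:
        options["$top"] = top
    url = f"{service_root.rstrip('/')}/{entity_set}"
    if options:
        # Keep "$" readable in option names; percent-encode spaces in values.
        url += "?" + urlencode(options, safe="$", quote_via=quote)
    return url


# Hypothetical endpoint and entity set, for illustration only.
url = odata_query_url(
    "https://example.com/odata", "ArchivedRaceData",
    filter_expr="Year eq 2014", top=10,
)
print(url)
```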
Summary
This article explored the declarative aspects of developing an application on the platform that apply to how an application's data is stored, and how relational data integrity is enforced through the use of lookup field deletion constraints and unique fields.
Upload the latest version of the FormulaForce package and install it into your test org. The summary page during the installation of new and upgraded components should look something like the following screenshot. Note that the permission sets are upgraded during the install.


Once you have installed the package in your test org, visit the Custom Metadata Types page under Setup and click on Manage Records next to the object. You will see that the records are shown as managed and cannot be deleted. Click on one of the records to see that the field values themselves also cannot be edited. This is the effect of the Field Manageability checkbox when defining the fields.

The Namespace Prefix shown here will differ from yours.
Try changing or adding Track Lap Time records in your packaging org; for example, update a track time on an existing record. Upload the package again and then upgrade your test org. You will see that the records are automatically updated. Conversely, any records you created in your test org will be retained between upgrades.
In this article, we have now covered some major aspects of the platform with respect to packaging, platform alignment, and how your application data is stored as well as the key aspects of your application's architecture.