Simulating data updates
Administrators and Data Managers can now test data updates in their Production instance. Source subscription jobs can be run in simulation mode so you can preview the outcome before committing the data to the database. This ensures that you have more control and confidence over the quality of your data operations.
This feature is enabled by default in your Network instance.
About simulation mode
Previously, data loading jobs could be tested if the Apply Updates & Merges setting was not enabled for a subscription. The job stopped after the match stage, so the data wasn't committed in the merge stage (the last stage of the job). Some statistics were available on the Job Details to provide an idea of what the job would do.
Now, when you run a job in simulation mode, the data is simulated during the merge stage. You can preview the results without committing any data to the database.
Examples of outcomes that you can test with simulated data:
Changes to critical fields - For example, address status, relationship type, HCO/HCP type, and so on.
Rejected records - Learn about records that will be dropped from the feed because of data issues.
Records that are rejected are not included in the simulation output, but an error displays in the Job Error Log so you can investigate the issue.
For example, if a reference code is not mapped to Network, a message displays in the log so you can fix the issue. Job errors for rejected records do not display in Test Mode.
Network expression rules - Test NEX rules in the simulated data to ensure the outcome is expected before committing it to the database.
All rule points are supported for simulation mode.
Network Address Inheritance - Review addresses that are synced from Parent HCOs in the output.
Bulk merging entities
Reviewing simulated data
Use simulation mode to preview the updates before committing the data to the database.
Simulated data can be reviewed in the following areas:
Job statistics - Counts for added and updated records display in the Job Summary Report section on the Job Details page.
When the job is run in Test Mode, statistics are not available for the Job Summary Report.
Files - A .zip file is created in the Network file system. It contains a .csv file for each object impacted by the data. For example, if the feed contains updates for HCPs, a .csv file is created for the HCP object and for each of its sub-objects and relationships. The files for all simulation jobs are archived in the FTP folder.
Reporting tables - The simulation output is available as reporting tables. Use the tables to compare the simulated data with production data. Only the tables for the last run simulation job are available. Each time a simulation job starts, the previous simulation tables are replaced.
If you identify issues, you can fix them and prevent unintended data updates from being loaded into your Network instance. When the job outcome is working as expected, run the job again using the Save Changes to the Database option to commit the changes to the production database.
Running a job in simulation mode
Source subscription configurations now include a section called Job Run Outcome at the top of the page.
You can run a job using any of the three options:
Save Changes to the Database - Commit the data updates to the database.
This option replaces the Apply Updates & Merge option that was previously in the General Settings section.
Run Job in Test Mode (Default) - Stop the job after the matching stage so no changes will be applied to the database. You can review the job statistics on the Job Details page.
This behavior is the same as when theApply Updates & Merge option was not selected.
Enable Simulation Mode - Preview the changes to the production data. No changes are committed to the database.
Run a job
To ensure that the source subscription job does not create any data issues, test it in simulation mode. The data is loaded to a restricted area where you can preview the results using reporting.
Open a source subscription (System Interfaces > Source Subscriptions).
In the Job Run Outcome section, choose Run Job in Test Mode and Enable Simulation Mode.
Click Start Job.
In the Start Job confirmation dialog, the Job Run Outcome options display again in case you want to change them.
Tip: If you are running the job using the API, you can use the mode parameter to specify the job outcome. The parameter value will override the configuration outcome only for the job started by the API call.
After the job completes, review the statistics on the Job Details page and the simulated data output.
The following sections on the page are updated to support simulation mode.
Job Settings Summary
The Merge Settings heading identifies the Job Run Outcome setting values that were applied to the job.
Job Result Summary
This section is populated with the simulated statistics of the job after the merge stage. A banner displays to assure you that the data is simulated and that no changes were committed to the database.
In this job, if the data had been committed to the database, one HCP would have been added as a local record, and two HCPs would have been updated.
Note: When a job is set to Test Mode, it stops the job after the match stage so, job statistics display only in the Data Load Summary, Processed Data Summary, and Match Summary. The Job Result Summary section is not updated on the Job Details page.
View simulated output files
In Network Explorer, a new folder in the outbound directory, data_load_simulation, contains the output files for all simulation jobs.
Access to these files is not restricted; if you have access to the folder in the File Explorer, you can view the output files.
Each simulation job creates a .zip file with the following naming convention: <system>-<subscription_name>-data-load-simulation-<timestamp>-job<ID>.zip.
The .zip file contains a .csv file for each entity that was updated or that has new records. The complete entity is included in the output, for example, if HCPs were updated or new HCPs were added (either in the data feed or they were matched and downloaded from OpenData), all of the sub-object and relationship data is added to the simulation output.
Note: If the simulation job does not create records or change any records, an empty .zip file is created.
Review the files
Reviewing the files can help you to see what might have gone wrong during the data load.
In the data_load_simulation directory, click the Download icon the to save the .zip file locally.
Extract the .csv files. A file is created for each entity that has updates or has new records. If an entity was not touched during the job, a file is not created.
Open a .csv file to review the simulated data.
The field values can help you to quickly identify the changes that occurred. For example, in the Address file, you can spot addresses that were inactivated or that were copied (synced) from a parent address through Network Address Inheritance.
Tip: You can also view the .csv files as smart tables in Network.
Considerations for simulated output
Records matched and downloaded from OpenData
If the Match & Download from OpenData setting was on in the source subscription configuration, records that matched will display in the simulated output.
Job details - Records in the source file that are matched with OpenData records display in the Job Result Summary.
if the data would have been committed to the database, one HCP record would have been created and two HCP records would have been downloaded from OpenData.
Simulated output files
For records that matched with OpenData records and would have been downloaded, the simulated files contain the data from OpenData as well as the data included in the source file.
Simulated alternate keys
If new records are created with an alternate key, the key counter is not incremented in simulation mode. Placeholder characters are used to reflect the key format.
In the simulated output, the alternate keys reflect the key format only; an alternate key is not generated.
Alpha-numeric keys: The uppercase letter X is used as a placeholder: If the alternate key format is USZJF-RQK-NKB, the simulated output displays XXXXX-XXX-XXX.
Numeric keys - The number 9 is used as a placeholder. If the alternate key format is 00000-0006, the simulated output displays 99999-9999
Report on simulated data
Reporting tables are created for the simulated data in the SQL Query Editor (Reports). Use the tables to generate reports so you can compare the simulated data to the production data in order to understand which changes the job would have made to your data.
If you have access to the SQL Query Editor, you can view the simulated tables and data. There are no access restrictions; for example, data visibility profiles, field restrictions, and dynamic access control do not apply.
To view the tables:
In the SQL Query Editor, expand the Data Load Simulation category. The simulated files are created as tables with the __s suffix.
Note: Only the tables from the last run simulation job display; the tables are replaced each time a simulation job is run. Reporting tables are not created if a simulation job does not create any new or changed records; the tables from the last job that produced changes are retained.
Expand a table to view the metadata and the fields. The metadata includes the subscription name, the job ID, and the user that created the simulation output.
Important: The tables include all data model fields for each entity so they have the same structure as production tables. This makes it easier to write the queries and compare the data.
Create SQL queries to compare the simulated data with the production data.
Tip: Use left joins to join a simulation table with the production table because you might have new records that do not exist in the production table yet.
Example 1 - HCP query
This query compares specific fields between a simulated HCP table with the production data.
SELECT sim.vid__v, sim.first_name__v AS "sim first name", curr.first_name__v AS "curr first name", sim.last_name__v AS "sim last name", curr.last_name__v AS "curr last name", sim.phone_1__v AS "sim phone 1", curr.phone_1__v AS "curr phone 1" FROM hcp__s sim LEFT JOIN hcp curr ON sim.vid__v = curr.vid__v
In these results, you can see that the first row makes changes to an existing record and the second row creates a new record.
Example 2 - Address query
This query compares the data in the address simulation table to production data.
SELECT sim.vid__v AS "Simulated Address VID", sim.address_ordinal__v AS "Simulated Address Ordinal", sim.is_primary_address__c AS "Simulated Primary Address Flag", sim.address_status__v AS "Simulated Address Status", curr.vid__v AS "Address VID", curr.address_ordinal__v AS "Address Ordinal", curr.is_primary_address__c AS "Primary Address Flag", curr.address_status__v AS "Address Status", sim.entity_vid__v FROM address__s sim LEFT JOIN address curr ON sim.vid__v = curr.vid__v
The results can tell you when ordinals are recalculated because addresses are inactivated.
Identify jobs that ran in simulation mode
Use the merge_simulation column in the job (Job Details) table to identify source subscription jobs that ran in simulation mode.
The column value is either True or False.
Source Subscriptions view
The Source Subscriptions page (System Interfaces) identifies the current configuration for each subscription.
Enabled - The subscription is configured to commit changes to the database.
Disabled - The subscription is configured to run in test mode. No changes will be committed to the database.
Enabled - The subscription is configured to simulate the job output. No changes will be committed to the database.
Disabled - The subscription is configured to run in test mode. No changes will be committed to the database.
Data flow view
After a source subscription is saved, the Data Flow View displays on the configuration page.
The setting values that are applied to the current job display in the Import Files stage.
Considerations for Veeva IDs
When incoming records are not matched to existing records in Network, new records are created for the simulated data. Veeva IDs that are generated for these new records are used only for the simulated output. If you run the subscription again to commit the data to the Network database, new Veeva IDs and are generated for the records.
Duplicate custom keys
If you load an entity with a custom key and that key already exists for a different entity in the production data, an error will display in the Job Error Log. You can fix the issue before you commit the data.
When different entities have the same custom key in an incoming data load, the duplicate keys cannot be detected in the simulated output. These types of duplicates can be detected, but only when the production database is updated.
Some properties that are used for simulated jobs are read-only to ensure that the values do not change if they are copied and pasted between subscriptions.
The following properties are read-only:
These properties cannot be changed in Advanced Mode.
Running jobs using the API
When jobs are triggered externally using the API, use the mode parameter to control the subscription job outcome. This is helpful because you can easily toggle between the different job modes without always having to update the subscription settings.
Note: Using this parameter controls the specific job that is started by the API call; it does not change the subscription configuration.
Possible parameter values
These values can be used for the mode parameter.
default - Run the subscription job using the current subscription configuration.
When one of the following parameter values are used, it overrides the configuration saved in the subscription for this job only.
save_changes - The changes are saved to the database.
test_mode - The job runs in test mode only.
simulation_mode - The job runs in simulation mode.