Adding a source subscription

Source subscriptions enable you to configure data loads into Network for matching and merging with Customer Master data or Veeva OpenData. Each source subscription must be named and linked to a system.

Prerequisite

If the source system uses code values that are different from Network's values, add reference aliases to map the codes. For more information, see Reference aliases.

Choose subscription type

  1. In the Admin console, click System Interfaces > Source Subscriptions.
  2. At the top of the Source Subscriptions page, click Add Subscription.
  3. In the list, choose one of the options for creating the subscription:

    • Use Wizard Mode - Use the wizard to simplify the process of defining the advanced properties, field normalization, and model map. When these settings have been configured, the classic subscription opens so you can continue defining match rules, Network expression rules, and so on.

      If you choose this option, follow the steps in the Source subscription wizard topic.

    • Use Classic Mode - Create the entire subscription using the traditional configuration. Manually define all of the settings, field normalization, and model map.

      If you choose this option, continue with the steps below.

Choose the job outcome

You can run a job using any of these three options:

  • Save Changes to the Database - Commit the data updates to the database.

    This option replaces the Apply Updates & Merge option that was previously in the General Settings section.

  • Run Job in Test Mode (Default) - Stop the job after the matching stage so no changes will be applied to the database. You can review the job statistics on the Job Details page.

    This behavior is the same as when the Apply Updates & Merge option was not selected.

    • Enable Simulation Mode - Preview the changes to the production data. No changes are committed to the database.

    For more information, see Simulating data updates.

Define subscription details

The Details section of the source subscription defines basic information for the subscription.

Provide the following information:

Name: This value is mandatory and must be short. Special characters and spaces are not supported; use underscores in place of spaces.

System: Select the system name for the data source that will be linked to the subscription. To import data change requests from a third party data source, a third party system must be selected.

Code: This does not display until the source subscription is created. The code is automatically generated from the name of the source subscription and is used for exporting Network configurations. The code does not change if the name changes. For more information, see View exported configuration records.

Description: Provide meaningful information about this subscription. A maximum of 200 characters is supported.

Status: New subscriptions are Enabled by default. They can be disabled when they are no longer regularly used. For more information, see Disabling subscriptions.

Set subscription options

Use the controls in the Settings section to manage the job outcomes, logging levels, and match outcomes.

In the Settings section for the subscription, you can configure the following controls, depending on the source system:

General settings

  • Allow File Reprocessing: Process the feed (the collection of files within a folder or archive that is processed during a data load using a source subscription; feeds reside on the FTP site for the Network instance in which the source subscription is configured) again, even if the files in the feed were previously processed. When this option is set, all files in the source folder are loaded, regardless of whether they have previously been loaded. This setting is enabled by default.

    If Allow File Reprocessing is not set, only new feeds (that have not previously been loaded) are processed. When this option is off, subsequent loads with the same subscription require one of the following:

    • update the FTP Path of the subscription to another folder containing the new feed files; or
    • change the system associated with the subscription

    If these requirements are not satisfied, subsequent load attempts will not be successful. You can re-enable Allow File Reprocessing after an unsuccessful load to reprocess the existing feed without changing the path or its contents.

    Note: The feed can only contain one file matching any aliases defined in the Source Files section. For example, multiple files matching an HCP alias of hcp* will cause the subscription job to fail. Ensure that the feed location contains no more than one file matching the alias.

  • Job Error Log: Export the error log for the source subscription to your FTP server. You should clear this after first use, as the error log takes time to produce.

    If the job produces errors and this setting is on, you can download the error log directly from the Job Details page, or retrieve it from FTP in the logs/<source system> directory.

  • Enable All Job Validation Rules - Run job validation rules to ensure that critical changes do not occur on a large number of records. Click the link to navigate to the Job Validation Rules page to review the enabled rules.

    For more information, see Job validation rules.

  • Allow Auto-Archive - Automatically archive the source files when a job completes.

    The source files are automatically moved from the inbound folder to the auto-archive folder on your Network FTP server if the job completes or completes with warnings. Source files are not archived if the job fails.

    For more information, see Archiving source files.

Match settings

  • Action for Unmatched & Suspect Match: The action to perform for unmatched records or suspect matches (multiple records that are suspected to be duplicates of one another, but without high enough confidence for automatic resolution). If the candidate records feature is not enabled, this list has two options: Create Valid Records and No Action (do not load unmatched records).

    • Create Valid Records - All incoming records are created as valid records in your Network instance.
    • No Action - Suspect match tasks will not be created for that subscription. Unmatched records and suspect matches are ignored and not added to your Network instance.
    • Create Candidate Records - This option is available only if candidate records are enabled for your Network instance. Incoming records that are unmatched or suspect matched are flagged as candidate records.

      Note: The resulting candidate records will be searchable locally; however, they will not be synchronized with or be searchable from downstream systems.

  • Source Dedupe: Choose whether to deduplicate (merge duplicate records for) objects or sub-objects that are loaded in this source subscription. All Veeva standard objects and sub-objects and custom objects and sub-objects are supported. For more information, see Deduplicating source data.
  • DCR Response File Provided: Displays only when a third party system is selected for the subscription.

    For third party systems, select to import DCRs. The DCR response files will be processed. The DCR file names must start with “DCR_” and must be stored in the inbound FTP folder of your Network instance.

    For more information about the DCR response file, see Importing DCRs for third party systems.

  • Match & Download from OpenData: Automatically matches the records in your Network instance with Veeva OpenData (ACT matches only; an ACT match is a high confidence match between two records that results in a merge without any human review). For more information, see Match & Download from Veeva OpenData.

    Note: Matching with Veeva OpenData only occurs if it is enabled (General Settings) for a Network instance that subscribes to Veeva OpenData for a country.

Export settings

  • Data Group Analysis: Export the data group details from this source subscription to your FTP server. Select the object to log. All Veeva standard objects and custom objects are supported. For more information, see Export data group analysis.

  • Match Analysis: Export the match results to your FTP server. Select the object or sub-object to log. All Veeva standard objects and sub-objects and custom objects and sub-objects are supported. Do not use this option if you are loading more than 100,000 records. For more information, see Match analysis log and Sub-object match logs.
  • Match Analysis Advice Filters: Filter the exported match log by result type: ACT, ASK, or Unmatched results (no matches). Selecting the type of match records enables you to focus on the match results that you care about and reduces the size of the file. All options are enabled by default.

Tip: By default, opted-out HCP records are excluded from matching. To match incoming records with these records, see Opt out matching settings.

Provide source files

In the Source Files section, provide an FTP path to a source file; for example, SRC_System/SRC_File. When you specify a folder name, you must append an asterisk (*) to include all files in that folder; for example, SRC_System/*.

Use a subfolder to contain incoming files (and separate inbound file sets). The loader examines the FTP Path folder and one level of subfolders to find files to process. Consider the hierarchy you use carefully so that file sets are easily identifiable.

Using this practice, you maintain a complete record of all data loaded, and previously loaded file sets are not overwritten on subsequent loads.
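
For example, a feed folder hierarchy under a subscription's FTP Path might look like the following (the folder and file names are hypothetical):

SRC_System/
    20190305_feed/
        hcp_20190305.csv
        address_20190305.csv
    20190402_feed/
        hcp_20190402.csv
        address_20190402.csv

With an FTP Path such as SRC_System/*, the loader examines the folder and one level of subfolders, so each dated subfolder remains a self-contained, identifiable file set.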

The FTP path that you provide is not validated, so ensure that it is correct; otherwise, your source subscription job can fail.

Note: Source subscriptions also support import of .csv files compressed using the gzip (.gz) compression format.

For information about accessing the files in your Network file system, see File Explorer.

File limit

A maximum of 128 files, or feeds, can be processed in each job. If more than 128 files are found in the specified FTP folder, the job will fail and a Data loading configuration error message displays on the page.

Source subscriptions look for unprocessed files; if the Allow File Reprocessing option is selected, all of the files in the specified FTP path are considered. Each file starts an independent job, so the number of jobs that can run is limited to avoid issues. Users are responsible for managing the files in their FTP folders.

User access to FTP directories

When you define the FTP Path for each subscription, you can use directories that only users with an explicit need can access. In user profiles, administrators can identify specific FTP folders that each user can access. This protects your data in other FTP directories. For more information, see Add a user profile.

Add file definitions

To add source files, click the Add File link and provide the following in the File Definitions section:

  • File Name: Provide the name of a source file. The name should contain only numbers, letters, and underscores (_) and should not include the file extension; for example, .csv.
  • Alias: The simplified name of the file that is used to reference it in the model map code. The alias should be a simple term using only letters, numbers, and underscores (_).

    Example

    If you include a file in the feed called hcpdata_vids_20190305.csv, create an alias called HCPdata with a value of hcpdata* so you can refer to the simplified name in your model map.

    Tip: Avoid using file aliases that match object names (for example, HCO) because they can be confusing in the model map. Use a unique name that will not be misunderstood; for example, XHCO.

  • Key Columns: Type keys from the external system, delimited by commas. This should include columns used for keys, anchors, and joins.

  • Text Qualifier: Select a qualifier to use for the beginning and end of text.
  • Format: Select Delimited or Fixed Length to indicate the file format.
  • Delimiter: Identify how the data in the file is separated.
  • Header Row?: Select the Header Row checkbox to indicate that the file's first row is a comma-separated list of column headers.

To add more files for processing, click the Add File link again.
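
For illustration, a completed file definition for the earlier HCPdata example might look like the following sketch. It assumes that the hcpdata* pattern is entered as the File Name, and the Key Columns value, HCP_ID, is a hypothetical column name:

File Name: hcpdata*
Alias: HCPdata
Key Columns: HCP_ID
Text Qualifier: "
Format: Delimited
Delimiter: Comma (,)
Header Row?: Selected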

Source file column headers

Column headers for source subscriptions are case-sensitive, which can cause data loading issues. To format the loaded column headers as either lowercase (default) or uppercase letters, add one of the following properties to Advanced Mode:

"parser.attribute.rules": "network:force_lowercase_attributes"

or

"parser.attribute.rules": "network:force_uppercase_attributes"

When the subscription runs and the files are loaded, the column header values will be formatted to lowercase or uppercase letters.

Considerations for attributes

If you reference column headers in the source subscription configuration (for example, in Network Expression Rules, Model Map, Field Normalization, or Transformation Queries sections), manually edit those references so they use the same case as the column headers in the file (uppercase or lowercase).

Avoiding special characters in source files

Because some characters are used as parser values for processing the source file, they should not be used in source files. Avoid using the following characters:

  • comma (,)
  • quote (")
  • pipe (|) - This is often reconfigured to a comma (,) using the advanced subscription property "parser.csv.separator" : ",".
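
For reference, the parser properties mentioned in this topic can be set together in Advanced Mode. The following is a minimal sketch, assuming you want both the separator value shown above and the lowercase attribute rule from the previous section; combining them in a single properties block is illustrative only:

{
  "parser.csv.separator": ",",
  "parser.attribute.rules": "network:force_lowercase_attributes"
}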

Create a model map

The model map defines the objects and sub-objects and how the data is provided in the incoming files, along with instructions on how to use them. The following samples illustrate model maps for two different sources.

Note: Unless they are anchored from an address file, licenses should be modeled against the HCP or HCO.

Examples
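
A minimal sketch of a model map for a single HCP file follows. The attribute expression uses the HCP.* AS * form referenced later in this topic, and the file is referenced through the HCPdata alias from the earlier example; the key names entity, from, and attributes are illustrative assumptions, not a definitive format:

[
  {
    "entity": "HCP",
    "from": "HCPdata",
    "attributes": [
      "HCP.* AS *"
    ]
  },
  {
    "entity": "ADDRESS",
    "from": "HCPdata",
    "attributes": [
      "ADDRESS.* AS *"
    ]
  }
]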

Define Network Expression Rules

Use rule expressions (NEX rules) to enforce data quality standards when you process externally sourced data through source subscriptions. Configure the rule expressions using a set of functions and operators to define specific rules for a particular stage of the data load.

Example

This NEX rule uses the CLEANSE_STRING function to cleanse and standardize the incoming data for the Company Name field on the Company object. The rule is applied to the Transformation stage so the data is cleansed before matching and merging begins in the job.

To add a NEX rule:

  1. Click Add Rule.

  2. Expand the Rule Point and choose the stage of the job at which the rule should be applied.

  3. Expand File / Entity to select the object.

  4. Create the NEX rule here or in the NEX Tester feature (System Interfaces > NEX Tester).

    The NEX Tester includes syntax highlighting and formatting to help you to easily create the rule. You can test the rule using a valid Veeva ID and then copy the expression and paste it directly in this source subscription configuration. For details, see NEX tester.

    NEX rules must be created using specific formats in order to be valid.

  5. Click Verify. If the rule syntax is valid, the message NEX is valid displays.

Multiple rules can be created.

For detailed information about the data loading stages and valid formats for NEX rules, see Using rule expressions.

Define normalization rules

The normalization rules define field mappings from columns in an incoming source file to the attributes in the Network data model.

These mappings are divided into arrays, which correspond to entities in the Network external data model (HCP, HCO, ADDRESS, LICENSE, PARENTHCO, and CUSTOMKEYS).

Considerations

  • Field normalizations are case-sensitive. For example, if you configure the field normalization "HCP.city": "locality__v", the column name city is case-sensitive and must exactly match the column header in the source file.
  • If aliases are used in the model map (* AS *), do not prefix the entity in the field normalization. For example, if "HCP.* AS *" is defined in the model map, the field normalization for HCP should be "LASTNAME" instead of "HCP.LASTNAME". This does not apply to the VDM attributes (for example, "HCP.VDM_* AS *").

Examples

The following are examples of properties for two sources, in JSON format.
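
As an illustration, a minimal field normalization sketch for a hypothetical HCP file is shown below. The "HCP.city": "locality__v" mapping comes from the Considerations above, and first_name__v and last_name__v are standard Network HCP fields; the surrounding array structure is an assumption for illustration only:

{
  "HCP": [
    {
      "HCP.firstname": "first_name__v",
      "HCP.lastname": "last_name__v"
    }
  ],
  "ADDRESS": [
    {
      "HCP.city": "locality__v"
    }
  ]
}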

Define match rules

Network includes a pre-built set of match rules for each country that has been tested and tuned. These match rules should be the starting point for any additional configuration and are included by default in each new source subscription.

The regular match process loads external data into Network and manages incoming records based on the match outcome; for example, high confidence (ACT) matches are merged automatically, while lower confidence matches can create suspect match tasks.

Define the match configuration

The Match Configuration section contains two options:

  • Use Match Configuration - Define specific match rules by country for this source subscription. This is the default for new subscriptions.

    The Basic and Advanced tabs display after the Country Group is defined.

    If this option is selected and match rules are not defined, the default match configuration is used.

  • Use Match Rule Collection - Use an existing match rule collection.

    Match collections can be created and applied to multiple subscriptions that have common match configurations. Using a match rule collection simplifies match configurations.

    Match rule collections are created in System Interfaces > Match Rule Collections. For more information, see Match rule collections.

Note: Match rules can be configured for custom objects using the Advanced mode only.

Match configurations contain the following tabs:

  • Data Groups - Restrict the set of records to compare.

    Data groups should be generic enough to include all possible matches, but cannot include more than 1,000 records.

  • Match rules - Provide detailed rules to compare records.

  • Match filters - Conditions that apply to all defined match rules to include or exclude specific records from being considered for match pairs.

  • Ranked filter groups - Match conditions that are applied to the subscription to include or exclude records that are considered for matching.

For detailed information, refer to Match configuration.

Applying filters to incoming records

If filters are configured for individual match rules or all match rules, the rules can consider incoming records even if they are missing the field value of the filter.

  • Apply match rules whether incoming records have a value in a filtered field or not - Select to allow incoming records with empty or missing field values to be considered for match rules.
  • Apply match rules only if incoming records have a value in a filtered field - Select if the incoming data is robust and has values in filtered fields.

    This is the default for match rules because it preserves the existing behavior.

These options are supported when the filter function is Include. Exclude functions require the records to have the specified field and value.

Add transformation queries

Define the transformation queries to pre-process data before it is loaded by the source subscription. For example, if a source system provides a single file containing many entities, use transformation queries to split the data into individual files so the model map can be simplified.

Transformation queries are SQL based (SELECT statements). They should be thoroughly tested before they are added to the source subscription.

To apply transformation queries:

  • Click Add Query to add a transformation query to the subscription. Only inbound queries display in the list.

You can apply multiple queries to each subscription. The queries always run in the sequence in which they are listed in the subscription. Change the order of the queries using the Handle icon.

Schedule subscriptions

In the Job Schedule & Triggers section, define the schedule for the job and any subsequent actions that will start when this job finishes.

Job Schedule

You can run subscriptions manually or on a scheduled basis. If you select Manual, the subscription only runs when you click the Start Job button on the subscription page.

The Schedule section defines the schedule settings for the subscription, based on the timezone in your user profile settings. For more information, see Job schedules.

You can schedule the subscription for the following intervals:

  • Hour: Runs the subscription every hour. Specify the number of minutes past the hour.

  • Day: Runs the subscription every day. Specify the time of day (hour and minutes).

  • Week: Runs the subscription every week. Specify the days of the week and time of day (hours and minutes).

  • Month: Runs the subscription monthly. Specify the day of the month and time of day (hours and minutes). Note that if you specify a day value that doesn’t exist for all months (for example, 31), the schedule will not run for those months.

  • Year: Runs the subscription every year. Specify the month, the day of the month, and time of day (hours and minutes). As with a monthly schedule, ensure that the day you choose falls within the month you choose.

To define another schedule for the job to run, click Add Schedule.

Tip: To run a job at month end, select Month and 1st as the day; the 1st exists in every month, so the job reliably runs just after each month ends.

Considerations for defining multiple schedules

If jobs are scheduled to run close together, the following situations might occur:

  • If there are no changes, the job does not run.
  • If a job has not finished, and a second job is scheduled to begin, the second job will not run.

Job Triggers

You can choose to trigger other actions to start after a job runs.

Available triggers:

  • Send email - Specify users that should be notified for successful and unsuccessful job outcomes.
  • Start a job - Start a subsequent job when this job successfully completes.

    Important: Source subscriptions that are run in Test Mode or Simulation Mode cannot trigger another job. Ensure that the source subscription configuration is set to Save Changes to the Database if you want it to trigger another job to start.

For more information, see Subscription job triggers.

Save subscriptions

When you have completed the source subscription configuration, save your changes.

For all new subscriptions, a dialog displays that requires answers to questions about the origin of the data.

Select either Yes or No for each question.

  • Does your company purchase data from a third party other than Veeva (for example, data from IQVIA)?
    • If you answer No to this question, click Confirm.
    • If you answer Yes to this question, another confirmation is required.

      Important: The TPA response is logged for audit purposes. Ensure that the correct third-party system is selected in the System field.

Select the appropriate response:

  • I have confirmed that a third party access agreement (TPA) is in place that allows Veeva to receive this data. (Note: this is required for any IQVIA data.)

  • I have confirmed that my company's agreement with the applicable third-party data supplier does not require a TPA in order for Veeva to receive this data. (Note: this is not applicable to IQVIA data.)

The answers to the questions are auditable and are logged in the System Audit History. For information, see Confirming third-party agreements.

Data Flow View

After you save a source subscription, a new Data Flow View is added to the Details section to visualize all of the steps of the job.

Click the Data Flow View thumbnail to open the view.

These stages of the job are defined:

  • Start - Indicates if the job has been triggered by another job.

  • Files in Inbound Folder - The file path of the inbound folder. Click the link to open the File Explorer in a new tab.

  • Apply Transformation Queries - Details about each query that is applied to the subscription. If there are multiple queries, they are listed in the order that they run. Click the query name to open the transformation query configuration. Click View Query to see the query. This step does not display if transformation queries are not applied.

  • Import Files - The file definitions for each file that is loaded. The Import Options section defines the options that are set in the source subscription configuration.

  • End - Indicates if this job triggers email notification or another job.

To return to the job configuration, click Back to Subscription Page.

Unsaved changes

If you have made changes to the source subscription configuration, the Data Flow View does not reflect those changes until the subscription is saved. A message displays if you open the view before saving the subscription.

Run subscription jobs

After a source subscription runs, the details of the job can be viewed by clicking the link in the Last Job Status column on the Source Subscriptions page. For detailed information, see Viewing source subscription job details.

Source Subscriptions view

After a subscription is configured, it displays on the Source Subscriptions page (System Interfaces).

The details identify the name, source system and schedule for each subscription. There are also details about the current configuration:

  • Match Configuration - Identifies if the subscription uses the defined match configuration or uses a match rule collection.

    The match rule collection name displays as a link. Click the link to open the match rule collection configuration for more details.

  • Last Job Status - The status of the job that most recently ran. To view the details, click the link in the column. For detailed information, see Viewing source subscription job details.

  • Save Changes - Indicates if the data was saved to the Network database.

    • Enabled - The subscription is configured to commit changes to the database.

    • Disabled - The subscription is configured to run in test mode. No changes will be committed to the database.

  • Simulation Mode - Indicates if the job was run to test the data and simulate the updates.

    • Enabled - The subscription is configured to simulate the job output. No changes will be committed to the database.

    • Disabled - The subscription is not configured to simulate the job output.

  • Status - The subscription state (On or Off).