Creating data groups

AD
DM

Administrators define data groups from within a source subscription, ad hoc match configuration, or default match configuration.

Best practice

Use the following guidelines to build effective data groups.

  • Keep each data group to 1000 entries or fewer. Matching stops at the 1000th entry, so data groups that have more than 1000 entries can cause obvious matches to be missed.
  • Use only as many data groups, and only as much information in each data group, as you need. The more data groups there are, or the larger each group is, the longer the match process takes. However, the data groups must still capture most or all possible combinations of the data in the incoming file so that matches are not overlooked.
  • Fields used in data groups must exist in the incoming file.
  • Configure data groups precisely. Data grouping uses exact matches only; it does not consider fuzzy matches. For example, if one of the data group definitions groups on corporate name alone, you won’t be able to compare HCOs with similar names in that particular group because each group contains one unique corporate name.

Create a data group

  1. To the right of the data group list header, click the Entity drop-down list and select HCP or HCO to work with.
  2. Click the + Add Data Group link at the bottom of the list (or the + button at the top right of the list) to add a new data group.
  3. Type a descriptive name for the new group.
  4. Type fields to include in the group. As you type, auto-complete options appear.
  5. Click the Allow null fields field and select fields in the group for which empty fields are allowed. (Fields in the data group that are not listed here will not allow null field values.) Null value options must be applied to all fields on an entity or object.
  6. Click the Enabled checkbox to include this data group in the matching process.
  7. Include optional descriptive comments for the data group and click Done to add the new data group.
  8. Click Save at the top of the page to save these changes to the subscription.

Note: The Allow null fields option will not allow null blocks to be created. If an incoming record has no values in the field or fields specified in an individual data group, a block will not be created for that particular definition. If the incoming record does not have values in all of the fields specified in the entire data group definition, that record will fail during the load and an error message will be displayed in the job details.

Create data groups in Advanced mode

In Advanced mode, you define data groups directly in XML.

You can create only one data group definition, but it can contain any number of data groups. The HCP and HCO entities must each have a separate data group definition. Data group definitions typically contain the following elements:

  • <staticBlocker> and </staticBlocker> - These elements must be included as the first and last lines respectively.
  • <archetype> and </archetype> - These elements surround each data group within the overall data group definition.
  • <field> and </field> - These elements surround field names used for a data group; for example, <field>corporate_name__v</field>. Data groups can contain multiple fields, each listed within separate <field> elements.
  • <child> and </child> - These elements must surround fields that belong to sub-objects; for example, Licenses or Addresses. Within these elements, you must include the following:
    • <base> and </base> - These elements indicate the sub-object (Licenses or Addresses) that the sub-object fields belong to; for example, <base>addresses__v</base> or <base>licenses__v</base>.
    • <field> and </field> - A field name for the sub-object; for example, <field>locality__v</field>.
  • <allowNullFields> and </allowNullFields> - These elements surround a value (true or false) that indicates whether null values are considered during the match process. If false, data groups are only created for records with values in all specified fields. If true, data groups are created even if one or more fields have null values. The default is true.
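
As a rough sketch of how these elements nest for a sub-object other than Addresses, the fragment below groups on last name plus a license field. The field name license_number__v is an assumption used for illustration only and should be checked against your data model; the other names appear elsewhere in this topic. The XML comments are explanatory; remove them if the Advanced mode editor does not accept comments.

```xml
<staticBlocker>
  <archetype>
     <!-- Field on the main entity -->
     <field>last_name__v</field>
     <child>
        <!-- Sub-object that the field below belongs to -->
        <base>licenses__v</base>
        <!-- Assumed field name, for illustration only -->
        <field>license_number__v</field>
     </child>
     <!-- Group only records that have values in every field listed above -->
     <allowNullFields>false</allowNullFields>
  </archetype>
</staticBlocker>
```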

The following sample data group definition groups data into blocks containing the following unique combinations:

  • first name (first_name__v), street (thoroughfare__v), and city (locality__v)
  • last name (last_name__v), street (thoroughfare__v), and city (locality__v)

<staticBlocker>
  <archetype>
     <field>first_name__v</field>
     <child>
        <base>addresses__v</base>
        <field>thoroughfare__v</field>
        <field>locality__v</field>
     </child>
  </archetype>
  <archetype>
     <field>last_name__v</field>
     <child>
        <base>addresses__v</base>
        <field>thoroughfare__v</field>
        <field>locality__v</field>
     </child>
     <allowNullFields>false</allowNullFields>
  </archetype>
</staticBlocker>

In the first data group, blocks are created for every combination of first name, street, and city in the data. In the second data group, blocks are created for every combination of last name, street, and city. Consequently, the number of blocks created by a data group definition can be very large.
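
Because HCP and HCO entities need separate data group definitions, the HCP sample above does not cover HCOs. A minimal sketch of an HCO counterpart might look like the following, assuming that grouping on corporate name and city suits your data; whether this combination is appropriate depends on the incoming file. The XML comments are explanatory and can be removed.

```xml
<staticBlocker>
  <archetype>
     <!-- HCO field on the main entity -->
     <field>corporate_name__v</field>
     <child>
        <!-- Address sub-object that the city field belongs to -->
        <base>addresses__v</base>
        <field>locality__v</field>
     </child>
     <!-- Skip records that are missing the corporate name or the city -->
     <allowNullFields>false</allowNullFields>
  </archetype>
</staticBlocker>
```

Because data grouping uses exact matches only, HCOs whose corporate names differ even slightly fall into different blocks under this sketch and are not compared with each other within this data group.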