Configuring null matching in features

AD
DM

The null matching configuration for a collation defines how match outcomes are determined when null values exist. By default, matching ignores null values (the Neutral option).

You can configure one of four options to determine the result when records are compared. This matters because a null value can exist in the compared field in either record, or in both.

Field 1 | Field 2 | Do not match | Neutral | Match | Match only if both are null
null | null | REFUTE | ABSTAIN | SUPPORT | SUPPORT
null | value | REFUTE | ABSTAIN | SUPPORT | REFUTE
value | null | REFUTE | ABSTAIN | SUPPORT | REFUTE
value | value | compare | compare | compare | compare

Configure null value options

When matching on a name, the middle name field is not always populated, so null matching set to Do not match would result in the name matching rule failing if one record had a middle name and the other did not.

Using null matching set to Neutral (which is the default) allows the rule to compare middle names only if they exist in both records. If this field is blank in either record, the middle name portion of the feature would return a value of ABSTAIN and won’t make the match process fail. This is a good option if there are multiple features in the feature set.

When there is only one feature in the feature set (for example, NPI), use null matching set to Match. This ensures that null values match anything, so the feature returns a value of SUPPORT and the match does not fail.
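
As a rough sketch of the single-feature case described above, a feature that compares only an identifier could set null matching to WILDCARD, which is the Advanced Mode value for Match. The feature name and the field name `npi_num__v` are assumptions for illustration, and the comparison element is deliberately left out; substitute the field and comparison your own configuration uses.

```xml
<feature>
    <!-- hypothetical feature name -->
    <name>NPI matches</name>
    <enabled>true</enabled>
    <comments></comments>
    <collate>
        <direct>
            <!-- assumed field name for NPI; use the field from your data model -->
            <field>npi_num__v</field>
            <!-- WILDCARD is "Match" in the Match UI: a null on either side returns SUPPORT -->
            <nullMatching>WILDCARD</nullMatching>
            <!-- comparison element omitted here; add the comparison your configuration uses for this field -->
        </direct>
    </collate>
</feature>
```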

Example Feature configuration

The null value options have different names in the Match UI and in Advanced Mode.

Match UI | Advanced Mode
Do not match | STRICT
Match | WILDCARD
Match only if both are null | VALUE
Neutral | IGNORE

Advanced Mode

A name-matching feature that requires first and last name but treats middle name as Neutral looks like this in Advanced Mode.

<feature>
    <name>full names are similar</name>
    <enabled>true</enabled>
    <comments></comments>
    <collate>
        <direct>
            <field>first_name__v</field>
            <field>last_name__v</field>
            <nullMatching>STRICT</nullMatching>
            <jaroWinklerComparison>
                <usingWinklerExtention>false</usingWinklerExtention>
                <usingLargeStringTolerance>false</usingLargeStringTolerance>
                <threshold>0.80</threshold>
            </jaroWinklerComparison>
        </direct>
        <!-- look at middle name separately so its absence doesn't throw off the results -->
        <direct>
            <field>middle_name__v</field>
            <nullMatching>IGNORE</nullMatching>
            <jaroWinklerComparison>
                <usingWinklerExtention>true</usingWinklerExtention>
                <usingLargeStringTolerance>false</usingLargeStringTolerance>
                <threshold>0.80</threshold>
            </jaroWinklerComparison>
        </direct>
    </collate>
</feature>

Null matching outcomes

The following table illustrates matching options and outcomes between fields where at least one of the values is null. Outcomes are one of the following:

  • Match - "Yes" vote within the feature (denoted by a check mark)
  • No match - "No" vote within the feature (denoted by an x)
  • Neutral - No vote; no impact on the final match outcome for the complete feature (denoted by a -)

Record 1 | Record 2 | Match (wildcard) | Do not match (strict) | Neutral (ignore) | Match only if both are null (value)
Null | Null | ✓ | x | - | ✓
Value | Null | ✓ | x | - | x
Null | Value | ✓ | x | - | x

The following examples illustrate null matching outcomes where the match feature includes a first name, middle name, and last name.

Example 1

In this example, the complete feature fails the match rules because at least one component of the feature did not match.

In this case, the feature failed because the middle name did not match. Basing a match decision on the middle name does not follow best practice; a lower-weight field like this should use Match or Neutral. For middle name in particular, use Match when the match rule also compares only the first initial, using the substring option.

Field name | Record 1 | Record 2 | Result | Outcome by field
First Name | John | Null | - | The match rule used by the first name field will not have a vote in the overall outcome of this feature.
Middle Name | Michael | Null | x | The match rule used by the middle name field will vote "No," meaning the middle name fields for these records do not match.
Last Name | Null | Smith | ✓ | The match rule used by the last name field will vote "Yes," meaning the last name fields for these records match.
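
For illustration, one Advanced Mode configuration that would produce the votes in Example 1 is sketched below. The example does not state the settings, so this is inferred from the outcome tables: Neutral (IGNORE) on first name, Do not match (STRICT) on middle name, and Match (WILDCARD) on last name. The feature name is hypothetical, and the comparison settings are copied from the earlier sample only to keep the fragment complete. VALUE on the middle name would give the same "No" vote here, because only one of the two values is null.

```xml
<feature>
    <!-- hypothetical feature name -->
    <name>example 1 name feature</name>
    <enabled>true</enabled>
    <comments></comments>
    <collate>
        <!-- first name: Neutral, so John vs. null abstains -->
        <direct>
            <field>first_name__v</field>
            <nullMatching>IGNORE</nullMatching>
            <jaroWinklerComparison>
                <usingWinklerExtention>false</usingWinklerExtention>
                <usingLargeStringTolerance>false</usingLargeStringTolerance>
                <threshold>0.80</threshold>
            </jaroWinklerComparison>
        </direct>
        <!-- middle name: Do not match, so Michael vs. null votes "No" and the feature fails -->
        <direct>
            <field>middle_name__v</field>
            <nullMatching>STRICT</nullMatching>
            <jaroWinklerComparison>
                <usingWinklerExtention>false</usingWinklerExtention>
                <usingLargeStringTolerance>false</usingLargeStringTolerance>
                <threshold>0.80</threshold>
            </jaroWinklerComparison>
        </direct>
        <!-- last name: Match, so null vs. Smith votes "Yes" -->
        <direct>
            <field>last_name__v</field>
            <nullMatching>WILDCARD</nullMatching>
            <jaroWinklerComparison>
                <usingWinklerExtention>false</usingWinklerExtention>
                <usingLargeStringTolerance>false</usingLargeStringTolerance>
                <threshold>0.80</threshold>
            </jaroWinklerComparison>
        </direct>
    </collate>
</feature>
```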

Example 2

In this example, the complete feature passes the match rules because no component of the feature voted "No."

Classifying these records as a match does not follow best practice, and this configuration should not be used. The last name, and to a lesser degree the first name, should both have values for records to be considered a match. In this example, record "John Michael <no last name>" is considered a match to "<no first name> Michael Smith." Name fields are very important and should always contain values for records to be considered a match.

Field name | Record 1 | Record 2 | Result | Outcome by field
First Name | John | Null | ✓ | For this record, the match rule used by the first name field will vote "Yes," meaning the first name fields for these records match.
Middle Name | Michael | Null | - | For this record, the match rule used by the middle name field will not have a vote in the overall outcome of this feature.
Last Name | Null | Smith | ✓ | For this record, the match rule used by the last name field will vote "Yes," meaning the last name fields for these records match.

Example 3

In this example, the complete feature fails the match rules because at least one component of the feature did not match.

Failing a match because the last name did not match while the other name fields did match is reasonable. However, be aware that for a pair of records where the last name is populated in both, these settings would consider "John Michael Smith" a confident match to "<no first name> Michael Smith." The first name field is an important one, and both first name and last name fields should contain values to be considered a confident match.

Field name | Record 1 | Record 2 | Result | Outcome by field
First Name | John | Null | ✓ | For this record, the match rule used by the first name field will vote "Yes," meaning the first name fields for these records match.
Middle Name | Michael | Null | - | For this record, the match rule used by the middle name field will not have a vote in the overall outcome of this feature.
Last Name | Null | Smith | x | For this record, the match rule used by the last name field will vote "No," meaning the last name fields for these records do not match.

Example 4

In this example, the complete feature fails the match rules because at least one component of the feature did not match.

Failing a match because first name and last name did not match due to null values is best practice. These are key fields that should always have values to be considered a confident match.

Field name | Record 1 | Record 2 | Result | Outcome by field
First Name | John | Null | x | For this record, the match rule used by the first name field will vote "No," meaning the first name fields for these records do not match.
Middle Name | Michael | Null | - | For this record, the match rule used by the middle name field will not have a vote in the overall outcome of this feature.
Last Name | Null | Smith | x | For this record, the match rule used by the last name field will vote "No," meaning the last name fields for these records do not match.