Configuring null matching in features


The null matching configuration for a collation defines how match outcomes are determined when null values exist. By default, matching ignores null values (the Neutral option).
Configuration can use any of four options to indicate the result when records are compared. This is important since null values can exist in a field in either or both of the records being compared.
Field 1 | Field 2 | Do not match | Neutral | Match | Match only if both are null |
---|---|---|---|---|---|
null | null | REFUTE | ABSTAIN | SUPPORT | SUPPORT |
null | value | REFUTE | ABSTAIN | SUPPORT | REFUTE |
value | null | REFUTE | ABSTAIN | SUPPORT | REFUTE |
value | value | compare | compare | compare | compare |
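The table can also be read as a small decision function. The following is a minimal Python sketch of that reading, not the product's implementation: the `Vote` enum and the `null_vote` function are hypothetical names introduced here for illustration, and the option strings are the Match UI labels from the table header.

```python
from enum import Enum
from typing import Optional


class Vote(Enum):
    SUPPORT = "SUPPORT"
    REFUTE = "REFUTE"
    ABSTAIN = "ABSTAIN"
    COMPARE = "compare"  # both values present: defer to the comparison rule


def null_vote(field_1: Optional[str], field_2: Optional[str], option: str) -> Vote:
    """Return the table outcome for one pair of field values (hypothetical helper)."""
    null_count = (field_1 is None) + (field_2 is None)
    if null_count == 0:
        # No nulls involved, so the null matching option does not apply.
        return Vote.COMPARE
    if option == "Do not match":
        return Vote.REFUTE
    if option == "Neutral":
        return Vote.ABSTAIN
    if option == "Match":
        return Vote.SUPPORT
    if option == "Match only if both are null":
        return Vote.SUPPORT if null_count == 2 else Vote.REFUTE
    raise ValueError(f"unknown null matching option: {option!r}")


print(null_vote(None, "Smith", "Neutral").value)                    # ABSTAIN
print(null_vote(None, None, "Match only if both are null").value)   # SUPPORT
print(null_vote(None, "Smith", "Match only if both are null").value)  # REFUTE
```

Only the last option distinguishes one null from two nulls; the other three return the same outcome whenever at least one value is null.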
Configure null value options
When matching on a name, the middle name field is not always populated, so null matching set to Do not match would result in the name matching rule failing if one record had a middle name and the other did not.
Using null matching set to Neutral (which is the default) allows the rule to compare middle names only if they exist in both records. If this field is blank in either record, the middle name portion of the feature would return a value of ABSTAIN and won’t make the match process fail. This is a good option if there are multiple features in the feature set.
When there is only one feature in the feature set (for example, NPI), use null matching set to Match. This ensures that null values match anything, so the feature returns a value of SUPPORT and the match does not fail.
Example Feature configuration
The null value options have different names in the Match UI and in Advanced Mode.
Match UI | Advanced Mode |
---|---|
Do not match | STRICT |
Match | WILDCARD |
Match only if both are null | VALUE |
Neutral | IGNORE |
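When moving between the two views, it can help to keep the naming table as a lookup. This is only an illustrative Python sketch; the dictionary names are hypothetical, and the label pairs are copied from the table above.

```python
# Match UI label -> Advanced Mode value, copied from the table above.
UI_TO_ADVANCED = {
    "Do not match": "STRICT",
    "Match": "WILDCARD",
    "Match only if both are null": "VALUE",
    "Neutral": "IGNORE",
}

# Reverse lookup: Advanced Mode value -> Match UI label.
ADVANCED_TO_UI = {value: label for label, value in UI_TO_ADVANCED.items()}

print(UI_TO_ADVANCED["Neutral"])   # IGNORE
print(ADVANCED_TO_UI["WILDCARD"])  # Match
```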
Advanced Mode
In Advanced Mode, the feature definition looks like this:
<feature>
  <name>full names are similar</name>
  <enabled>true</enabled>
  <comments></comments>
  <collate>
    <direct>
      <field>first_name__v</field>
      <field>last_name__v</field>
      <nullMatching>STRICT</nullMatching>
      <jaroWinklerComparison>
        <usingWinklerExtention>false</usingWinklerExtention>
        <usingLargeStringTolerance>false</usingLargeStringTolerance>
        <threshold>0.80</threshold>
      </jaroWinklerComparison>
    </direct>
    <!-- look at middle name separately so its absence doesn't throw off the results -->
    <direct>
      <field>middle_name__v</field>
      <nullMatching>IGNORE</nullMatching>
      <jaroWinklerComparison>
        <usingWinklerExtention>true</usingWinklerExtention>
        <usingLargeStringTolerance>false</usingLargeStringTolerance>
        <threshold>0.80</threshold>
      </jaroWinklerComparison>
    </direct>
  </collate>
</feature>
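To review which null matching value applies to each field in a definition like this one, the XML can be read with a short script. The sketch below uses Python's standard `xml.etree.ElementTree`; the function name `null_matching_by_field` is hypothetical, the XML is a trimmed copy of the definition above, and treating a missing `<nullMatching>` element as `IGNORE` is an assumption based on Neutral being described as the default.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Trimmed copy of the feature definition shown above (comparison settings omitted).
FEATURE_XML = """
<feature>
  <name>full names are similar</name>
  <enabled>true</enabled>
  <collate>
    <direct>
      <field>first_name__v</field>
      <field>last_name__v</field>
      <nullMatching>STRICT</nullMatching>
    </direct>
    <direct>
      <field>middle_name__v</field>
      <nullMatching>IGNORE</nullMatching>
    </direct>
  </collate>
</feature>
"""


def null_matching_by_field(feature_xml: str) -> Dict[str, str]:
    """Map each field name to the nullMatching value of its <direct> block."""
    root = ET.fromstring(feature_xml)
    settings: Dict[str, str] = {}
    for direct in root.iter("direct"):
        # Assumption: a <direct> block without <nullMatching> uses the default.
        mode = direct.findtext("nullMatching", default="IGNORE")
        for field in direct.findall("field"):
            settings[field.text] = mode
    return settings


for name, mode in null_matching_by_field(FEATURE_XML).items():
    print(name, mode)
# first_name__v STRICT
# last_name__v STRICT
# middle_name__v IGNORE
```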
Null matching outcomes
The following table illustrates matching options and outcomes between fields where at least one of the values is null. Outcomes are one of the following:
- Match - "Yes" vote within the feature (denoted by a check mark)
- No match - "No" vote within the feature (denoted by an x)
- No impact - No vote; the field has no effect on the final match outcome for the complete feature (denoted by a -)
Record 1 | Record 2 | Match (wildcard) | Do not match (strict) | Neutral (ignore) | Only match if both are null (value) |
---|---|---|---|---|---|
Null | Null | ✓ | x | - | ✓ |
Value | Null | ✓ | x | - | x |
Null | Value | ✓ | x | - | x |
The following examples illustrate null matching outcomes where the match feature includes a first name, middle name, and last name.
Example 1
In this example, the complete feature fails the match rules because at least one component of the feature did not match.
In this case, the feature failed because the middle name did not match. Basing a match configuration on the middle name does not follow best practice; a less weighted field like this should use Match or Neutral. For middle name in particular, Match should be used when the match rule also matches only on first initial, using the substring option.
Field name | Record 1 | Record 2 | Result | Outcome by field |
---|---|---|---|---|
First Name | John | Null | - | The match rule used by the first name field will not have a vote in the overall outcome of this feature. |
Middle Name | Michael | Null | x | The match rule used by the middle name field will vote "No," meaning the middle name fields for these records do not match. |
Last Name | Null | Smith | ✓ | The match rule used by the last name field will vote "Yes," meaning the last name fields for these records match. |
Example 2
In this example, the complete feature passes the match rules because no component of the feature voted "No."
Classifying these records as a match does not follow best practices and should not be used as a match configuration. The last name, and to a lesser degree the first name fields, should both have values to be considered a match. In this example, record "John Michael <no last name>" is being considered a match to "<no first name> Michael Smith." Name fields are very important and should always contain values for records to be considered a match.
Field name | Record 1 | Record 2 | Result | Outcome by field |
---|---|---|---|---|
First Name | John | Null | ✓ | For this record, the match rule used by the first name field will vote "Yes," meaning the first name fields for these records match. |
Middle Name | Michael | Null | - | For this record, the match rule used by the middle name field will not have a vote in the overall outcome of this feature. |
Last Name | Null | Smith | ✓ | For this record, the match rule used by the last name field will vote "Yes," meaning the last name fields for these records match. |
Example 3
In this example, the complete feature fails the match rules because at least one component of the feature did not match.
Failing a match because the last name did not match while the other name fields did match is reasonable. However, be aware that for another pair of records in which the last name field is populated, these settings would consider "John Michael Smith" a confident match to "<no first name> Michael Smith." The first name field is important, and both the first name and last name fields should contain values for records to be considered a confident match.
Field name | Record 1 | Record 2 | Result | Outcome by field |
---|---|---|---|---|
First Name | John | Null | ✓ | For this record, the match rule used by the first name field will vote "Yes," meaning the first name fields for these records match. |
Middle Name | Michael | Null | - | For this record, the match rule used by the middle name field will not have a vote in the overall outcome of this feature. |
Last Name | Null | Smith | x | For this record, the match rule used by the last name field will vote "No," meaning the last name fields for these records do not match. |
Example 4
In this example, the complete feature fails the match rules because at least one component of the feature did not match.
Failing a match because first name and last name did not match due to null values is best practice. These are key fields that should always have values to be considered a confident match.
Field name | Record 1 | Record 2 | Result | Outcome by field |
---|---|---|---|---|
First Name | John | Null | x | For this record, the match rule used by the first name field will vote "No," meaning the first name fields for these records do not match. |
Middle Name | Michael | Null | - | For this record, the match rule used by the middle name field will not have a vote in the overall outcome of this feature. |
Last Name | Null | Smith | x | For this record, the match rule used by the last name field will vote "No," meaning the last name fields for these records do not match. |
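The four examples follow one pattern: a feature fails as soon as any field votes "No," and a field with no vote is left out of the decision. The Python sketch below restates that pattern under stated assumptions. The `feature_passes` function and the `EXAMPLES` data are hypothetical names, and the votes are taken from the "Outcome by field" descriptions in each example.

```python
from typing import Dict

# Per-field votes for each example: "yes", "no", or "-" (no vote).
EXAMPLES: Dict[str, Dict[str, str]] = {
    "Example 1": {"First Name": "-", "Middle Name": "no", "Last Name": "yes"},
    "Example 2": {"First Name": "yes", "Middle Name": "-", "Last Name": "yes"},
    "Example 3": {"First Name": "yes", "Middle Name": "-", "Last Name": "no"},
    "Example 4": {"First Name": "no", "Middle Name": "-", "Last Name": "no"},
}


def feature_passes(votes: Dict[str, str]) -> bool:
    """Return True when no field votes "no"; fields with "-" are ignored.

    Assumption: a feature in which every field abstains is not covered by the
    examples. This sketch would report it as passing, which may not reflect
    the product's behavior.
    """
    return all(vote != "no" for vote in votes.values())


for name, votes in EXAMPLES.items():
    print(name, "passes" if feature_passes(votes) else "fails")
# Example 1 fails
# Example 2 passes
# Example 3 fails
# Example 4 fails
```

Only Example 2 passes, which matches the text: it is the one configuration in which no field votes "No," and it is also the one the text advises against using.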