Appendix: 31 Common Field Combinations for Duplicate Detection
- Single-field tests (5 combinations).
- Combination 1: Find records with the same personal name. This test is too loose to provide useful results.
- Combination 2: Find records with the same company name. Will match all employees at a company, regardless of location. Sometimes useful for creating a list with no more than one record per business name.
- Combination 3: Find records with the same address. This test is too loose to provide useful results, since the same address can occur in different cities.
- Combination 4: Find records with the same city. This test is too loose to provide useful results.
- Combination 5: Find records with the same ZIP®. This test is too loose to provide useful results.
- Two-field tests (10 combinations).
- Combination 6: Find records with the same personal name and company name. Will match individual employees at a company, regardless of location (for example, an employee with a duplicate address record, or a separate mailing and shipping address).
- Combination 7: Find records with the same personal name and address. Will find duplicates, regardless of errors in the city-state-ZIP (which should be rare in an address-standardized file).
- Combination 8: Find records with the same personal name and city. Will find individuals with two different addresses in the same city.
- Combination 9: Find records with the same personal name and ZIP. Slightly more restrictive than Combination 8 when the city has multiple ZIP Codes.
- Combination 10: Find records with the same company name and address. Will match all employees in a company at one location.
- Combination 11: Find records with the same company name and city. Will find all employees and addresses for a company in one city.
- Combination 12: Find records with the same company name and ZIP. Slightly more restrictive than Combination 11 when the city has multiple ZIP Codes.
- Combination 13: Find records with the same address and city. Matches all individuals sharing an address, regardless of their name. Will return all employees in all companies at a business address, and all household occupants at a residential address.
- Combination 14: Find records with the same address and ZIP. Slightly more restrictive than Combination 13 when the city has multiple ZIP Codes.
- Combination 15: Find records with the same city and ZIP. This test is too loose to provide useful results.
- Three-field tests (10 combinations).
- Combination 16: Find records with the same personal name and company name and address. Finds all duplicate employee records at each address, regardless of errors in the city-state-ZIP (which should be rare in an address-standardized file).
- Combination 17: Find records with the same personal name and company name and city. Will match individual employees at a company in a city, regardless of address (for example, an employee with a duplicate address record, or a separate mailing and shipping address).
- Combination 18: Find records with the same personal name and company name and ZIP. Slightly more restrictive than Combination 17 when the city has multiple ZIP Codes.
- Combination 19: Find records with the same personal name and address and city. Will find duplicates, regardless of ZIP Code™ errors (which should be rare in an address-standardized file).
- Combination 20: Find records with the same personal name and address and ZIP. Will find duplicates, regardless of city errors (which should be rare in an address-standardized file).
- Combination 21: Find records with the same personal name and city and ZIP. Will find multiple addresses for the same person in one city. Using city and ZIP is redundant in an address-standardized file.
- Combination 22: Find records with the same company name and address and city. Will match all employees in a company at one location.
- Combination 23: Find records with the same company name and address and ZIP. Identical to Combination 22 in an address-standardized file.
- Combination 24: Find records with the same company name and city and ZIP. Will match all employees at a company in a city, regardless of address. Using city and ZIP is redundant in an address-standardized file.
- Combination 25: Find records with the same address and city and ZIP. Will find all individuals at an address, regardless of their name. Using city and ZIP is redundant in an address-standardized file.
- Four-field tests (5 combinations).
- Combination 26: Find records with the same personal name and company name and address and city. Will find duplicates, regardless of ZIP errors (which should be rare in an address-standardized file).
- Combination 27: Find records with the same personal name and company name and address and ZIP. Will find duplicates, regardless of city errors (which should be rare in an address-standardized file).
- Combination 28: Find records with the same personal name and company name and city and ZIP. Will match individual employees at a company in a city, regardless of address (for example, an employee with a duplicate address record, or a separate mailing and shipping address). Using city and ZIP is redundant in an address-standardized file.
- Combination 29: Find records with the same personal name and address and city and ZIP. Will find duplicate records for the same person at one address. Using city and ZIP is redundant in an address-standardized file.
- Combination 30: Find records with the same company name and address and city and ZIP. Will match all employees in a company at one location. Using city and ZIP is redundant in an address-standardized file.
- Five-field test (1 combination).
- Combination 31: Find records with the same personal name and company name and address and city and ZIP. Will find duplicate records for individual employees at a company at one address. Using city and ZIP is redundant in an address-standardized file.
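The 31 combinations above are every non-empty subset of the five fields: 5 + 10 + 10 + 5 + 1 = 31. The following Python sketch enumerates them in the same order as the numbered list and groups records that agree on a chosen combination. It is an illustration only, not the method any particular product uses: the field names (`name`, `company`, `address`, `city`, `zip`), the helper names `field_combinations` and `find_duplicates`, the sample records, and the choice of exact comparison after trimming and case-folding are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical field names; the appendix describes the fields only in prose.
FIELDS = ("name", "company", "address", "city", "zip")


def field_combinations(fields=FIELDS):
    """Return every non-empty subset of fields, smallest subsets first."""
    return [combo
            for size in range(1, len(fields) + 1)
            for combo in combinations(fields, size)]


def find_duplicates(records, combo):
    """Return groups of records that agree on every field in combo."""
    groups = defaultdict(list)
    for record in records:
        # Assumption: values match exactly after trimming and case-folding.
        key = tuple(record[field].strip().casefold() for field in combo)
        groups[key].append(record)
    return [group for group in groups.values() if len(group) > 1]


# Made-up sample data for illustration.
records = [
    {"name": "Ann Lee", "company": "Acme", "address": "1 Main St",
     "city": "Austin", "zip": "78701"},
    {"name": "ann lee", "company": "Acme", "address": "9 Oak Ave",
     "city": "Austin", "zip": "78704"},
    {"name": "Bob Roy", "company": "Acme", "address": "1 Main St",
     "city": "Austin", "zip": "78701"},
]

combos = field_combinations()
print(len(combos))   # 31
print(combos[5])     # ('name', 'company'), i.e. Combination 6

# Combination 6 groups the two Ann Lee records despite different addresses.
for group in find_duplicates(records, combos[5]):
    print([r["address"] for r in group])   # ['1 Main St', '9 Oak Ave']
```

With this field order, `combos[n - 1]` corresponds to Combination n in the list. On the sample data, Combination 10 (company name and address) would instead group Ann Lee and Bob Roy at 1 Main St, which shows how the choice of fields changes what counts as a duplicate.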
Copyright © 1996-2007 by Semaphore Corporation
Email help@semaphorecorp.com
Semaphore Corporation is a non-exclusive licensee of the United States Postal Service®.
The prices of Semaphore Corporation products are not established, controlled or approved by the Postal Service™.
The following trademarks are owned by the United States Postal Service: ZIP, ZIP Code, Postal Service, and United States Postal Service. [DA#4.07]