This post walks through the steps for setting up and tuning match rules in Informatica MDM.
Steps for Match Rules Setup and Tuning
Data analysis can be performed solely for the match process or as part of a broader data analysis effort. The data audit steps include:
Get Data : It is important to acquire real customer data from as many sources as possible.
Discuss With Users : Meet with business users to agree on high-level match requirements and rules. At this stage the rules are expressed in the language of the business. The distinct sources and the golden sources also need to be defined.
Data Investigation : The main purpose of data investigation is to determine which fields will contribute to the match process and to identify the match key.
Group Summaries : Run a simple analysis on the data to identify large groups of identical records. Determine whether or not to use exact match rules. Find good candidates for filters.
Data Quality Assessment : Determine the general quality of the data. Check for the following:
Data Completeness : Determine the degree to which a data column is complete.
Data Accuracy: Ensure that data is accurate. Confirm that fields contain the correct values.
Suspect Data: Look for irrelevant data and remove it if necessary.
Determine Population: Determine the type of match population to use.
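The group summary and completeness checks above can be sketched in a few lines of Python. This is only an illustration under assumptions: the sample records, the field names and the helper functions `completeness` and `large_groups` are hypothetical and are not part of Informatica MDM.

```python
from collections import Counter

# Hypothetical sample records; the field names are illustrative only.
records = [
    {"name": "ACME CORP", "city": "Austin", "tax_id": "11-111"},
    {"name": "ACME CORP", "city": "Austin", "tax_id": None},
    {"name": "ACME CORP", "city": "Austin", "tax_id": ""},
    {"name": "Globex", "city": None, "tax_id": "22-222"},
]

def completeness(rows, column):
    """Fraction of rows where the column holds a non-empty value."""
    if not rows:
        return 0.0
    filled = sum(1 for r in rows if r.get(column) not in (None, ""))
    return filled / len(rows)

def large_groups(rows, columns, min_size=2):
    """Groups of rows that share identical values in the given columns."""
    counts = Counter(tuple(r.get(c) for c in columns) for r in rows)
    return {key: n for key, n in counts.items() if n >= min_size}

completeness_by_column = {
    c: completeness(records, c) for c in ("name", "city", "tax_id")
}
identical_groups = large_groups(records, ("name", "city"))
```

A column with low completeness is usually a poor candidate for an exact match column, while a very large identical group may point to a useful filter or to placeholder values that should be cleansed first.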
Use results from the data audit step to set up cleanse functions to standardize the data. Use an address cleanse tool to standardize and clean the address information.
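As a rough idea of what a name standardization step does, here is a minimal Python sketch. It is not an Informatica cleanse function; the `SUFFIXES` map and the `standardize_name` helper are assumptions made up for illustration, and a real project would derive its rules from the data audit.

```python
import re

# Hypothetical abbreviation map; a real project would build this from the audit.
SUFFIXES = {
    "CORPORATION": "CORP",
    "INCORPORATED": "INC",
    "LIMITED": "LTD",
    "COMPANY": "CO",
}

def standardize_name(raw):
    """Upper-case, replace punctuation with spaces, collapse whitespace,
    and normalize common organization suffixes."""
    text = re.sub(r"[^\w\s]", " ", raw.upper())
    tokens = text.split()
    return " ".join(SUFFIXES.get(token, token) for token in tokens)
```

For example, `standardize_name("  Acme, Corporation. ")` and `standardize_name("ACME CORP")` both produce the same standardized value, so downstream matching sees one form of the name.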
Define Match Key
Set up the match key based on discussions with business users and the data audit. As a general rule, use the following as match keys:
Organization name - If the data contains organization names or both organization and individual names.
Person name - If the data contains individual names only.
Address part1 - If the data contains addresses only.
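The rule of thumb above can be written as a small decision function. This is only a sketch: `choose_match_key` is a hypothetical helper, and the strings it returns are the descriptive labels used in this post, not configuration values.

```python
def choose_match_key(has_org_names, has_person_names, has_addresses):
    """Apply the rule of thumb above and return a descriptive label."""
    if has_org_names:
        # Covers organization-only data and mixed organization/individual data.
        return "Organization name"
    if has_person_names:
        return "Person name"
    if has_addresses:
        return "Address part1"
    raise ValueError("no name or address data available for a match key")
```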
Set Up Match Rules
Set up draft match rules based on discussions with business users and the data audit.
Define the match levels, search levels and match purposes.
Once the match rule is created, run SQL commands to estimate the number of possible match candidates.
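One way to approximate such an estimate is to count how many record pairs share the same match key value: a group of n records with the same key yields n(n-1)/2 candidate pairs. The sketch below runs that SQL against an in-memory SQLite database. The table `match_keys` and its columns are hypothetical stand-ins; the actual table and column names in your MDM schema will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table holding one generated key value per record.
conn.execute("CREATE TABLE match_keys (rowid_object TEXT, key_value TEXT)")
conn.executemany(
    "INSERT INTO match_keys VALUES (?, ?)",
    [("1", "K1"), ("2", "K1"), ("3", "K1"),
     ("4", "K2"), ("5", "K2"),
     ("6", "K3")],
)

# Each group of n records sharing a key contributes n * (n - 1) / 2 pairs.
estimate_sql = """
    SELECT COALESCE(SUM(n * (n - 1) / 2), 0)
    FROM (SELECT COUNT(*) AS n FROM match_keys GROUP BY key_value)
"""
candidate_pairs = conn.execute(estimate_sql).fetchone()[0]
```

If the estimate is very large, a wider search level or a poorly chosen match key is a likely cause, and it is cheaper to find that out before running the match job.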
Name and Address Match Dry Run
It is advisable to run the name and address fuzzy matches before applying the full match rules. This makes it possible to identify suitable match levels based on name and address alone. Once the appropriate match levels are determined, include the unique identifiers to further qualify the matches.
Review Name and Address Match Results
Review the match results from the name and address match run.
Run a query on your MTCH table, grouped by match rule. The results give an initial view of where the matches fall and which match levels to use.
Review detailed match results by querying the MTCH table.
Adjust the rules if necessary and run again.
Make a copy of the MTCH table.
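The review steps above could look roughly like the following, shown here against an in-memory SQLite database so the sketch is runnable. The table name `c_party_mtch`, the snapshot name `c_party_mtch_run1` and the column names are assumptions for illustration; check the actual MTCH table of your base object before reusing the queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-in for a base object's MTCH table.
conn.execute(
    "CREATE TABLE c_party_mtch ("
    "rowid_object TEXT, rowid_object_matched TEXT, rowid_match_rule TEXT)"
)
conn.executemany(
    "INSERT INTO c_party_mtch VALUES (?, ?, ?)",
    [("1", "2", "RULE_1"), ("1", "3", "RULE_1"), ("4", "5", "RULE_2")],
)

# Initial view: number of matched pairs per match rule.
per_rule = conn.execute(
    "SELECT rowid_match_rule, COUNT(*) FROM c_party_mtch "
    "GROUP BY rowid_match_rule ORDER BY rowid_match_rule"
).fetchall()

# Keep a snapshot so the next run can be compared against this one.
conn.execute("CREATE TABLE c_party_mtch_run1 AS SELECT * FROM c_party_mtch")
snapshot_rows = conn.execute(
    "SELECT COUNT(*) FROM c_party_mtch_run1"
).fetchone()[0]
```

Keeping one snapshot per run makes it straightforward to see which pairs appeared or disappeared after a rule was adjusted.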
Set Up Match Rules with Exact Columns
Once the name and address match rules are finalized, define the final match rules. It is advisable to add the unique identifiers at this point. Include an exact column in the match rules.
Review Match Results
Repeat the tasks listed under Review Name and Address Match Results.