Implementation of Matching Tree Technique for Online Record Linkage

The task of finding records that refer to the same entity across heterogeneous data sources is known as record linkage. The need to consolidate information located in different data sources has been widely documented in recent years. To achieve this goal, an organization must solve several types of problems. The entity heterogeneity problem arises when the same real-world entity is identified by different identifiers in different sources. Statistical record linkage techniques can be used to solve this problem. However, the use of such techniques for online record linkage could pose a tremendous communication bottleneck in a distributed environment (where the entity heterogeneity problem is often encountered). To resolve this issue, we develop a matching tree, similar to a decision tree, and use it to propose techniques that reduce the communication overhead significantly while providing matching decisions that are guaranteed to be the same as those obtained using the conventional linkage technique. These techniques have been implemented, and experiments with real-world and synthetic databases show a significant reduction in communication overhead.
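The idea of reaching the same decision with less communication can be illustrated with a minimal sketch. This is not the authors' matching tree; it is a simplified stand-in that fetches remote attributes one at a time and stops as soon as the remaining attributes can no longer change the outcome of a weighted-score linkage rule. The attribute names, the agreement and disagreement weights, the threshold, and the `fetch` callback are all hypothetical, chosen only for illustration.

```python
from typing import Callable, Dict, Tuple

# Hypothetical per-attribute weights: (score if values agree, score if they disagree).
WEIGHTS: Dict[str, Tuple[float, float]] = {
    "last_name": (4.0, -3.0),
    "birth_year": (3.0, -2.0),
    "zip": (2.0, -1.0),
    "first_name": (1.0, -1.0),
}
THRESHOLD = 3.0  # hypothetical rule: total score >= THRESHOLD means "match"


def full_decision(local: Dict[str, object], remote: Dict[str, object]) -> bool:
    """Conventional approach: transfer every remote attribute, then decide."""
    score = sum(
        agree if local[attr] == remote[attr] else disagree
        for attr, (agree, disagree) in WEIGHTS.items()
    )
    return score >= THRESHOLD


def sequential_decision(
    local: Dict[str, object], fetch: Callable[[str], object]
) -> Tuple[bool, int]:
    """Fetch remote attributes one by one; stop once the decision is fixed.

    Returns (decision, number of attributes actually fetched).
    """
    # Ask for the most decisive attributes first (largest agree/disagree gap).
    order = sorted(WEIGHTS, key=lambda a: WEIGHTS[a][0] - WEIGHTS[a][1], reverse=True)
    score = 0.0
    best_rest = sum(WEIGHTS[a][0] for a in order)   # all remaining attributes agree
    worst_rest = sum(WEIGHTS[a][1] for a in order)  # all remaining attributes disagree
    fetched = 0
    for attr in order:
        if score + worst_rest >= THRESHOLD:
            return True, fetched   # a match even in the worst case
        if score + best_rest < THRESHOLD:
            return False, fetched  # a non-match even in the best case
        agree, disagree = WEIGHTS[attr]
        best_rest -= agree
        worst_rest -= disagree
        score += agree if local[attr] == fetch(attr) else disagree
        fetched += 1
    return score >= THRESHOLD, fetched


local_rec = {"last_name": "Smith", "birth_year": 1980, "zip": "75080", "first_name": "Ann"}
remote_rec = {"last_name": "Jones", "birth_year": 1975, "zip": "75080", "first_name": "Ann"}

# Only two of the four remote attributes are requested before the outcome is fixed.
print(sequential_decision(local_rec, remote_rec.__getitem__))  # (False, 2)
print(full_decision(local_rec, remote_rec))                    # False
```

Because the early exit happens only when the best-case or worst-case bound already settles the comparison against the threshold, the sequential decision always coincides with the full comparison under this scoring rule; the saving comes from the attributes that never need to be transferred.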
