Deduplication
Deduplication is an art form. Even so, we apply as much science as possible to the process of identifying duplicates. To this end we have adapted the Guth-deBrou-Olson algorithm as a more effective alternative to traditional Soundex-based routines. By combining it with phonetic matching, limited fuzzy logic and a uniquely adaptable rule-based option, we have attempted to emulate human methods of identifying matches, giving you the best results possible.
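For reference, the traditional Soundex routine mentioned above can be sketched in a few lines of Python. This is the standard American Soundex coding, not Cygnus code, and it does not reproduce the adapted Guth-deBrou-Olson matching; the function name `soundex` is just an illustrative choice. It shows two well-known limits of Soundex: unrelated names can share a code, and names that differ only in their first letter never match.

```python
# Letter groups used by standard American Soundex.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name: str) -> str:
    """Return the four-character Soundex code for a name ('' if no letters)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = [letters[0]]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not break a run of identical codes
        code = _CODES.get(ch, "")
        if code and code != prev:
            result.append(code)
        prev = code  # vowels reset the run, so a repeated code is kept
    return ("".join(result) + "000")[:4]


# Different names, same code: a possible false match.
print(soundex("Robert"), soundex("Rupert"))        # R163 R163
# Same name, different first letter: a missed match.
print(soundex("Catherine"), soundex("Katherine"))  # C365 K365
```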
The fundamental logic of the Cygnus dedupe cannot be altered. However, we recognise that data-sets vary in their composition and that the needs and criteria of a dedupe vary too.
We have therefore adopted a quasi-neural option that enables users to vary match rules within the dedupe to cater for specific requirements.
Effectively, the process can be "trained" to optimise results. Users are increasingly prepared to enhance their dedupe on a record-by-record basis. Cygnus' dedupe module enables users to browse dupe groups and close-match groups and, with the click of a button, to promote or demote records from each group. For large data files, filters can be set to exclude percentages of dupes, maximising the efficiency and effect of the dedupe.
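The idea of varying match rules can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the `MatchRule` class, the field names, the weights and the two thresholds are assumptions made for illustration and do not describe how Cygnus stores or applies its rules. It only shows how the same pair of records can be classed as a close match under one rule set and as a dupe under another.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class MatchRule:
    field: str     # record field to compare (hypothetical names)
    weight: float  # how much agreement on this field contributes


def normalise(value: str) -> str:
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(value.lower().split())


def score(a: Dict[str, str], b: Dict[str, str], rules: Sequence[MatchRule]) -> float:
    """Weighted share of rules on which two records agree (0.0 to 1.0)."""
    total = sum(rule.weight for rule in rules)
    if total == 0:
        return 0.0
    agreed = 0.0
    for rule in rules:
        left = normalise(a.get(rule.field, ""))
        right = normalise(b.get(rule.field, ""))
        if left and left == right:  # empty fields never count as agreement
            agreed += rule.weight
    return agreed / total


def classify(a, b, rules, dupe_at=0.9, close_at=0.6) -> str:
    """Assumed thresholds: at or above dupe_at is a dupe, at or above close_at a close match."""
    s = score(a, b, rules)
    if s >= dupe_at:
        return "dupe"
    if s >= close_at:
        return "close match"
    return "distinct"


rec_a = {"surname": "Smith", "forename": "John",
         "address": "1 High Street", "postcode": "AB1 2CD"}
rec_b = {"surname": "SMITH", "forename": "Jon",
         "address": "1 High Street", "postcode": "ab1  2cd"}

# Rule set 1: every field counts equally, so the forename difference matters.
individual = [MatchRule("surname", 1), MatchRule("forename", 1),
              MatchRule("address", 1), MatchRule("postcode", 1)]
# Rule set 2: forename is ignored, as one might for a household-level dedupe.
household = [MatchRule("surname", 2), MatchRule("address", 1),
             MatchRule("postcode", 1)]

print(classify(rec_a, rec_b, individual))  # close match
print(classify(rec_a, rec_b, household))   # dupe
```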
Reports are essential to fully understanding the effect of your dedupe and suppression processes. Cygnus automatically generates:
- Processing/Match Matrix
- List Performance Matrix
- Match and Mis-match groups
- Samples of dupes and suppressions
- Samples of close match records
- Net names reports
A key element is the ability to re-run a dedupe in ultra-quick time, usually less than 10% of the original run time, whether you've decided to amend the hierarchy or simply change the dedupe criteria from surname level to address level.
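To make the difference between surname-level and address-level criteria concrete, here is a small Python sketch that groups the same records under each level. The field names, the `LEVELS` table and the exact-match keys are assumptions for illustration only. The sketch says nothing about how Cygnus achieves its re-run speed, and real matching would be fuzzier than the exact keys used here.

```python
from collections import defaultdict

# Hypothetical definition of which fields must agree at each dedupe level.
LEVELS = {
    "surname": ("surname", "address", "postcode"),
    "address": ("address", "postcode"),
}


def normalise(value: str) -> str:
    """Lower-case and collapse whitespace before comparing."""
    return " ".join(value.lower().split())


def dupe_groups(records, level):
    """Return lists of record ids that share a match key at the given level."""
    fields = LEVELS[level]
    groups = defaultdict(list)
    for rec in records:
        key = tuple(normalise(rec[f]) for f in fields)
        groups[key].append(rec["id"])
    # Only keys shared by two or more records form a dupe group.
    return [ids for ids in groups.values() if len(ids) > 1]


records = [
    {"id": 1, "surname": "Smith", "address": "1 High Street", "postcode": "AB1 2CD"},
    {"id": 2, "surname": "smith", "address": "1 high street", "postcode": "ab1 2cd"},
    {"id": 3, "surname": "Jones", "address": "1 High Street", "postcode": "AB1 2CD"},
    {"id": 4, "surname": "Brown", "address": "9 Low Road", "postcode": "ZZ9 9ZZ"},
]

print(dupe_groups(records, "surname"))  # [[1, 2]]
print(dupe_groups(records, "address"))  # [[1, 2, 3]]
```

Moving from surname level to address level widens the group: the Jones record joins the two Smith records because only the address and postcode have to agree.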
MPS, Mortascreen, TBR, GAS, USS and other standard industry suppression files (including B2B) are available within Cygnus on payment of the appropriate licence fees to the data owners or via the ES module.
Find out more and arrange a demo >>

