Mass De-Duplication using CiviCRM

Parvez Saleh

Leukaemia & Lymphoma Research have been live with CiviCRM since 2011. Duplicates were an issue in the data brought into CiviCRM, and web-enabling the database has meant that the problem needs to be resolved to ensure accurate reporting of supporter trends.

Initial investigations using core functionality caused the server to fall over. Over the past months Leukaemia & Lymphoma Research have been working with Veda NFP Consulting to write custom rules that are more efficient, along with the ability to use them from within the UI. So far the matching is fairly strict rather than fuzzy.

What we've produced lets the user apply a combination of de-dupe rules to determine whether contacts are duplicates, and then attempts to merge them according to a set of merge rules.
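
To illustrate what a strict (non-fuzzy) rule can look like, here is a minimal Python sketch. It is not the extension's code: the function name, the contact dictionaries and the choice of fields are assumptions made for the example. Contacts are grouped only when every chosen field matches exactly after trimming whitespace and lower-casing.

```python
from collections import defaultdict


def find_duplicate_groups(contacts, fields=("first_name", "last_name", "email")):
    """Group contact ids whose chosen fields match exactly (hypothetical strict rule)."""
    groups = defaultdict(list)
    for contact in contacts:
        # Normalise each field so trivial differences in case or spacing still match.
        key = tuple((contact.get(f) or "").strip().lower() for f in fields)
        if all(key):  # skip contacts that are missing any of the fields
            groups[key].append(contact["id"])
    # Only keys shared by more than one contact are duplicate candidates.
    return [ids for ids in groups.values() if len(ids) > 1]


# Illustrative data only.
contacts = [
    {"id": 1, "first_name": "Ann", "last_name": "Lee", "email": "ann@example.org"},
    {"id": 2, "first_name": "ann ", "last_name": "LEE", "email": "Ann@Example.org"},
    {"id": 3, "first_name": "Anne", "last_name": "Lee", "email": "ann@example.org"},
]
duplicate_groups = find_duplicate_groups(contacts)
```

Here contacts 1 and 2 are grouped, while contact 3 is left alone because "Anne" is not an exact match. A fuzzy rule would catch it, at the cost of more comparisons and more false positives.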

Examples of merge rules include:

  • Keep the contact with the greatest number of contributions as the master record
  • Do not merge records that each have a Drupal account
  • Do not merge contacts with conflicting comms preferences
  • Keep the newest address
  • Keep the newest email
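
The merge rules above can be sketched as a single decision function for a pair of contacts. This is a hedged illustration in Python, not the extension's implementation: the `Contact` fields, the `plan_merge` and `newest` names, and the use of one opt-out flag to stand in for comms preferences are all assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Contact:
    # Hypothetical, simplified view of a contact record.
    id: int
    contribution_count: int
    has_drupal_account: bool
    comms_opt_out: bool
    email: Optional[str] = None
    email_updated: Optional[date] = None
    address: Optional[str] = None
    address_updated: Optional[date] = None


def newest(pairs):
    """Return the value with the latest date among (value, date) pairs, ignoring empty values."""
    dated = [(d or date.min, v) for v, d in pairs if v]
    return max(dated, key=lambda p: p[0])[1] if dated else None


def plan_merge(a, b):
    """Return a merge plan for two duplicate contacts, or None if a rule blocks the merge."""
    if a.has_drupal_account and b.has_drupal_account:
        return None  # both records have a Drupal account
    if a.comms_opt_out != b.comms_opt_out:
        return None  # conflicting comms preferences
    # The contact with the most contributions becomes the master record.
    master, duplicate = (a, b) if a.contribution_count >= b.contribution_count else (b, a)
    return {
        "master_id": master.id,
        "duplicate_id": duplicate.id,
        "email": newest([(a.email, a.email_updated), (b.email, b.email_updated)]),
        "address": newest([(a.address, a.address_updated), (b.address, b.address_updated)]),
    }


# Illustrative data only.
a = Contact(1, 5, True, False, "old@example.org", date(2012, 1, 1), "1 High St", date(2013, 6, 1))
b = Contact(2, 2, False, False, "new@example.org", date(2013, 3, 1), "2 Low Rd", date(2011, 1, 1))
plan = plan_merge(a, b)
```

In this example contact 1 is kept as the master because it has more contributions, but the merged record takes the newer email from contact 2 and keeps the newer address from contact 1. Pairs that a rule blocks return `None` and would be left for manual review.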

In this session we'll go through the setup of the extension and demonstrate how it can be used to help manage duplicates in large databases.


Mass de-duplication with CiviCRM, exploring an extension that aims to tackle common issues....
