Start With Clean Data

How to ensure you have data integrity

There is no sense in doing data analysis without first cleaning your data and checking it for accuracy and completeness. Skip this step and you compromise the integrity of your analysis and the reliability of your results.

Here are the basic requirements for a data audit:

  • Consolidation: Are there other sources of data that need to be included?
  • Cleanse: Does the data contain inaccurate entries?
  • Dedup: Does the list contain redundancies?
  • Flag: Does the data contain 'special' records that need to be flagged or removed?

The data cleaning workflow

Each of these issues presents a challenge, and they must be dealt with in sequence because each step affects the accuracy of the next. The order of the list above is therefore not accidental; it points to a progressive workflow that ensures the best possible results.

Consolidate your data first

When doing a comprehensive data audit or analysis, you want to take inventory of all the data that could enhance your analysis. If your project has a narrower scope, include only the relevant data sources and fields. Assuming you are doing a comprehensive analysis, follow these steps to consolidate your data:

  1. List all the disparate databases your organization uses (e.g. Online, Raiser's Edge, Events, Lottery, Email List...)
  2. Include deceased donors, inactive donors, donors with contact restrictions, and donors who have moved or have wrong addresses.
  3. Map the fields and identify the source of each (e.g. Email.RE, Email.Online).
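
As a rough illustration of step 3, here is a minimal Python sketch of consolidating two extracts while keeping the source in each field name. Everything in it is an assumption made for the example: the sample records, the field names, and above all the shared donor_id key, which many organizations will not have across systems.

```python
from collections import defaultdict

# Hypothetical extracts from two systems; all values are made up.
raisers_edge = [
    {"donor_id": "1001", "email": "jadams@example.org", "status": "active"},
    {"donor_id": "1002", "email": "", "status": "deceased"},
]
online = [
    {"donor_id": "1001", "email": "janice.adams@example.org"},
]


def consolidate(sources, key="donor_id"):
    """Combine records from several sources into one row per donor.

    Assumes every source shares the same key field (a hypothetical donor_id).
    """
    combined = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            row = combined[record[key]]
            row[key] = record[key]
            for field, value in record.items():
                if field != key:
                    # Keep the source in the field name, e.g. "email.RE".
                    row[f"{field}.{source_name}"] = value
    return dict(combined)


consolidated = consolidate({"RE": raisers_edge, "Online": online})
```

Note that the deceased donor stays in the consolidated data, in line with step 2: nothing is removed at this stage.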

Done! Now you are ready for the data cleanse.

Cleaning your consolidated data

If you don't want to clean your data twice, don't try to de-duplicate until you have cleaned it properly. You also don't want to remove any records until the cleaning is done.

You should clean your database at least once or twice a year, and sometimes as often as before every major mailing. These are some of the key cleaning tasks that bring the most benefit:

  1. Address validation: This standardizes how you record donor addresses and aligns them with the format used by your country's postal service.
  2. NCOA or National Change of Address: When donors move and file a change of address with the postal service, their new address is added to this database, which allows you to update your records.
  3. Add missing unit #s: Add missing suite or apartment numbers, especially in multi-unit buildings.
  4. Identify deceased: Your postal service may also let you know if a donor has recently passed away, which helps you keep your donor information up to date.
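
To give a feel for what standardizing addresses means, here is a small Python sketch. It is only an illustration: the abbreviation table is a made-up subset, and real address validation and NCOA processing rely on postal service data that simple rules like these cannot replace.

```python
import re

# Illustrative abbreviations only; a real postal standard defines many more.
SUFFIXES = {"street": "ST", "avenue": "AVE", "road": "RD", "drive": "DR"}


def standardize_address(address):
    """Uppercase an address, drop periods and commas, abbreviate suffixes."""
    words = re.sub(r"[.,]", " ", address).split()
    return " ".join(SUFFIXES.get(word.lower(), word.upper()) for word in words)


# Two spellings of the same (hypothetical) address become identical.
a = standardize_address("12 Main Street, Apt. 4")
b = standardize_address("12 main st apt 4")
```

Once both spellings reduce to the same string, they can be compared directly in the de-duplication stage.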

With this step complete, you can start the second part of your data cleaning: de-duping and flagging records for removal.

The de-duplication stage

Now that your data has been standardized, it is easy to compare and work with. In this phase you want to sort the data so that matching records sit next to each other and any duplication becomes visible.

Typical fields to check for duplication include:

  1. Name and address
  2. Email or telephone

Merging records where all the fields are identical is a foregone conclusion.

But let's say you have three records with the same address and the same last name, but the first names are all different, like this:

  1. J. Adams
  2. Janice Adams
  3. John Adams

In this case, you can't assume that any of these records refer to the same person, nor that they are all different people. You also can't assume the nature of the relationship between the three records. Identifying and flagging cases like these allows you to implement a strategy to clean your data and take steps to prevent duplication at the source.
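
One way to separate the easy cases from the uncertain ones is sketched below in Python. It groups records by address and last name, then treats a group as an exact duplicate only when the first names also match. The records, the address, and the field names are hypothetical, and the rule is deliberately simple: it flags the Adams records for review instead of guessing.

```python
from collections import defaultdict

# Hypothetical records; addresses are assumed to be standardized already.
records = [
    {"id": 1, "first": "J.", "last": "Adams", "address": "12 MAIN ST"},
    {"id": 2, "first": "Janice", "last": "Adams", "address": "12 MAIN ST"},
    {"id": 3, "first": "John", "last": "Adams", "address": "12 MAIN ST"},
    {"id": 4, "first": "Mary", "last": "Lee", "address": "9 OAK AVE"},
    {"id": 5, "first": "Mary", "last": "Lee", "address": "9 OAK AVE"},
]


def find_duplicates(records):
    """Return (exact, suspected) lists of record-id groups."""
    groups = defaultdict(list)
    for record in records:
        groups[(record["address"], record["last"].lower())].append(record)

    exact, suspected = [], []
    for group in groups.values():
        if len(group) < 2:
            continue  # nothing to compare against
        first_names = {record["first"].lower() for record in group}
        ids = [record["id"] for record in group]
        if len(first_names) == 1:
            exact.append(ids)      # every compared field matches: safe to merge
        else:
            suspected.append(ids)  # needs a human decision
    return exact, suspected


exact, suspected = find_duplicates(records)
```

Here the two Mary Lee records come back as an exact duplicate, while the three Adams records come back as one suspected group.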

Merging and purging your data

Now that you have identified possible duplicates, your data is ready for the merge/purge. In this stage, the records are flagged and grouped into the following categories:

  1. Records that don't need any changes
  2. Duplicate records that require a merge
  3. Suspected duplicates + field to check (e.g. first name)
  4. Same address - different last names

In addition, you should have a list of all the records that were marked as deceased, do not mail, or NCOA.
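
The four categories can be assigned mechanically once the data is standardized. The Python sketch below is one possible way to do it, not a definitive rule set: the sample records and field names are hypothetical, and it compares only address, last name, and first name.

```python
from collections import defaultdict

# Hypothetical records with standardized addresses.
donors = [
    {"id": 1, "first": "Mary", "last": "Lee", "address": "9 OAK AVE"},
    {"id": 2, "first": "Mary", "last": "Lee", "address": "9 OAK AVE"},
    {"id": 3, "first": "J.", "last": "Adams", "address": "12 MAIN ST"},
    {"id": 4, "first": "Janice", "last": "Adams", "address": "12 MAIN ST"},
    {"id": 5, "first": "Sam", "last": "Cole", "address": "40 ELM RD"},
    {"id": 6, "first": "Ana", "last": "Diaz", "address": "40 ELM RD"},
    {"id": 7, "first": "Tom", "last": "Ng", "address": "7 PINE DR"},
]


def categorize(records):
    """Return {record id: category 1-4}, using the four categories above."""
    by_address = defaultdict(list)
    for record in records:
        by_address[record["address"]].append(record)

    categories = {}
    for group in by_address.values():
        for record in group:
            others = [o for o in group if o is not record]
            same_last = [o for o in others if o["last"] == record["last"]]
            if any(o["first"] == record["first"] for o in same_last):
                categories[record["id"]] = 2  # duplicate, requires merge
            elif same_last:
                categories[record["id"]] = 3  # suspected duplicate: check first name
            elif others:
                categories[record["id"]] = 4  # same address, different last names
            else:
                categories[record["id"]] = 1  # no changes needed
    return categories


categories = categorize(donors)
```

Only category 2 is a candidate for automatic merging; categories 3 and 4 are flagged for someone to review.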

And now you are in a much better place and ready to conduct your analysis or your communications campaign.


The benefits of a data audit are plenty. Above all, your database is your organization's lifeblood. An unreliable database can severely hamper your fundraising efforts, causing delays and, worse, diminishing your reputation as a competent and organized charity. Renewing addresses and updating donor information can also help you recover lost supporters and increase your revenue significantly.

Let's Work Together

If you are looking to raise more money and you are interested in learning more about cleaning your data, don't hesitate to connect with me.