Thursday, October 15, 2009

Duplicated Data: A Distressing Dilemma!

You’re frazzled yet again. You have just received an e-mail from one of your organization’s most significant donors, asking why they keep receiving identical pieces of mail from you. This is not the first time this has happened, and you’re sick and tired of the embarrassment. You’re also extremely annoyed because no matter how many times you view a donor’s record in your database, you cannot seem to grasp the big picture of that donor. Consequently, it’s been difficult to tailor your fundraising efforts effectively and build better relationships.

If you have ever been responsible for managing a database, then chances are that the above scenarios sound all too familiar. Data duplication has likely plagued your organization and prevented it from operating as efficiently as possible. In order to reach your maximum fundraising potential and build stronger relationships with your constituent base, your duplicate records need to be eliminated! RESolutionsTECH Inc. can help, but first, you should understand more about this pesky problem.

Duplicate records can be defined as either individuals or organizations that have multiple records in your database. Other instances of potential duplication include households, teams, gifts, event registrations, and educational relationships.

The lifecycle of duplicate data can begin in a few different ways. Duplication can occur when adding a single record or importing multiple records into your database. It can also occur when adding records directly to a backend database, such as MS SQL. Furthermore, duplicate records may already exist in your database without you or your staff even being aware of them. Data duplication can also stem from various sources. Collecting data from web forms, such as online giving or event registration, is one. Another is maintaining the same data in multiple locations, such as e-mail lists or an Online Event Registration System alongside Raiser’s Edge. Inconsistent training methods also make duplicated records in your database more likely.

To effectively eliminate duplicated records, they must first be identified, then confirmed and finally removed through merging.

Solutions

At this point, you may be asking yourself, “What can be done to prevent duplicated data from reoccurring in my database?” Thankfully, we are data experts and have a wide variety of solutions.

RESolutionsTECH Inc. works closely with our clients to establish systems that keep data as current as possible. An example of this would be automated e-mails asking constituents to update their own records in your database. We also believe that organizations should increase the safeguards against incorrect data entry so that fields such as postal and zip codes are entered in a standardized and consistent format. Organizations should utilize different levels of duplicate identification criteria. For example, first-level criteria could be Unique Identification Numbers, Date of Birth, SIN/SSN, E-mail, Postal/Zip Code, and Unique Username/Password, while second-level criteria could be First Name, Last Name, Nicknames, Address Lines, and Old Addresses. Making use of nicknames is very effective in handling abbreviated versions of names. For example, a record might use “Rob” instead of “Robert”, or “Sue” instead of “Susan”.
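To make the two levels of criteria concrete, here is a minimal Python sketch. It is only an illustration, not how Raiser’s Edge or any other product works. The field names (first, last, email, postal), the tiny nickname table, and the choice of e-mail plus postal/zip code as the first-level test are all assumptions made for the example.

```python
# Hypothetical nickname table; a real one would be far larger.
NICKNAMES = {"rob": "robert", "bob": "robert", "sue": "susan"}


def canonical_first_name(name):
    """Lower-case a first name and expand a known nickname."""
    key = name.strip().lower()
    return NICKNAMES.get(key, key)


def normalize_postal(code):
    """Drop spaces and hyphens so 'k1a 0b1' and 'K1A0B1' compare equal."""
    return code.replace(" ", "").replace("-", "").upper()


def match_level(a, b):
    """Return 1 for a first-level match, 2 for second-level only, else 0."""
    # First level (assumed here): same non-empty e-mail and same postal/zip code.
    email_a = a["email"].strip().lower()
    email_b = b["email"].strip().lower()
    same_email = bool(email_a) and email_a == email_b
    same_postal = normalize_postal(a["postal"]) == normalize_postal(b["postal"])
    if same_email and same_postal:
        return 1
    # Second level: name-based criteria, with nicknames expanded first.
    same_first = canonical_first_name(a["first"]) == canonical_first_name(b["first"])
    same_last = a["last"].strip().lower() == b["last"].strip().lower()
    if same_first and same_last:
        return 2
    return 0
```

A first-level result is a strong candidate for merging, while a second-level result is better treated as a "possible duplicate" for a staff member to review before anything is merged.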

Increased training for all staff members is important to maintain data entry standards based on defined and documented processes. Without proper training, your staff may not know how to conduct searches to see if certain records already exist in the database. We suggest that you provide data entry confirmation prompts that ask people, when inputting data, to verify the spelling of the information they are providing. Without proper verification, a last name of “Carter” could mistakenly be entered as “Crter” in a web form. To avoid confusion, your organization could prevent alternate addresses from being used by the same person, unless they are properly defined as such. For example, someone registering for an event could use his/her business address and then later, when giving a gift online, provide his/her home address instead.

Fuzzy identification is extremely beneficial in finding duplicates across the different levels of matching criteria. With fuzzy identification, duplicates will be identified even where there was a typo, misspelled word, or something similar. In other words, the criteria do not have to match word for word or letter by letter. “Rob” and “Robin” in the First Name field of a database would return a match, whereas matching only on identical field values would not.
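As a rough illustration of fuzzy identification, the Python sketch below scores how similar two values are using the standard library’s difflib module. This is one simple technique chosen for the example, not necessarily what any particular de-duplication tool uses, and the 0.7 threshold is an assumption that you would tune against your own data.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a score from 0.0 (nothing in common) to 1.0 (identical)."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def is_fuzzy_match(a, b, threshold=0.7):
    """Treat two values as a possible match when they are similar enough."""
    # The threshold is an assumed starting point, not a recommended standard.
    return similarity(a, b) >= threshold


print(is_fuzzy_match("Carter", "Crter"))  # a typo in a last name
print(is_fuzzy_match("Rob", "Robin"))     # a shortened first name
print(is_fuzzy_match("Carter", "Smith"))  # unrelated names
```

A lower threshold catches more typos but also flags more false matches, which is one reason identified duplicates should still be confirmed before they are merged.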

Constituent records in databases should be assigned unique identification numbers (IDs) that can be shared between multiple databases. Your organization should also track the sources of all data so you can be aware of the fact that, for example, a constituent’s address was submitted using a web form, an e-mail, the telephone, etc. It is a good idea to maintain a log of all changes that are made to each constituent’s record. This provides you with an audit trail of who changed which data, when this change took place, and what was changed.
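A change log of this kind can be sketched in a few lines of Python. The record layout, the field names, and the update_field helper below are hypothetical and exist only to show the idea of recording who changed what, when, and from which source.

```python
from datetime import datetime, timezone


def update_field(record, field, new_value, changed_by, source, audit_log):
    """Change one field and append an audit entry describing the change."""
    old_value = record.get(field)
    if old_value == new_value:
        return  # nothing changed, so there is nothing to log
    record[field] = new_value
    audit_log.append({
        "constituent_id": record["constituent_id"],  # shared unique ID
        "field": field,                               # what was changed
        "old": old_value,
        "new": new_value,
        "changed_by": changed_by,                     # who changed it
        "source": source,                             # web form, e-mail, telephone...
        "changed_at": datetime.now(timezone.utc).isoformat(),  # when
    })


audit_log = []
donor = {"constituent_id": "C-1001", "address": "1 Main St"}
update_field(donor, "address", "22 Oak Ave", "jsmith", "web form", audit_log)
```

In a real system the log would live in a database table rather than a Python list, but the same four questions (who, what, when, and from where) apply.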

Your organization should define the master of each data element (e.g. Donor Management System, Student Information System, Patient Information System, etc.). An example of a data element would be the Gift Amount, where it is important to know in which database to manually change or adjust the amount if needed. Raiser’s Edge, or whichever Donor Management System you are using, is typically the master of this data element. This is important to be aware of because the master of data controls any changes required. Manual data entry adjustments should not typically be made in non-master locations; those locations should instead be updated automatically or through an import process. This process enforces a single point of entry and ensures all shared data is in sync.

Address Accuracy processing, which RESolutionsTECH Inc. can perform for you as often as you need, will clean up and standardize addresses, find national change-of-address updates, identify individual and household duplicates, and confirm or add new phone numbers. Other de-duplication tools such as Excel, Access, scripts, Fuzzy Dupes, AbleBits, and duplicate identification reporting (e.g. Raiser’s Edge Duplicate Information Reporting) should also be considered. Furthermore, it is beneficial to use duplicate merge utilities such as Blackbaud plug-ins, third-party duplicate merge utilities, SalesForce Application Exchange add-ins, etc.

Client Stories

With the help of RESolutionsTECH Inc., the following organizations have all benefitted tremendously, in different ways, from our assistance in eliminating and preventing data duplication.

Ovarian Cancer Canada utilizes an online tool to collect event registration fees and gifts from donors. They make regular transfers to their Raiser’s Edge database. The main concern here was that constituents were getting entered as duplicate records into the database when data was moved from their online system to Raiser’s Edge. To correct the process for all future donations and event registrations, we standardized addresses and identified individual and household duplicates. We then used a tool called My Data Mover to transform the data, confirm duplicate matches, and import data into Raiser’s Edge. What made this process so successful was the option of visually reviewing identified duplicates to confirm them, combined with the use of first- and second-level duplicate criteria. For all past duplicates created, RESolutionsTECH Inc. consolidated all of their data, from several different locations, into one central and fully-integrated Raiser’s Edge database. According to Abby Heidebrecht, the Donor Relations Coordinator at Ovarian Cancer Canada, “We struggled for a year with our database before we found Robin who came in, cleaned up the database and set it straight.”

At St. Lawrence College, graduate data is moved from the Student Information System to their Raiser’s Edge database a few times per year. In this situation, RESolutionsTECH Inc. helped determine that the Student Information System is the master of data in terms of who graduated and which program they graduated from. Therefore, the Alumni Information System should always synchronize to match the Student Information System for graduates and educational relationships. If duplicates are identified in Raiser’s Edge, this means the duplication exists in the Student Information System, so they are brought to the attention of those who manage the Student Information System to clean up. Those clean-ups then trickle down to the Raiser’s Edge database, the non-master. This ensures a single point of entry and maintenance for this type of data and provides accurate reporting from both databases.

Alliance for the Great Lakes utilizes multiple databases, with their main one being Raiser’s Edge. In certain situations, such as this one, we find it worthwhile to actually bring duplicates into the database first and then handle duplicate clean-up within Raiser’s Edge. We do this to ensure we have a complete history of all the data before the merging begins. This also gives us an audit trail, which provides a clear picture of how the duplicated data was handled. We also made sure that Alliance for the Great Lakes had a system implemented to ensure the proper master of data was kept as such. The bottom line for any client is that their needs and requirements are met first; we at RESolutionsTECH can come in at any point and clean things up. We continue to work with Alliance for the Great Lakes to improve the process.

Buy-A-Net is a fairly new user of Raiser’s Edge and relies heavily on volunteer staff to maintain their database. They recently came to us asking why there appeared to be many duplicate records in the database when they reviewed the output results of their Query. This is a common problem that we see at many organizations, and it can be prevented with proper training. The reality here is that Buy-A-Net didn’t have any duplicates at all. The same records were coming up multiple times because data was being output directly in Query rather than in an Export based on a Query. In a Query, it is best practice to output only a unique identifier, such as Constituent ID or Import ID, and then sort on that field. You will quickly be able to scan the results to ensure there is no duplication in your result.
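The same check can be run outside the database on any list of output rows. The Python sketch below is hypothetical. It assumes the repeated rows come from an output field that can hold several values per constituent (a gift date in this made-up sample), and the column names are invented for the example.

```python
def summarize(rows, id_field="constituent_id"):
    """Compare the number of output rows with the number of distinct IDs."""
    ids = sorted({row[id_field] for row in rows})  # unique IDs, sorted for scanning
    return {"rows": len(rows), "unique_constituents": len(ids), "ids": ids}


# Made-up query output: constituent C-1001 appears twice because of two gifts.
rows = [
    {"constituent_id": "C-1002", "gift_date": "2009-06-15"},
    {"constituent_id": "C-1001", "gift_date": "2009-03-01"},
    {"constituent_id": "C-1001", "gift_date": "2009-09-20"},
]

summary = summarize(rows)
print(summary["rows"], "rows for", summary["unique_constituents"], "constituents")
```

When the row count is higher than the number of unique IDs, the extra rows are repeats of the same records in the output rather than duplicate records in the database.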

How RESolutionsTECH Inc. Can Help You Succeed

It is absolutely paramount to the success of your organization that steps are taken to merge and eliminate duplicates and to prevent the duplication of data from reoccurring. When you remove duplicates in your database, it sends a strong message to donors and other constituents that their money is greatly appreciated and is being spent wisely. If duplicates are not removed, then money will continue to be wasted on actions such as mistakenly sending identical pieces of mail to the same household more than once. This is completely avoidable. All stakeholders will have access to reporting with a higher level of accuracy if your duplicates are kept under control. As well, an audit trail will give your staff members and stakeholders more confidence in your data. With controlled duplication, fundraising efforts are far more effective. Constituents in your database will each have a concise, single record, which allows you to grasp the big picture of each person’s relationship with your organization. This makes it much easier to determine if people are appropriate prospects for your fundraising campaigns. Finally, less data entry will be required internally if duplicate prevention procedures are implemented. Consequently, your organization will save substantial amounts of time and energy.

In conclusion, with a little help from the database professionals at RESolutionsTECH Inc., your organization will be spared the embarrassment, stress, hassle, time and massive costs associated with duplicated records in your database. Your fundraising efforts will increase in effectiveness and stronger relationships will be built with your constituent base. It is absolutely infeasible to operate as efficiently as possible without eliminating and preventing data duplication. You deserve to be happy, stress-free, and of course, successful! So, do your entire staff a favour and give us a call, toll-free, at 1-866-776-5414, or send us an e-mail at RPorter@RESolutionsTECH.com, to talk about all the steps you can take to help your organization move forward. Trust us, your data will thank you. :)