Addressing the resource challenges of donorcentricity

Getting to an organization that is able to know about its donors and customize communications accordingly is not easy.  We often lack one centralized database that acts as the Truth.  We don’t think we have time to make donor calls to thank people where revenue isn’t attached.  Our budgets are so small that we transcend lean and mean and are now emaciated and ticked off.

But we must start somewhere.  Why?  Remember the old joke about the bear and the sneakers?

For a refresher, two guys are at their campsite when an angry bear comes charging in.  One of the guys immediately bends over to tie his sneakers.  The other one says “You idiot!  You’ll never outrun that thing!”

The guy with the sneakers replies “I don’t need to outrun the bear.  I just need to outrun you.”

So, if you have no better rationale and didn’t read my Monday post about the value of donorcentricity to our business model, remember:

  1. Donors to our organizations donate to other organizations.
  2. Other organizations are doing these types of stewardship activities.
  3. BEAR!

colbertbear

So how do you start this journey of a thousand steps?  Here are some tips to first steps to better talk to donors.

Get your database in order. This may mean some time working out of csv files to get your lists in order.  However, this is much better than not trying at all.  It will also help you in the long run, as the fancy pants SQL/database steps to data health are likely just automated versions of what you are doing in your spreadsheet.

Institutionalize calling.  It doesn’t need to be just development employees or just employees.  But any part of your culture that you can get to call donors to thank them – do it.  Even if it’s one call per month.  The practice of hearing donor stories helps whoever here them take what was once a figure on a spreadsheet and turn it into an understanding of why people outside your organization think you exist.

And it helps them to feel your gratitude as well.

Ditto for thank you notes.  The more these can be a cultural touchpoint, the better.

Try an unconventional thank you strategy.  We have 50 ways to thank your donors here, most of them unconventional and many of them very poorly rhymed.

Finally, once you have data from a good number of people who have randomly received thank you calls or notes or the like, run the numbers.  You should be able to see from an increase in retention rate (I hope) the impact that calling can have on your donors and your retention rate.  Sometimes that number will be enough to continue your random calling.  Sometimes it will be large enough to justify significant resource allocations changes.

After all, the quickest solution to a small budget is to get a big one.  This can help you prove it out.


If you’d like more nonprofit direct marketing content like this, please subscribe to my free weekly newsletter. You’ll get all of my blog posts, plus special subscribers-only content.

Addressing the resource challenges of donorcentricity

Learning from political fundraising: hypercustomization

fireworks4_amkOn the path to his win in Iowa, Ted Cruz took an unusual position for a presidential candidate. He spoke out against fireworks regulations.

Usually, Iowa contests focus on broad national issues that a person would be expected to lead on as president (plus ethanol).  Fireworks range as a national issue somewhere around garbage collection and why-don’t-they-do-something-about-that-tacky-display-of-Christmas-lights-on-Steve-and-Janice’s-house.

But from a data perspective, the Cruz campaign knew its supporters.  There’s a great article on this here.  Here’s a quote:

“They had divided voters by faction, self-identified ideology, religious belief, personality type—creating 150 different clusters of Iowa caucus-goers—down to sixty Iowa Republicans its statistical models showed as likely to share Cruz’s desire to end a state ban on fireworks sales.

Unlike most of his opponents, Cruz has put a voter-contact specialist in charge of his operation, and it shows in nearly every aspect of the campaign he has run thus far and intends to sustain through a long primary season. Cruz, it should be noted, had no public position on Iowa’s fireworks law until his analysts identified sixty votes that could potentially be swayed because of it.”

As we unpack this, there are several lessons we nonprofits can take from this operation:

The leadership role of direct marketing.  Cruz’s campaign is run by a direct marketing specialist.  Contrast this with Marco Rubio’s campaign, which is run by a general consultant, or Jeb Bush’s, which was run by a communications specialist.  As a result, analytics and polling in the campaign are skewed not toward what generalized messages do best with a focus group or are the least offensive to the most number of people.    

In fact, in the campaign, the analytics team has a broader set of responsibilities than normal.  Analytics drive targeting decisions online and offline.

The imperative to know your constituents.  Much political polling is focused on knowing donors in the aggregate.  The Cruz campaign wanted to know them specifically.  So they gathered not just people who were supporters and asked them about local concerns.  This came up with 77 different ideas, including red-light cameras and, as you probably guessed, fireworks bans.  We’ve talked about knowing your constituents by their deeds and by asking them; what’s important about this example is the specificity of the questions.  It’s not “what do you like or dislike”; it’s “what do you care about.”

Testing to know potential constituents.  One the campaign had these ideas, they tested them online with Facebook ads.  The ads weren’t specific to the Cruz campaign, but rather asked people to sign up for more information about that issue.  Once they had these data, they not only had specific knowledge of what people cared about, but the grist for the mill of data operations that could model Iowa voters and their key issues.  

Focusing on actual goals.  Cruz’s end goal is to drive voters, just like ours is to drive donations.  By simplifying things down to what gets people to pull their levers/hit the button/punch the chad, they had a crystallizing focus.  One can debate whether this is a good thing, as the campaign sent out a controversial Voting Violation mailing that attempted to shame infrequent voters with Cruz leanings to the polls.  (It should be noted that these mailings are the part of campaign lore — they’ve been tested and found to be very efficient, but few campaigns have ever wanted to backlash that comes inevitably from them.)  But that focus on things that matter, rather than vanity metrics like Facebook likes , help with strategy.

Hypertargeting: All of this led to some of the most targeted direct marketing that has been seen in the political world.  When telemarketing was employed for particular voters, not only would the message reflect what they cared about (e.g., fireworks bans) but also why they cared about it (e.g., missed fun at 4th of July versus what seems to some as an arbitrary attack on liberty).  This came from both people’s own survey results and what models indicated would matter to them.

So now, let’s look at this in a nonprofit direct marketing context.  How well do you know your donors and potential donors?  Or how well do you really know them?  And how well do you play that back to them?

I’ve frequently advocated here playing back tactics to donors that we know work for them and focusing our efforts on mission areas and activities we know they will support at a segment level.

But this is a different game altogether.  The ability to project not only what someone will support, but why they well, and designing mail pieces, call scripts, and emails that touch their hearts will be a critical part of what we do.  And once you have this information, it’s cheap to do: if you are sending a mail piece or making a phone call already, it’s simplicity itself to change out key paragraphs that will make the difference in the donation decision.

This also applies in efforts to get donors to transition from one-time giving to monthly giving or mid-major gift programs.

So, how can you, today, get smarter about your donors and show them you are smarter about them?

Learning from political fundraising: hypercustomization

Learning from political fundraising: combined databases

This is a lesson from something the 2008 Obama campaign got wrong online.

I know, it’s blasphemy.  The 2008 Obama campaign was so far ahead on digital fundraising that you could call what we are doing even now eight Internet years later (which is 576 regular years) as evolutions from that model, rather than subsequent revolutions.

I know I got questions from board members at the time as to why we couldn’t deliver the same type of Internet fundraising progress as that campaign.  (These questions dissipated after they learned of the price tag.)

And for perhaps the first time ever, political marketing was ahead of commercial marketing: witness Obama campaign veterans going to work for private industry post-election.  

But there was a massive problem with the back end of the Obama e-juggernaut: multiple different databases.

I’ve railed against this before, arguing that you need one database that is the Truth. Even if there are databases that feed in, some system has to be the one you go to get every record with enough detail on it to be able to work with it for donor relations and basic communications.

The Obama 2012 tech team did an illuminating set of interviews with Time to be released after the 2012 election.  The article is fascinating; here’s a salient excerpt:

Back then [2008], volunteers making phone calls through the Obama website were working off lists that differed from the lists used by callers in the campaign office. Get-out-the-vote lists were never reconciled with fundraising lists. It was like the FBI and the CIA before 9/11: the two camps never shared data. “We analyzed very early that the problem in Democratic politics was you had databases all over the place,” said one of the officials. “None of them talked to each other.”

So over the first 18 months, the campaign started over, creating a single massive system that could merge the information collected from pollsters, fundraisers, field workers and consumer databases as well as social-media and mobile contacts with the main Democratic voter files in the swing states.

This probably sounds familiar, no?  You can feel the resources being wasted.  If a get-out-the-vote canvasser doesn’t have the donor list, you could be asking a maxed-out Obama donor if they plan to vote as if they were a person off the street.  Likewise, a passionate supporter met while doing GOTV may not make it into the mail or online databases.

As we work toward a world of multichannel marketing, it is destructive to have data silos.  Your telemarketers need to be able to get information about mail and online donations (no sense calling the person to renew their membership when you received their check or debited their card yesterday).

Now look at the line in this piece that should send shivers down your spine: “So over the first 18 months, the campaign started over.

That’s a difference between the political world and the nonprofit world: there are lulls in the political world (not many, not for long, and fully compensated for by the frenzy of election years, but they do exist).  For nonprofits, you need to build your new plane while you are flying it.

But that’s no excuse for not having the data structure you want and need firmly in your mind and continuing to drive for it.  I personally have been on a crusade with an organization for almost a decade where we have been killing off databases gradually as we are able to assimilate them.  It’s not stopping everything to recreate the database, but it’s continual forward progress.

So, what can you do to avoid Obama 2008’s horrid fate (he said, tongue firmly in cheek)?  It’s twofold: know where you want to get and move toward it, all while you continue to do your job in the meantime.

One thing I didn’t mention above is you need these data to construct models of donor behavior: figuring out not only who supports you now, but why and who else may be willing to join.  We’ll talk about this more tomorrow looking at the Cruz 2016 Iowa campaign.

Learning from political fundraising: combined databases

Easter eggs in your donor database (guest post)

I have the privilege of sharing a guest post from Angela Struebing, president of CDR Fundraising Group.  For more insights from Angela and the CDR team, you can try their blog here.  Thanks, Angela!

eastereggs

Every year I organize our neighborhood Easter Egg hunt. I stuff and hide over 600 eggs and love watching kids run through the field searching for them. The excitement they feel when finding an egg is the same rush I get when I discover something actionable in a client data file. It got me thinking about some data eggs that are often hidden. For some you have to look a little harder, but the answers are always in the data.

  • When evaluating list performance look past initial response metrics and assess long-term value (LTV) at an individual list level. We often find that lists that look bad upfront may show life when looking at 12-month or 18-month payback periods or retention rates. The same goes for looking at LTV by package. A test that might have had a lower response initially may bring on more loyal donors over the long haul. Make sure you look well beyond just campaign reports for this information.
  • Along the same lines, matchbacks where you look at returns that may be coming in through one channel but driven by another, is another hidden gem in your file. This is especially true for brick and mortar institutions where a recipient gets a mail piece and can respond through the mail, via phone, online, or in the lobby. In order to gauge true list value, you’ll want to look at all response channels and see from where the response was driven. This will also encourage you to make it as easy as possible for donors to give through any channel.
  • This leads us to multi-channel migration and attribution analysis. You’ll want to understand if donors are migrating from online to offline or offline to online. While counterintuitive, we see more people giving an initial gift and then moving to offline giving than vice versa. Knowing this may change your marketing focus. Attribution is critical to making investment decisions and understanding how the various channels are working together.
  • I find lapsed donors particularly interesting and profitable. They have already exhibited an interest in your mission. They can usually be reactivated for less than it costs to find a new donor and are more valuable to an organization (based on number of gifts and average gift). Take the time to test what really works with lapsed segments. Do they perform better in acquisition or to housefile packages or perhaps a tailored lapsed package? All lapsed cohorts aren’t the same with deep and recent lapsed names performing very differently. Should you use a reduced ask, Most Recent Contribution vs. Highest Previous Contribution or generic acquisition string? Do you reference their previous relationship or – if they’ve been absent long enough – treat them as a prospect? How far back can you mail? All of the answers to these questions can be found within your database (and carefully crafted tests).

These are just a few of the things I go looking after when reviewing results and file trends. What hidden gems have you found? Happy Hunting!


C80B9596crcloseANGELA
Angela Struebing is president of CDR Fundraising Group, a multichannel agency focused on helping nonprofits maximize their online, direct mail, telemarketing and DRTV fundraising results. As president, Angela is responsible for overall agency management and strategic planning for national nonprofit clients to include Shriners’ Hospitals for Children, MoMA and the Marine Toys for Tots Foundation.

 

Easter eggs in your donor database (guest post)

Welcome step four: Setting up your systems

So you have your plan for your welcome series.  It is somewhere between 1 and n number of communications, depending on the person.  It crosses media where possible.  It thanks, learns, teaches, and asks.  And it honors the gift the person has given, while letting them know they can still be a bigger part of the change they seek to make in the world.

And it is worth nothing unless it is written down.

eyrha

You are working with a process that likely has:

  • At least three media (email, mail, phone) and perhaps more (mobile/texting, addressable ads, video, events)
  • Multiple vendors/systems involved (including caging, database, mail house, telemarketers, online communication systems)
  • Multiple points of differentiation, including medium, message, and high-touch v low-touch
  • Multiple people at your organization (you, donor relations staff, executives)
  • Intricate timelines.  For example, if you have three communications that you want in the order of thank you, learning about you/you learning about us, and ask, you really, really don’t want them to happen in the reverse order.  Also, every time you suppress someone from anything, things get complex.

This is not something that can be informal. In order for these systems to work together, you need to write out how.

Which does not mean you should write it in stone.  The basic principles should be (thank as quickly as possible, customize communications to the person receiving them, include both gratitude for what the person has done and opportunity to do more).  But how you accomplish them should be fluid with your testing regime.

I would say that the easiest way to create your process is to start with the simplest case and work your way up.  In this case, we have the rare example where ontogeny recapitulates phylogeny*.  So here’s how to build your program and your welcome/acknowledgment flowchart.

  1. Start with the most basic thank you by medium.  For an online gift, this is an email thank you; for an offline one, it is a mailed thank you.  Figure out how to get it out as soon as possible. 
  2. Customize those communications to the reason for giving.  A simple way to do this is to have a paragraph in the communication that changes based on the appeal to which the person gave their initial gift (since that’s usually the only information you have about the person at this point). 
  3. Create a special high-touch way to reach out to higher-potential-value donors.  This could be a policy of thanking all sustaining donors by phone or thanking all $100+ new donors and $1000+ existing donors with a handwritten card — whatever you are capable of doing.  This should be added to, not in place of, other thank yous. 
  4. Explore ways to break down your acknowledgement silos and thank people in different ways.  Put those that work into your process. 
  5. Add in a customized ask.  Yes, we’ve gotten this far before adding in an ask.  My thought is that a well-thanked donor is more likely to give to a regular-communication-stream ask than a poorly thanked one is to give to a specialized communication.  Also, you’ll note that this comes before creating a gap in communications (there likely already is one  that you can take advantage of) or learning about/educating your donor (I would rather have a less-educated donor who makes a second gift than a more-educated one who hasn’t).  This ask will be a bit more generic than we would like at this point, but you crawl before you walk. 
  6. Create your communication(s) to learn about your constituents.  These will usually be, but don’t necessarily have to be, separate communications from your acknowledgement and/or ask. 
  7. Create your if/then tree for customization from these learning communications.  That is, you should have something that says “if they are interested in advocacy, send them X paragraph in the ask; if they are interested in conservation, send them Y; if we don’t know, send them Z.” 
  8. Create the systems by which these changes will be implemented both for the ask as part of the welcome series and for all future communications. 
  9. Add communications from other media to the mix. 
  10. Create your timing for all of these communications, expressed in number of dates from the receipt of the gift.  I would encourage you to do a range, rather than an exact date for these communications — you may want to avoid having people telemarketed to on Christmas or on Sundays, for example.

Then, test the everlovin’ crud out of the system.  You are looking to break your system and then make it stronger at the broken places.  Some common things to test:

  • Do you have a plan, and only one plan, for every giving amount?  I’ve seen plans that say that donors over $100 get this communication and donors under $100 get this other communication.  They forget that a computer is going to be looking at this and ignoring people who give exactly $100. 
  • Do you have a plan for defaults?  Remember in most cases, you are not going to have additional information from the donor when you make your welcome series ask.  You want to make sure there isn’t a big blank space where paragraph three should be. 
  • What happens to your system if someone miskeys a code? 
  • What does your flowchart look like if someone does everything?  That is, you have paragraphs for people who are interested in various particular diseases, want to do advocacy, or have a personal connection to the mission.  What if they are all three, and they are a high-dollar donor?  The goal here would be to make sure you have prioritization and that you are not inundated with communications.  Remember that one of the priorities with your welcome series is to help the donor understand what to expect from you.  This should not be “I will expect to be annoyed.” 
  • How do your dates line up?  If you are integrating multiple messages and channels, you want to make sure that a person doesn’t get a phone call, mail piece, and email all exactly 21 days after their gift. 
  • How are you going to be able to handle the load?  That is, if you are going to be sending a getting to know you email seven days after the gift, will you be able to handle that on January 7th given your December 31st volume?  What?  You don’t have huge December 31st volume?  Let’s do a week on year-end fundraising at some point.

And you want to be vigilant to potential leaks even when you have this written down.  I have the privilege of working with a great donor relations person who keeps me apprised of the tone, tenor, and quantity of calls we get.  From this, she was able to discern that we were getting people calling (from a pattern of three people — like I said, she’s good) that already existing donors were getting member cards, something we include in our getting-to-know-you section of our welcome series.

What had happened was that our caging vendor had had instructions to send the new donor welcome letter to people who came in from acquisition mailings.  Since acquisition mailings often have lapsed donors in them that you are looking to reacquire, there were people who had donated for over 20 years who were being treated as if they were brand new to the organization.

Sovigilance

Image credit.

I will say that I have entirely failed to set up a welcome series for the weekly newsletter companion to this blog.  If you were to sign up for the newsletter here, what would you like to see?  Email me at nick@directtodonor.com; I’d love to use this as a test case.


* “Ontogeny recapitulates phylogeny” is the (mistaken) idea that we go through all of the stages of our biological evolution in development of our embryos.  The catchiest treatment of this I’ve seen is in Stephen Jay Gould’s I Have Landed, where he talks about how the drawings that supposedly prove this theory (like the allegation that we have gill slits at a point in the womb) persist in science textbooks.  It’s also the theme of his first book, which is a technical book and thus one I haven’t read.  In any case, I doubt his first book has any essays on Gilbert and Sullivan, which I Have Landed does, so that’s another point for the latter work.

Welcome step four: Setting up your systems

Getting to the Truth of one database

the-one-ring

One Database to rule them all.
One Database to find them.
One Database to bring them all
And in the darkness bind them.*

A beloved former boss of mine once asked the best question I’ve even heard and may ever hear about databases: “Which database is going to be the Truth?”

Others may call this the database of record, but the Truth is far more evocative.  It encompasses “which database will have all of our people?”, “which database will have all of our donations regardless of source?”, and “which database will be the arbiter and tie-breaker for all constituent record issues?”

This is a necessary pre-condition of donor knowledge.  You will not have true knowledge of a constituent of all of your data isn’t all in one place.  And working on donor information without the backend systems to back it up could be a waste of time and effort.

If you are like most nonprofits, you are either laughing or crying at the discussion of one database.  You likely have a few different donor databases by donation type.  Then you have records of people you serve, your email list, your event attendees, and so on.

And, sadly, some of them are necessary.  Some databases do things that other databases will not do.  You may not be able to run your direct mail program out of your online database or vice versa.

So here are some steps you can take to get all of your information in one Truth even if there are multiple databases behind it:

Purge unnecessary databases.  And I mean purge them. Ideally it should be as if your unnecessary database displeased Stalin: it just disappears from history, incorporated into other people’s stories.  To do that:

  • Ask whether another database can do what this database does.  If so, bring the data over and train the relevant parties.  The good news is that often the rogue database in question is just an Excel spreadsheet that can be directly imported into your database of choice.
  • Ask whether another database can do what this database does with modifications.  Rarely is something perfect initially.  You will likely have to create reports for people that they are used to running, but if you are bringing them into a good database, that’s a matter of business rules and set-up, rather than technical fixes.
  • If not, ask if the person can do without what the database can’t do.  You’d be surprised how many things are done because they have been done rather than for any rational reason.

Assuming that you have some databases that can’t be replicated in one big happy database, decide what database is going to be the Truth.  This should have the capacity to store all of your fields, run reports, and do basic data entry.  If you are keeping your direct marketing database, it doesn’t need to be able to run a direct marketing program.  But it does need to have the capacity to do the basic functions.

You may say that you don’t have a database that can fulfill this function.  In that case, I would recommend what I call a Traffic Cop database.  This is a database that you can inexpensively put in the center of multiple databases and get data to and from the other databases.  It’s job is to make sure every database knows what every other database is doing and existing to pull out duplicates and host change management.

Now, sync the databases to the Truth database.  Sometimes you may be fortunate and be using a database that has existing linkages.  For example, if you have decided that SalesForce is going to be your Truth, there are some pre-existing syncs you can get from their apps.  If not:

  • Start by syncing manually.  That is, export a report from one database and import it into the other.  Then, reverse (if you keeping a database, syncing it has to go both ways).  This will allow you to figure out what fields go where and more importantly how to translate from one database to the other (e.g., some databases want the date to be formatted 01/18/2016 and woe be unto you if you forget the zero before the one; others may not having a leading zero or have month and date as separate fields or the like).
  • After you have your process down, you can automate.  This can happen one of two ways: through the database’s APIs or through an automated report from one database that uploads to a location followed by an automated import from the other database.  Both are viable solutions — you would generally prefer the API solution, but you do what you have to do.
  • Make sure you have an effective deduplication process.  It almost goes without saying (and if it doesn’t, check out our PSA for data hygiene here), but data can get messy quickly if you don’t have these in place.

Here are some of those common objections and the easiest replies:

  • Cost: “how can we afford to take on a database project?”  Answer: how can we afford not to?  The lost donations from people calling you up asking for a refund and you have to look through five different databases to see where they donate.  The extra time to try to reconcile your donor database and financial systems.  The data that you won’t be able to get or use for your direct marketing and the lost revenues from that.
  • No direct marketing constituents: “I don’t want X (usually the people we serve) to get hit up for donations.”  Answer: We won’t be able to guarantee they won’t get a solicitation unless we know who they are.  We rent acquisition lists all the time and these people could be on there.
  • We’ve already invested in this other database: Answer: point them to this Wikipedia page.  It’s easier than trying to explain sunk costs on your own.
  • Provincialism: “We have database X and it works fine for us.” Answer: actually there are three answers for this one.  First, start elsewhere.  Usually, someone will have a database that isn’t working for them and better you start with them, who will then start singing the praises of both you and the Truth, than with the people who like where they are currently.  Second, usually, there is an “I wish we could do X” list somewhere that will make it worth this person’s time to switch.  Third, go to the highers-up with your business case.  By this time, you hopefully have some happy converters and some results from your direct marketing program (e.g., “we can put the year someone started with us on their member card now!”) to share.

Hopefully, this helps you get to your own version of the Truth.  Now that you have it, let’s talk about what to put in there.  That’s our charter for the rest of the week.

* Since we started with Game of Thrones yesterday, we have to do Lord of the Rings today…

Getting to the Truth of one database

Appending data in Excel

One of the primary reasons to have a database is to have data in various tables.  A simple example is a database with a people table – a list of your donors – and a donations table.  This allows for there to be a one to many relationship between the donors and donations.  After all, a good donor will generally donate multiple times.  By using a key in the people table, each donation is marked with that same key – e.g., all donations from donor 12398 have 12398 in their donor field and are associated with that record in the people table.

Excel can’t do this.  As a flat file, all you can represent are one-to-one relationships.  If you had to have multiple gifts in Excel, you would have duplicate donor records for each instance of the gift.

But you can append data to a record in Excel.

A while back, I wanted to do some homemade modeling on a file to add sex in as a variable (male/female in this case, to simplify the model).  Unfortunately, I didn’t have a list of who was male or female in this particular file, so I used US census records to determine how many males and females there were with a given first name.  This led to the odd conclusion that someone who had a name that 60 females and 40 males have in the US is a 60% woman.  Thank goodness I was using this for modeling and not for assigning Mr. or Ms.

By the way, if you were wondering if there were any perfect half-female, half-male names, there were three: Ariel, Hong, and Kris.  Sydney, Robbie, Frankie, and Kerry were almost androgynous, just missing the cut.  The Excel sheet is here if you are curious.

To complete the model, I had to put the percentage of femaleness with each record.  I used a somewhat fiddly function called VLOOKUP.

VLOOKUP tells Excel to look through a bunch of rows to find a particular value in the left column of an array, find it, and return another column in that array.  To wit:

=VLOOKUP(A1, ‘[Male v Female first names.xlsx]Sheet1’!$A:$B,2,FALSE)

This assumes that A1 is the first name you want to look up.  This then goes over to the document called Male v Female first names and looks at the first two columns, A & B.  It searches for the name in column A.  If it finds it, it returns back the value in B next to that value.  So, if A1 is Hong, it finds Hong in the array, sees the value next to that is 50%, and puts it in the cell in question.  The FALSE at the end just means you want an exact match, rather than the closest one.

It is by no means as pretty or efficient as a SQL JOIN statement.  In fact, if you have a lot of these to do, as I did, copy and paste it for the full column, then get yourself some tea and a crossword puzzle while Excel labors like a hamster on a wheel.  But, on the plus side, you don’t have to learn SQL.  You still should, and we’ll cover SQL for direct marketers at some point, but you can’t learn everything immediately.

Appending data in Excel

A brief instructional video about the benefits of data hygiene

(Billy is walking to school.  He is in black and white, but mostly white, between is grey blond hair, grey pale skin, grey khaki pants, and white white button up shirt.  Only his shoes show black – certainly not his small town.)

Narrator: Woah.  Hold on there, Billy.

(Billy stops and looks for the voice).

Narrator: Where you going in such a hurry?

(Billy holds up the books he is carrying and taps on his watch with a finger)

Narrator: Ah, going to school.  And I suppose you haven’t even given a thought to DATA HYGIENE?

(Billy pauses, has a quizzical look, and proceeds to sniff his armpits.)

Narrator: No, not that type of hygiene.  DATA hygiene.  One of the things that makes America great (cut to stock footage of a flag waving and other good and wholesome things) is Americans doing positive, American things for other Americans, helping our neighbors, our pets, and our environment.  For example, elderly people! Puppies! Parks!  All of this is possible because of our positive, American civic associations of good people.

(Cut back to Billy nodding enthusiastically.  If Billy were tragically stuck inside a paper bag, he could not act his way out)

Narrator: But what if ALL OF THIS CAME TO AN END?  Were there no data hygiene, our civic associations won’t be able to help preserve our positive American way of life?  They wouldn’t be able to reach new volunteers, get voluntary donations, or help out in your neighborhood.

(The previous images of an elderly person, a puppy, and a park, all disappear in rapid succession)

Narrator: THIS WOULD LEAD TO COMMUNISM!

Billy: Holy shit!

Narrator: Wait, you can talk? I thought we had you muted.  Why didn’t you talk before?

(Billy shrugs.)

Narrator: There are two types of data: data that you control and user-entered data.  Let’s take a tour of data you control.

(Billy is whisked away to an average city street)

Narrator: Here is the hustle and bustle of a nonprofit organization’s traffic.  All sorts of data come through here on a daily basis, from caging vendors to development officers to syncs with other databases.  And we can control these with some basic community standards.  But you have to make sure these community standards are followed, from address 1 line standards…

(close-up on the street sign, which magically changes from Mockingbird Lane to Mockingbird Ln)

…to address line 2 standards…

(close-up on an apartment, labeled as 123 Main Street, Apartment 3-B, which changes to
123 Main St
Apt 3B)

…to pick lists that dictate what titles and suffixes can be used.

(A mailbox that says Reverend Timothy Lovejoy, III, changes to Rev Timothy Lovejoy III).

Narrator: It’s better that these things are cleaned up in the database…

(Billy opens a door and see workers in a dank, grey office working on various spreadsheets.  A supervisor is whipping them if they don’t clean lists fast enough.  He recoils in horror.)

So that hard-working Americans don’t have to fix them later.  Fortunately, you have professionals doing data entry here, so that should limit duplication and errors.  The same can’t be said for…

(Click. Billy is instantly transported to a seedy red light district.  Crude graffiti adorn the spare brick walls and trash lays, some in bags, some not along the street).

Narrator: USER GENERATED DATA!

(A scantily clad woman of the night beckons to Billy.  He starts to walk over)

Narrator: Billy! What do you think you are doing?

(Billy stops and gives the camera a sheepish shrug).

Narrator: Here in user-generated data, anything goes.  In order to get people to fill out forms, you have to make it very easy.  Most of the time, that means leaving out items on the form…

(a metal mailbox is embedded in the wall.  It says simply “Jim”.)

… not validating standard fields or following capitalization rules…

(the neighbor’s mailbox is labeled “mt. cHris hartnet, mbs”

… having people put data wherever…

(the camera pans up to reveal a steel mill in the background labeled “Dr. Alice Humphries, DDS”)

… and just plain having jokers plant fake information.

(the building next to the mill is labeled “I. P. Freely.”  Billy looks frightened.)

Narrator: Yes, this lack of cleanliness is the price you pay for freedom.  American freedom!  But we all have to do our part, or else the civic organizations won’t be able to reach anyone in this neighborhood.  And you know what that means, don’t you Billy?

(Billy nods and mouths the word “communism”)

Narrator: That’s right.  Here we have to use more broad hygiene techniques.  Enforce your capitalization rules, not theirs.  Every one of these houses should be run for address standardization, so the addresses are correct.  They should also go through strict National Change of Address forms, so you know when someone has moved.  If someone’s email address doesn’t work, make sure to suppress it or you will be labelled a spammer.  And you know what happens to spammers.

(The camera shows a man in a stockade.  Residents are throwing cans of Spam at him.)

Narrator: Appending data like telephone numbers, addresses, and email address when you don’t have them is also usually worth the investment.  That way, you know if Jim back there is the same Jim who just wrote your nonprofit a big check.

(Click. Billy is back on his school grounds.)

Narrator: So, Billy, you know the importance of data hygiene, right?

(Billy nods enthusiastically and runs off camera)

Narrator: Data hygiene: it’s the right thing to do.  America.

A brief instructional video about the benefits of data hygiene

Deduplication 2: Deduplicate harder

The deduplication exercise from yesterday talked about how to deduplicate clean, easy-to-use data.  If you ever find data like that, please let me know.

In reality, there are always quirks in the data to work around.  Here’s how to solve a few of them.

The middle name conundrum.  There is no good way to deal with middle names in a database, because the way people use them is non-standard.  For some people, a middle name is never used and sometimes a throwback to a bygone ancestor (e.g. ,Ronald Bilius Weasley).  For some, it is an integral part of the first name (e.g. Mary Kate and Ashley Olson; that is just one long name, right?).  Others skip right over the first name and go right to the middle (e.g., F. Murray Abraham).  This is why a good database will have a salutations field you can fill in, so your letters go to Ron, Mary Kate, and Murray, and not Ronald, Mary, and F.

This, however, makes deduplication a bear.  You want to preserve the original first name, but need to manipulate the data:

  1. Insert two columns next to your first name row (more if you think people will have three or more names in the first name field)
  2. Copy (Ctrl + C) the first name column into the first empty column.
  3. Highlight the column and go to Data > Text to Columns
  4. Choose Delimited and hit Next
  5. Choose Space and hit Finish

This separates the names into separate columns.  Dedup as normal based on these fake name columns, then delete these columns and you have clean(er) data.

Forms of names. An automated system won’t be able to tell you that Robert and Bob are the same person, but you can figure out close matches of names like Robert and Rob.  The trick here is when you deduplicate for first name to put

=LEFT(A2,3)=LEFT(A3,3)

Instead of

=A2=A3

 This compares the three first letters of cell A2 to the first three letters of cell A3.  This solves your Rob/Robert’s and Nick/Nicholas’s of the world.  If you want to be a little less restrictive, you can do two instead of three and solve for your Nicole/Nikki’s and your Jo/Joanne’s.

This handy trick also works well for addresses.  Generally if the first 4-5 digits of the address are the same, they are close enough for government work (but still catches them if someone put Rd instead of Dr).

Adding as you go

Because Excel is a flat file and not an actual database, you will frequently have to sum gifts to get a total amount of giving.  Let’s go back to our initial email list, now with gift amounts and the middle names broken out.  Columns F, G, and H, are set up to compare A, C, and D.

thirdexcellist

Here’s how to proceed:

  1. Set up column I with the statement =AND(F2,G2,H2) (and copy it for all of the rows). This returns TRUE if all of the columns are TRUE and FALSE if any are.  This puts in one cell whether something is a duplicate or not.
  2. Set up the first row of column J with the statement =IF(I1,J1+E2,E2). This is the trickiest bit.  What this is saying is “Look at the row before this.  If it’s the same as this row, add this donation to all of the previous donations that match this record.  If not, just put in the one gift.”
  3. Copy this column and paste values like you did with the TRUE/FALSE earlier.

fourthexcellist

Now, when you deduplicate, you’ll be getting the total amount of gifts for a donor, summed up.

There are certainly not perfect.  But to do perfect, you need a database.

Deduplication 2: Deduplicate harder

Your trapezoid zero database – a spreadsheet

Why trapezoid zero? Because square one is too dignified.

If you are like most nonprofit direct marketers, you have one true donor database that filters in all of the data you need from various applications, gives you a view into what every constituent needs and wants, and allows you to segment and communicate with them instantly.

Then you wake up from your dream. You perform requisite amounts of personal grooming, cover your dirty shameful bits with clothing sufficiently uncomfortable to convey gravitas, go to work, and stare at the latest list you have pulled in Excel.

Excel is not a bad program per se and has gotten even better since Clippy was taken out for some cannoli in downtown Redmond and never heard from again.  The trick is that it is not intended to do what we do with it for address lists and the like.  For that, Microsoft created Access, which is actually a bad program per se.

I promise to talk about using Excel for its intended purpose, but first, here’s how to manage your lists in Excel, starting with basic deduplication.

edharris

Deduplication could effectively be called Sisyphean except some is pushing the blinkered rock down as you are trying to push up. You are never, never done deduplicating, especially given that the most common records for deduplication are your best supporters (given that they will be engaging the organization from all angles).

So let’s assume you have a very simple list.  You want to remove duplicates from an email list.  Specifically, you are looking for exact duplicates – you don’t mind emailing the same email twice as long as it is for different names and you don’t mind emailing the same person twice as long as it is at two separate addresses.  Here’s your list:

firstexcellist

As you can see, there are a few Blofelds in there to get rid of.  Your list will be longer and less menacing.

  1. Select all by hitting Ctrl + A
  2. Go to Data > Sort
  3. Sort by all three fields in any order
  4. Now, you are going to compare each row to the one below it. Go to the top row and the first empty column and put in =A2=A3.  This creates a TRUE/FALSE statement where anything that comes back TRUE is identical.
  5. Drag the lower right corner of the cell you just created down so it covers all rows and over to the right so it covers as many columns as you have. It will look something like this:
  6. Highlight the TRUE/FALSE columns, hit Ctrl + C to copy the columns, then right-click on the same columns to bring up the special menu. Select Paste Values.  What you are doing here is replacing your TRUE/FALSE statements with the results of those statements.  Because you are going to be sorting by these, you don’t want Excel recalculating.
  7. Select all by hitting Ctrl + A
  8. Go to Data > Sort
  9. Sort by the three TRUE/FALSE columns. The output will look something like this.

    secondexcellist
    As you can see, there are two records in rows 14 and 15 that were exact matches for other records in the database, since they have three columns of TRUE.  Those are your duplicates and you can delete them.  One might say you have a license to delete them.

That works fine for simple data.  But how often does that happen?  Tomorrow will be some advanced tips for deduplication in Excel.

Your trapezoid zero database – a spreadsheet