What are the open rates and click-throughs of your mail pieces?

It sounds like a nonsensical question.  And it highlights another major difference between offline and online direct marketing: trackability.

Those who live in the digital marketing space are used to being able to track what happens with their emails and campaigns down to the user level.  They complain when tracking pixels don’t work quite the way they are supposed to on every device and aim for ever better attribution models to understand where their investments are going.

Those in the offline space are used to sending something out and waiting for results.  And waiting.  And waiting.

Further, they are used to looking at packages as a whole.  They get one result: did someone donate (OK, two: and how much)?  Because of this, it’s tempting to think of mail testing as a thumbs up or thumbs down, as in the Roman Colosseum.

But you can find out things like your offline open rates and tweak them to your heart’s content. Take a simple 2X2 testing matrix.

While you won’t be able to tell what your actual open rate was, you can content yourself with relative open rates.  With online, you have an intuitive feel for whether a 20% open rate is good or bad compared with the emails around it (and whether they generally are opened at 10% or 30%).  This same relative weighing works well in mail.  If 20% more people donate with envelope A than with envelope B, all other things being equal, then you have a 20% better open rate with envelope A.

Similarly, if letter C does better than letter D by 30% with the other parts of the mail piece staying constant, you have a 30% better “click-through” rate.

And you probably already know the trick that you only have to test three of the four quadrants here.  If envelope A beats B when they both use letter D and letter C beats D when they both use envelope B, chances are pretty good that the winning test is envelope A with letter C, even though that wasn’t a tested combination.
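As a rough sketch of that three-cell logic, here is the arithmetic in Python.  The mailing quantities and gift counts are made-up numbers for illustration, and multiplying the two lifts together assumes the envelope and letter effects don't interact, which is exactly the bet you are making when you skip the fourth cell.

```python
# Hypothetical results for the three tested cells: (pieces mailed, gifts received).
# These numbers are invented for illustration only.
cells = {
    ("A", "D"): (10_000, 120),  # envelope A, letter D
    ("B", "D"): (10_000, 100),  # envelope B, letter D (shared baseline)
    ("B", "C"): (10_000, 130),  # envelope B, letter C
}


def response_rate(cell):
    mailed, gifts = cells[cell]
    return gifts / mailed


baseline = response_rate(("B", "D"))

# Relative "open rate": envelope A vs. envelope B, letter held constant.
envelope_lift = response_rate(("A", "D")) / baseline - 1

# Relative "click-through": letter C vs. letter D, envelope held constant.
letter_lift = response_rate(("B", "C")) / baseline - 1

# Projection for the untested cell (A, C), assuming the effects simply multiply.
projected_ac = baseline * (1 + envelope_lift) * (1 + letter_lift)

print(f"Envelope A lift: {envelope_lift:.0%}")
print(f"Letter C lift: {letter_lift:.0%}")
print(f"Projected A+C response rate: {projected_ac:.2%}")
```

If the projected cell matters a lot to your budget, it is still worth confirming with a real test once you roll out.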

But what you may not know is the right algorithm can do this writ large with a wide variety of variables.  Ask your vendor(s) if they can run permutations that will allow you to figure out what happens when you have five envelopes, four offers, three letter permutations, six different ask strings, and so on.  They should be able to create a variablized stew that helps you run a number of tests at once.
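To see why this takes an algorithm rather than brute force, it helps to count the packages.  The sketch below just enumerates the combinations from the example above; the labels and the 100,000-piece list size are placeholders I made up, and it doesn't attempt the vendor's actual method for picking which combinations to mail.

```python
from itertools import product

# Option counts from the example above; the labels are placeholders.
envelopes = [f"envelope_{i}" for i in range(1, 6)]   # five envelopes
offers = [f"offer_{i}" for i in range(1, 5)]         # four offers
letters = [f"letter_{i}" for i in range(1, 4)]       # three letter versions
ask_strings = [f"ask_{i}" for i in range(1, 7)]      # six ask strings

# Every possible package is one combination of the four elements.
all_packages = list(product(envelopes, offers, letters, ask_strings))
print(len(all_packages))  # 5 * 4 * 3 * 6 = 360

# Hypothetical list size, to show how thin a full test would spread.
list_size = 100_000
print(list_size // len(all_packages))  # pieces per cell if every package were mailed
```

With only a few hundred pieces per cell, no single package would get a readable result, which is why you want a design that mails a subset of combinations and estimates the rest.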

The other thing that I’d recommend is not just taking a page from the online playbook, but using online tools to test your efforts first.  Don’t know if your teaser copy will work well?  Try it as an email subject line or a CPC ad headline first.  While the audiences are a bit different online and offline, catchy is generally catchy and boring is boring.  Working out details like this online can save your testing for things that can actually help you get to know your donor better, leading to more valuable communications and donors.  

(Or, better yet, scrap your teaser copy and test a plain white envelope — it may have the best open rate of all.)

How long should a story be?

Long enough, and no longer.  There!  That was a quick post.

I just realized that I’ve referred many a time to telling quality stories, but haven’t gone into a lot of detail on how.

So that starts today with length of your story.  I like this topic partly because I get to quote Jeff Brooks’ The Fundraiser’s Guide to Irresistible Communications:

“I’ve tested long against short many times.  In direct mail, the shorter message only does better about 10 percent of the time (a short message does tend to work better for emergency fundraising).

But most often, if you’re looking for a way to improve an appeal, add another page.  Most likely it’ll boost response.  Often it can generate a higher average gift too.

It’s true in email as well, though not as decisively so.”

In addition to emergencies, I’ve personally found shorter to be better with appeals where urgency is a main driver (e.g., reminder of matching gift deadline; advocacy appeals tied to a specific date) and institutional appeals like a membership reminder.

Other than that, length is to be sought, not avoided.

This is counterintuitive; smart people ask why our mail pieces are so long.  And it’s not what people say themselves.  There is a recent donor loyalty study from Abila where they indicate that only 20% of people read five paragraphs in and only seven percent of people are still reading at the ten-paragraph mark.

Here’s a tip: if you are reading this, this data point is probably not correct.

The challenge with this data point is that they didn’t test this; they asked donors.  Unfortunately, donor surveys are fraught with peril, not the least of which is people stink at understanding what they would do (much better to see what they actually do).  We talked about this when talking about donor surveys that don’t stink.

Other questionable results from this survey include:

  • Allegedly the least important part of an event is “Keep me involved afterward by sending me pictures, statements on the event’s impact, or other news.”  So be sure not to thank your donors or talk to them about the difference they are making in the world!
  • 28% of people would keep donating even if the content they got was vague, was boring, talked about uninteresting programs, had incorrect info about the donors, and was not personalized.  Unfortunately, I’ve sent these appeals and the response rate isn’t that high.
  • 37% of donors like posts to Twitter as a content type.  Only 16% of donors follow nonprofits on social media.  So at least 21% of people want you to talk to them on Twitter, where they aren’t listening?

So length can be a strong driver and should be something you test.  But you want the right type of length.  Avoid longer sentences and paragraphs.  Shorter is easier to understand, and therefore truer.

Instead, delve into rich detail.  Details and active verbs make your stories more memorable.  And that helps create quality length, and not just length for length’s sake.

And don’t be afraid to repeat yourself in different words.  Familiarity breeds content.  It also helps skimmers get the important points in your piece (which you should be underlining, bolding, calling out, etc.).

This may not seem like the way you would want your communications.  Remember, you are not the donor.  Especially in the mail, donors who donate like to receive and read mail.  Let’s not disappoint.


After posting this, I heard a great line in Content Inc that stories should be like a miniskirt: long enough to cover everything it’s necessary to cover, but short enough to hold interest.  So I had to add that as well…

Let’s get small: microimprovements

There is a story, perhaps apocryphal, that someone watched Michelangelo retouching every inch of one of his statues.  The bystander asked him why he bothered with such trifles; the artist replied “Trifles make perfection. And perfection is no trifle.”

In the direct marketing world, it’s difficult to say that there is such a thing as perfection.  You will likely never see, in any quantity, a 100% response rate or open rate.  But our goal is to strive, to seek, to find, and not to yield.

There rarely is an idea that you have that will double the completion of your online donation page.  But you can find 16 ideas that each get you five percent better, each one compounding to double your response.
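The compounding is easy to check.  This quick sketch assumes each improvement is independent and multiplies with the others, which is an optimistic simplification of my own.

```python
import math

improvement = 0.05  # each idea lifts completions by five percent

# Combined effect of 16 stacked improvements.
combined = (1 + improvement) ** 16
print(f"16 ideas at 5% each: {combined:.2f}x")

# Smallest number of 5% improvements needed to at least double.
needed = math.ceil(math.log(2) / math.log(1 + improvement))
print(f"Ideas needed to double: {needed}")
```

So 16 ideas leaves you a little margin for the ones that don't quite deliver.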

So without further ado, a few small ideas that may make small (or big) differences.  In no particular order:

Change the color of your donate button to something not approved in your brand guidelines.  It will stick out.  Good.  Things that stick out get clicked on.  When this starts to lose its effectiveness, change it again.

Reduce the size of your download.  A Sprint phone downloads an average of 11 MB per second on 4G.  We can easily design pages with enough extra code and random things to download to cost an extra second.  One second lost means 7% fewer conversions.

That’s probably why water.org has their homepage look like this:

[Image: water.org homepage]

But their donation page looks like this:

[Image: water.org donation page]

Increase customization by a variable.  If you do name, do name and location.  If you do name and location, add in donation history.  Et cetera.  These are more than 5% tactics.

Add a small donate bar at the top of your site.  Human Rights Watch reported (at DMA’s DC nonprofit conference) that the below orange bar and a larger orange footer on their site increased donations from the home page by 256%.  Many days, I’d settle for 2.56%.

Go into Google AdWords.  And do what it says to do.  If it recommends splitting up your keywords, it probably knows that doing so will allow you to customize your copy.  Punctuate your headline properly.  It knows that increases click-throughs.  And so on.  It will keep bringing up these opportunities; you just have to act on them.

Try adding a picture.  Not necessarily guaranteed, but a quality picture will usually improve a home page, mailpiece, donation page, content marketing, etc.  I’ve found a significant difference in the traffic I get from blog posts with pictures over those without.  Hence David hanging out at the top of this one.

Call some donors.  Ideally some of your best, but these thank you’s will both help with the donor’s loyalty and give you ideas for things you can try (or stop).

Take some fields off of your donation form.  Phone number?  Ask for that afterward.  If you have the ability to divine city and state from ZIP on your form, go for it.  You are looking to streamline this process.

Similarly, reduce the clicks to get to the donation form.  Hopefully, it’s one or zero (that is, you can start entering info on the Web page).

Remove the navigation from your donation page.  Now is not the time for someone to want to look at your executive’s pictures.  Four tests show improvements from the tiny to the oh-my-goodness here.  

Run a test.  Are those ask amounts correct?  How do you know?  If you are mailing, emailing, or calling with the same thing for 100% of your communications, you are missing out on your 5% opportunities.

Hopefully, one of these gets you 5%.  If it does, please leave it in the comments.  If it doesn’t, please let us know in the comments what did.

6 common traps in direct marketing budgeting

Direct marketing budgeting seems easy:

  1. Take last year’s budget
  2. Take out the losing communications and replace them with the results of the winners.
  3. Project that the communications will do the same thing as last year.
  4. Profit

And I have had budgets set this way by vendors. However, this overlooks a great deal.  In confession, some of these are things I caught before we put the plan in our organizational budget – some I didn’t.

Changing file.  OK, you may say – we have 1,000 more donors than we did last year.  We’ll assume the communications have the same response rate and average gift as before and just add to our quantity.

Wrong.  You need to look both at the number of people on your file and the lifecycle of that file.  Let’s say that last year, you did a lot of new acquisition (yay!) and your retention rate stunk (boo!).  As a result, your overall number of donors may not have changed much, but your composition is entirely different – you have far more first-time donors who will have lower response and retention rates and far fewer multi-year and core donors.  See my post about the fallacy of file size and single-size retention rates here for details.

Bottom line, if you assume your new donors will perform as they always have done, you are toast.
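Here is a minimal sketch of why composition matters more than size.  Every number in it (the segment names, response rates, and donor counts) is hypothetical; the point is only that two files of identical size can project very different results.

```python
# Hypothetical response rates by donor lifecycle segment (illustrative only).
response_rates = {"first_time": 0.03, "multi_year": 0.08, "core": 0.15}

# Two files with the same total size but different composition.
last_year = {"first_time": 2_000, "multi_year": 5_000, "core": 3_000}
this_year = {"first_time": 5_000, "multi_year": 3_500, "core": 1_500}


def expected_gifts(file_counts):
    """Sum expected gifts across segments for one mailing."""
    return sum(count * response_rates[seg] for seg, count in file_counts.items())


def blended_rate(file_counts):
    """Overall response rate implied by the segment mix."""
    return expected_gifts(file_counts) / sum(file_counts.values())


print(f"Last year: {blended_rate(last_year):.2%}")
print(f"This year: {blended_rate(this_year):.2%}")
```

Budgeting by segment like this, rather than with one blended rate carried forward, is what keeps a year of heavy acquisition from wrecking the next year's projections.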

Spill in and spill out.  Accountants have a really good reason to artificially cut things off the way they do.  Or so they keep telling me.

The bottom line is with accrual accounting some of your costs will not occur in the year that you are planning to send out a communication.  Likewise, some of the revenues from a campaign will spill out of a year into the next year, especially for longer-lead time media like mail and telemarketing pledges.

It’s sometimes OK to assume that spill in from one year will equal spill out into the next year.  However, changes in file, size of efforts around year end, and when that darn print vendor decides to send you their invoice can all change whether you hit goal or not.

Communication performance.  I had a vendor report that a piece was going to do 4% response rate because that’s what it had averaged over the past three years.  When I dug deeper, the response rate over the previous three years was 5%, then 4%, then 3% (these numbers are fictitious; don’t believe any response rate that doesn’t have a point something).

I would argue this is perhaps a dying communication and that this is more likely to have a 2% response rate than a 4% response rate.  You don’t see that if you are simply averaging previous years’ performances.
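The difference between the two forecasts shows up in a few lines of arithmetic.  This sketch fits a straight line through the three (fictitious) response rates above; assuming the decline stays linear is my own simplification, since a real piece might level off or fall faster.

```python
# Response rates from the example above (the post notes these are fictitious).
years = [1, 2, 3]
rates = [5.0, 4.0, 3.0]  # percent

average = sum(rates) / len(rates)

# Ordinary least-squares line through the three points.
n = len(years)
mean_x = sum(years) / n
mean_y = average
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, rates)) / sum(
    (x - mean_x) ** 2 for x in years
)
intercept = mean_y - slope * mean_x

trend_forecast = intercept + slope * 4  # projection for year four

print(f"Three-year average: {average:.1f}%")
print(f"Trend projection:   {trend_forecast:.1f}%")
```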

Test failures.  If all of your tests are going to work, you are going to have to call them something other than tests.  Most of the time, your tests will not do as well as your control will do, so you can’t account for this by assuming you will get the results of last year’s test winners.

Speaking of…

Roll-out failures.  You had your test last year and it succeeded at 95% confidence?  Chances are, if you tested at 25,000 pieces, you tested part of some of your better segments, not across all of the segments.  Perhaps the piece you have tested into is good for your current donor sets, but doesn’t fit with why your lapsed donors originally signed up with your organization.  If that’s half of the audience you were planning to mail to, you will want to dial back your expectations.

Interactions among communications.  Let’s say you had record online revenues last year, but your mail program fell off and your donor file dwindled.  A good portion of your online donations are likely people who got their mail piece and decided to donate online; thus, you have to see how aspects of your program affect each other.

Hopefully, these help you make your budget; tomorrow, we’ll talk through scenario planning in your budgeting.

6 intermediate cost-per-click techniques

The original cost-per-click (CPC) search engines did their listings strictly by what you were willing to pay per click.  (I actually used Goto.com for CPC listings, before it became Overture Services, before it became Yahoo! Search Marketing.  Nothing like Internet time to make one feel old.)

Yes. This was once a thing.  A big thing.

Google’s algorithm, however, takes the quality of the ad and the site into account.  This is partly because you will come back if you have positive experiences on the site and partly because it maximizes profits.  For the same reason that you would look at gross revenue per mail piece/phone contact/email/carrier pigeon instead of just response rate in isolation, Google looks at gross revenue per ad shown as the backbone of its infrastructure.

Thus, it is in your interest to maximize your click-through rate (except in one very special case I’ll discuss on Friday); you can pass your better bidding brethren by beating them on quality.  Hence the focus on things like negative keywords and phrase matching yesterday: you want to get your clicks on as few ads as possible.  An average quality score from Google is a 5.  If you are at a 10, your cost per click goes down by 50%; if you are at a 1, it goes up by 400%.

Targeting smarter also helps you get clicks from the people from whom you want to get clicks, instead of those who didn’t understand what they were getting into from your ad.

So here are a few techniques to help get to the next level of pay-per-click success:

Check in on your keywords regularly. This should be at least weekly; daily would be better.  It doesn’t have to be for long, but Google will keep giving you helpful tips on additional strategies and keywords to try.  You can also see what is performing and what isn’t, retooling ad copy for underperforming ads and learning which landing pages aren’t converting as well.

Set up conversion tracking.  In the beginning, Internet advertising was sold in CPM – cost per thousand impressions – and the earth was without form, and void.  Then came CPC – cost per click – where you pay for an action, rather than a view.  The ultimate is going to be cost per conversion, where you only pay when you get a donor (or other person you are desiring), and you can set your goals accordingly.  Companies won’t want to do this because they have to rely on you to convert, rather than themselves, but it is semi-inevitable.

You can have this advantage right now if you set up conversion tracking.  You will be able to see how many people convert and, if they give donations, how much you get from the campaign.  Seeing how much you get from a campaign ahead of time, then bidding, is like playing poker with all of the cards face up – it’s remarkable how much better it makes you.
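As a sketch of what "cards face up" looks like in practice, here is the basic arithmetic conversion tracking makes possible.  The click, donation, and revenue figures are hypothetical, and treating first-gift revenue as the ceiling for a bid is a conservative assumption on my part, since it ignores any later gifts from the same donor.

```python
# Hypothetical figures of the kind conversion tracking would report.
clicks = 1_000
donations = 20
total_revenue = 900.00  # dollars given by those donors

conversion_rate = donations / clicks
average_gift = total_revenue / donations

# Revenue earned per click; bidding above this loses money on the first gift.
revenue_per_click = conversion_rate * average_gift

print(f"Conversion rate: {conversion_rate:.1%}")
print(f"Average gift: ${average_gift:.2f}")
print(f"Break-even cost per click: ${revenue_per_click:.2f}")
```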

Unbounce your page.  Not every page converts well.  With conversion tracking set up, you can tell if your page is repulsing potential constituents.  Testing with Google solutions or a solution like Optimizely can help you convert more people and lower your CPC costs as your quality score goes up.

Set up dynamic keyword targeting.  A person is more likely to click an ad that has the exact words that they put into the search engine in it.  The trick is that people put all sorts of things into search engines.  With dynamic keyword targeting, it doesn’t matter if they put “rainforest deforestation,” “rain forest deforestation,” “tropical forest deforestation,” “destruction of the rainforest,” “tropic rainforest deforestation,” etc., into the search bar, you can add those specific words into your ad.

Geotarget your ads.  This is especially true if you are a nonprofit with a limited geographic reach.  If you are an early childhood intervention provider in Dallas, you likely don’t want Seattle searchers.  However, this applies even to national and international nonprofits.  If you have chapters, or state-specific content, you can direct those specific searchers to the area more relevant for them.  This works especially well for things like walks and other events, where people will likely only come from a certain distance around to the event.

Go for broke.  If you do get a Google Grant, try to use every cent.  Not only will it get you more traffic, more constituents, and more donors, but it will also allow you to apply for more money.  Your first steps to worldwide nonprofit domination await.

I hope these are helpful.  Please leave any tips you’ve found useful in the comments section below.

Testing beyond individual communications

So far, the testing that I’ve discussed is how to optimize a communication or overall messaging.  The next step is trying to answer fundamental questions about the nature of your program – things like how many times to communicate and through what means.

There is a pretty good chance that you are not communicating enough to many of your constituents.

But wait, you say.  We send out a mail piece a month, have multiple telemarketing cycles per year, and have both a monthly e-newsletter and semi-frequent emails on other topics.  Our board members and staff who are on our seed lists are consistently on me, you say, that we are communicating too much.  And we get donors who complain that they are getting a mail piece before their last one was acknowledged.

However, remember in the discussion of segmentation that more donors are saying their nonprofits are undercommunicating, not over. That means that the average nonprofit needs to be communicating more than it is.

And the concern that you are annoying people with asking for money comes from an oft-quoted and concerning inferiority complex from the nonprofit.  We have to believe that we are good enough to merit a gift and that we are making an appropriate ask to be effective.  We want to give our donors an opportunity to be a part of something powerful and transformative.  Remember that if we do our jobs well, donating to our organization is a positive experience.

So how would you test whether you are communicating often enough/too often?  The first step is to figure out where you are as a control with a cross-medium communications calendar.  This is easier said than done, but it’s a necessary first step.  This need not be perfect; as you are going to want to have some communications that are timely and focused on current events, you may have to have some placeholders in place that simply indicate “we’re going to email something here.”

Then split test your file and test, so that part of your file gets X communications and another gets X plus or minus 1.  I’d suggest plus.  Then measure the total success of the communications.

I once helped lead a test where we took mail pieces out of our schedule during membership recruitment.  We would send a piece or two, then wait to see if those donors would donate before sending to them again, to make sure that we were addressing them properly as either a renewed donor or as someone who has not yet renewed.  Each individual piece in the resting membership series had a significantly better ROI and better net than the more consistent appeal series.

Yet the appeal series brought in more money for the organization and the mission overall.  I would argue, as I did at the time, this is the actual important metric.  If you want to look at metrics like ROI or response rate, your best opportunity is to send one letter to your single best donor – you’ll get a 100% response rate and ROI percentages in the tens of thousands or more.

But for real life, the goal is more money for more mission.  So overall net is the metric of choice.
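To make the ROI-versus-net distinction concrete, here is a small sketch with two test panels.  The costs and revenues are invented for illustration and are not the results of the membership test described above.

```python
# Hypothetical results for two test panels of the same size (illustrative only).
panels = {
    "rested series (fewer pieces)": {"cost": 4_000, "revenue": 14_000},
    "full appeal series": {"cost": 9_000, "revenue": 24_000},
}


def net(p):
    """Dollars left for the mission after paying for the communications."""
    return p["revenue"] - p["cost"]


def roi(p):
    """Net return per dollar spent."""
    return net(p) / p["cost"]


for name, p in panels.items():
    print(f"{name}: ROI {roi(p):.0%}, net ${net(p):,}")

winner_by_roi = max(panels, key=lambda name: roi(panels[name]))
winner_by_net = max(panels, key=lambda name: net(panels[name]))
print(f"Best ROI: {winner_by_roi}")
print(f"Best net: {winner_by_net}")
```

The two metrics pick different winners here, which is the whole argument for deciding in advance that net is the one you will act on.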

The easiest campaigns to add to are the ones that already have a multistage component.  Let’s say you have a matching gift campaign that goes mail piece 1, email 1, mail piece 2, email 2 (with two weeks between each).  A way of testing up would be to look at doing mail piece 1, email 1 + mail piece 1.5, mail piece 2 + email 1.5, email 2 (so there’s still two weeks between each set of communications, but they double up in the middle).  That would be adding a mail piece and an email and if you test both of these with net as your goal, you will have a better framework for the campaign in the following year as well as for additional testing throughout the year.

With email only campaigns, there’s another way of checking whether you are over-emailing your file – looking to see if your total opens and clicks fall.  There is a point at which open rates and click rates will begin to fall; however, you shouldn’t worry too much until adding another email not only lowers your open and click rates but lowers your total number of opens and clicks (similar to a focus on total net, rather than net per piece).
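A quick sketch of that rate-versus-total distinction follows.  The list size and open rates are hypothetical numbers I chose so that the rate falls at every step while total opens keep rising for a while.

```python
# Hypothetical monthly results at different email frequencies (illustrative only).
list_size = 10_000
open_rate_by_frequency = {10: 0.25, 20: 0.22, 30: 0.18, 40: 0.13}

# Total opens per month = emails sent per person * list size * open rate.
total_opens = {
    emails: round(emails * list_size * rate)
    for emails, rate in open_rate_by_frequency.items()
}

for emails, opens in total_opens.items():
    rate = open_rate_by_frequency[emails]
    print(f"{emails} emails/month: open rate {rate:.0%}, total opens {opens:,}")

# The frequency to worry about is the one where totals start to drop.
best_frequency = max(total_opens, key=total_opens.get)
print(f"Total opens peak at {best_frequency} emails/month")
```

The same comparison works for clicks; swap in click rates and look for where total clicks turn down.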

This tipping point in email is probably well past where you think it is.  Hubspot did a study of emails per month on both open and click-through rates.  The sweet spot with the highest open and click rates was between 15 and 30 emails per month.

That’s right – opens and clicks went up until you got in the range of daily emails.  Things went downhill after 30 emails per month.  So if you are sending more than daily emails (on any day but December 31 or the last day of a matching opportunity), you might be emailing too much – so take that as a cautionary tale for the .0001% of you who are doing this.  For the other 99.9999%, hopefully this will give support for the business case for testing up on your emails.

There are three tricks to cross-platform testing:

  1. There is a whole science of attribution testing. If you have the ability to look at this literature and your data systems will support this, go for it.  However, most organizations in my experience don’t have all of their data in the same place initially, making this exceedingly hard.  Thus, this sort of testing up/down for cadence should look at sources of revenue by audience test panel rather than through what medium the donation is made.  You may be surprised how much adding a mail piece increases your online revenue or adding a telemarketing cycle boosts the mail piece.
  2. Unlike with strictly piece-based attributes, I’d argue you have to test every cell here because there are interactions among the means of communication. It may be that mail + mail is better than mail and mail + phone is better than mail, but that when you have mail + phone + mail, you have diminishing returns that don’t compensate for doing both mail pieces.
  3. You will have to be vigilant about the creation of your testing cells. As much as you would like to call everyone who has a phone number or email everyone who has an email address, and use those who don’t have a phone number or email on file as a control audience, those are different types of donors.  Pew has a great summary of the non-Internet users of the US at right.  Even if you looked just at the age and income variables, you can see how this would make your control audience look very different from your non-control.  In reverse, 66% of 25-29 year olds live in houses where there is no landline, compared with 14% of 65+ year olds, according to the National Center for Health Statistics.

    So, if you think of the average person for whom you have a phone number, but not an email address, that person looks very different from the one where you have an email address, but not a phone number.  Thus, you have to either control for all demographic variables in your assessment (hard) or split test people by means of communication that you have available (marginally easier).

Thanks for reading and be sure to let me know at nick@directtodonor.com what future topics you’d like to see.

Testing for smaller lists

One of my favorite non-Far Side single panel cartoons is

[Cartoon image: miracle]

This is often what it feels like to be a small nonprofit or small division of a nonprofit.  You know exactly what you would do if you were big.  But you aren’t (yet).  And absent that miracle in the middle, you aren’t going to be there soon.  It feels like a Catch-22 – you aren’t big enough to test, but you aren’t going to be big enough to test unless you test.

A lot of people have this problem.  One of my favorite conversion sites, unbounce.com, recommends that you have 1000 conversions per month to do A/B testing.  That takes a large nonprofit to accomplish.  Like the Oakland A’s in Moneyball (both book and movie are recommended), you have fewer resources, so you are going to have to be smarter than your competition, er, other worthy causes.  Here are some tips on how:

Learn what’s important first: Before you do your first test with online traffic, look at your analytics reports (do you have Google Analytics on your site?).  Where are people bouncing from your site?  Where are they dropping out of the donation process?  What forms aren’t converting?  You may be able to do more with one-tenth the traffic or donor list if you are testing the things that will matter to you.

Steal from other people first: There are some things that are almost immutably true.  Requiring more information on a form means lower conversion rates.  Having a unique color for your donate button that stands out from the other colors on your Web site will increase clicks.  Using a person’s name, unless it’s in a subject line, will likely increase response rate.  I commend the site whichtestwon.com to you.  I’ve had the privilege of presenting at their live events and the type of information that comes out of them in terms of what others have tested first will save you time and money on things you can do, rather than test.

Go big: I’ve talked about things like envelopes and teasers and things to test.  If you don’t have a large donor or traffic base, ignore that.  You want to be testing audience and offer – the things that can be global and game changing.

Test across time: If you are testing an audience, an offer, or a theme, that doesn’t have to be accomplished in one piece or email.  Rather, you can test it over a year if you want.  Let’s say you want 25,000 people in each testing group, but only have 3,000.  You can get a similar feel for the response to large-scale changes over nine pieces, rather than testing it all in one.
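The nine-piece figure falls straight out of the division, as this short sketch shows.  Keep in mind that nine contacts to the same 3,000 donors are not the same thing statistically as one contact to 27,000 different people, which is why this gives you a similar feel rather than an equivalent test.

```python
import math

target_per_group = 25_000    # sample size you would like in each test group
available_per_group = 3_000  # donors you can actually put in each group

# Mailings needed for cumulative contacts per group to reach the target.
mailings_needed = math.ceil(target_per_group / available_per_group)
cumulative_contacts = mailings_needed * available_per_group

print(mailings_needed, cumulative_contacts)
```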

Require less proof: Chances are you are used to doing more with less already.  If you are Microsoft, you can run your test until you get 99.9% certain you are correct.  You should be willing to be less certain.  Some nonprofits choose 80% certainty as their threshold.  Even 60% can give you directional results.  Bottom line, this is a restriction you may be willing to relax.
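If you want to see where a small test lands against those thresholds, here is one common way to estimate it: a pooled two-proportion z-test using the normal approximation.  The function name and the 3,000-piece results are hypothetical, and this is only one of several reasonable ways to put a confidence number on a test.

```python
import math


def confidence_b_beats_a(mailed_a, gifts_a, mailed_b, gifts_b):
    """One-sided confidence that B's true response rate exceeds A's,
    using a pooled two-proportion z-test (normal approximation)."""
    rate_a = gifts_a / mailed_a
    rate_b = gifts_b / mailed_b
    pooled = (gifts_a + gifts_b) / (mailed_a + mailed_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / mailed_a + 1 / mailed_b))
    z = (rate_b - rate_a) / std_err
    # Standard normal cumulative distribution via the error function.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Hypothetical small-list test: 3,000 pieces per panel.
confidence = confidence_b_beats_a(3_000, 60, 3_000, 75)
print(f"Confidence that the test beats the control: {confidence:.0%}")

for threshold in (0.60, 0.80, 0.999):
    print(f"Clears {threshold:.1%} threshold: {confidence >= threshold}")
```

A result like this would never satisfy a 99.9% standard, but it clears the looser bars a smaller nonprofit might reasonably set.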

Test cheaply:  Testing direct mail and telemarketing is expensive.  You want to do your learnings on your site with Google Analytics and either Google’s optimization tool or Optimizely, in email, or on social media.  I would go so far as to say that even larger nonprofits don’t want to test an envelope teaser that they haven’t already tested as a subject line to see if it grabs attention.  Survey tools like SurveyMonkey or Zoomerang can also help you pre-test your messaging either with your core audience (free) or with a panel of people who fit your demographic target (cheap, if you can keep your number of questions down).

Get testing subjects cheaply: I know it sounds like I’m in Google’s pocket, but they have many nonprofit solutions at the right price for smaller nonprofits – free.  One of these is Google Grants, which allows you to use their AdWords solution with in-kind donated advertising.  Get this now, if you don’t have it.  We’ll do a whole week on AdWords at some point, but in the meantime, if you have a form you are testing and you don’t have enough traffic, pause all of your campaigns except the ones directed to that form.  You will get your results a lot more quickly.

Test by year: It’s not an ideal solution, but if you test one thing one year and then another tactic the next year at the same time, you can get a gut feeling as to what is more effective.

Avoid word salad: Consider the time on West Wing (which I remember better than many real-life presidencies) when the Majority Leader who was running for president was asked why he wanted to be president:

 

“The reason I would run, were I to run, is I have a great belief in this country as a country and in this people as a people that go into making this country a nation with the greatest natural resources and population of people, educated people … with the greatest technology of any people of any country in the world, along with the greatest, not the greatest, but very serious problems confronting our people, and I want to be President in order to focus on these problems in a way that uses the energy of our people to move us forward, basically.”

Good writing converts.  Good writing mandates active verbs and few adverbs (my personal crutch).

 

“It’s an adverb, Sam. It’s a lazy tool of a weak mind.”
— Kevin Spacey in Outbreak

Good writing ignores the mission statement, discards stats, eschews your jargon, and touches you in a very personal place.  OK, perhaps not that active a verb.  I’m talking about your heart, you sicko.

Don’t test good copy versus bad copy.  Come up with your best before you test, lest you learn what you already should know.

Conspire.  You have coalition partners and people who are in similar positions around you.  Get out into the big blue room and see what they are doing.  And be generous with your own tests – deposits in the karma bank rarely fail to pay interest.

Finally, embrace the advantage of being small.  As a smaller nonprofit, you are going to have to be smarter about testing than bigger ones.  But you will be able to swing for the fences while they are still trying to get their different versions of teaser copy through the Official Teaser Copy Review Subcommittee.  You can be bold and find your voice honed to what works, rather than what your boss’s boss’s boss’s brother-in-law said you should try out over Thanksgiving dinner.

Tomorrow, we’ll go into some testing modalities that allow you to test things beyond a single communication or theme.

Testing for smaller lists

How to test more than one variable at a time

Yesterday, I articulated why you would want to test more than one variable at a time.  The trick is testing multiple things and still getting results you can act on.  Here’s how to do this without taxing your math skills and without exceeding the capacity of your list.  We’ll focus on mail testing because online testing is far easier, given instant results, lower testing costs, the ability to see what specific calls to action (i.e., links) are getting the most action, and the ability to sequence tests (e.g., testing a subject line earlier in the day, then rolling out with a winning subject line and two test copy versions in the afternoon).

Let’s look at a simple testing matrix, where you would want to test two variables (let’s say envelope and copy) with two options each.  It’s easy enough to do the multiplication and say that, for a “test one variable at a time” approach, you would want four test cells; let’s call them A1, A2, B1, and B2.

              Copy 1   Copy 2
Envelope A    A1       A2
Envelope B    B1       B2

By testing every option, you get the best testing results.  Likewise, you can see intuitively that no two testing cells will give you full results: if you tested only A1 and B2, for example, and B2 did better than A1, you wouldn’t know whether it was envelope B or copy 2 (or both) that caused the lift.  And any other pair of cells would not test both variables.

But three testing cells will get you the data you need.  Let’s say you did not test cell B2 and get the following results:

              Copy 1   Copy 2
Envelope A    $.50     $.75
Envelope B    $1.00    ?

Here, you can see envelope B did 100% better on a gross per piece basis than envelope A when it had copy 1 ($1.00 versus $.50).  You can also see that copy 2 did 50% better than copy 1 when in envelope A ($.75 versus $.50).  From this, you can deduce that B2, which wasn’t tested, would have the best results at about $1.50 gross per piece.  This is because that is both 50% better than B1 (the result of the copy lift) and 100% better than A2 (the result of the envelope lift).

And if you ever do have results this clean, please let me know about them.
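For the spreadsheet-averse, here is a minimal Python sketch of that deduction.  The function name `deduce_fourth_cell` is my own invention for illustration, and the whole thing assumes the envelope lift and the copy lift multiply independently of each other, which real mail results only approximate.

```python
def deduce_fourth_cell(a1: float, a2: float, b1: float) -> float:
    """Estimate the untested cell B2 from tested cells A1, A2, and B1.

    Assumption (not guaranteed in real mailings): the envelope lift and
    the copy lift are independent and multiply together.
    """
    if a1 <= 0:
        raise ValueError("A1 must be positive to compute lifts")
    envelope_lift = b1 / a1  # envelope B vs. envelope A, both with copy 1
    copy_lift = a2 / a1      # copy 2 vs. copy 1, both in envelope A
    return a1 * envelope_lift * copy_lift


# Gross per piece from the three tested cells in the table above.
estimate = deduce_fourth_cell(a1=0.50, a2=0.75, b1=1.00)
print(f"Estimated B2 gross per piece: ${estimate:.2f}")
```

If the two variables interact (a teaser that only makes sense with one letter, say), this estimate will be off, which is a good reason to confirm the projected winner in your next mailing.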

This is just a 2×2 matrix, but it illustrates a point that will apply to greater multivariate testing – any time you have significant results in three out of four of a 2×2 matrix, you can deduce the fourth.  I call this the testing L because of the shape it makes on your testing matrix.  All you need to do is to iterate a traditional A/B test and you can get a robust testing strategy.

Let’s say you want to test three envelopes and three sets of copy.  Here you would only need five intersections instead of the nine you might expect:

              Copy 1   Copy 2   Copy 3
Envelope A    X        X
Envelope B             X        X
Envelope C                      X

You can see how the L technique would help you determine the remaining blank cells one at a time.
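Here is one way to sketch that cell-by-cell fill in Python.  The helper `fill_with_l` is hypothetical, the dollar figures are made up purely for illustration, and it leans on the same assumption as the L itself: that envelope and copy lifts multiply independently.

```python
from itertools import permutations


def fill_with_l(tested, envelopes, copies):
    """Fill untested cells by repeatedly applying the L deduction.

    For any 2x2 block with three known cells, the fourth is estimated as
    (cell beside it) * (cell above or below it) / (cell diagonal to it).
    Assumes all known values are positive and lifts are multiplicative.
    """
    cells = dict(tested)
    changed = True
    while changed:
        changed = False
        for e1, e2 in permutations(envelopes, 2):
            for c1, c2 in permutations(copies, 2):
                target = (e2, c2)
                diagonal, same_row, same_col = (e1, c1), (e2, c1), (e1, c2)
                if target in cells:
                    continue
                if diagonal in cells and same_row in cells and same_col in cells:
                    cells[target] = cells[same_row] * cells[same_col] / cells[diagonal]
                    changed = True
    return cells


# Made-up gross-per-piece results for the five tested intersections.
tested = {
    ("A", 1): 0.50, ("A", 2): 0.75,
    ("B", 2): 1.50, ("B", 3): 1.20,
    ("C", 3): 0.60,
}
filled = fill_with_l(tested, envelopes=["A", "B", "C"], copies=[1, 2, 3])
for (envelope, copy), value in sorted(filled.items()):
    print(f"{envelope}{copy}: ${value:.2f}")
```

Each pass fills any blank cell that completes an L, and later passes build on cells deduced earlier, so the estimates for the far corners stack up more assumptions than the ones next to tested cells.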

What if you wanted to add another dimension?  In addition to your three envelopes and three sets of copy, you’d like to also test three ask strings, for example.  Here, you’d only need to test each of the two new ask strings as a replacement for the existing ask string.  So to the above testing matrix, where you are testing:

A1$ (where $ is the existing ask string)
A2$
B2$
B3$
C3$

Adding, say, A2+ and B3% (where + and % are the two new ask strings) would give you an L over this new dimension (remembering that you were able to solve for the entire testing matrix in two dimensions).  If you were to try to test all 27 possible combinations with a quantity of 25,000 each, that would be a prohibitive 675,000 mailings.  Testing only these seven cells brings that down to a much more manageable 175,000 pieces.

This assumes, of course, that you want 25,000 people in each test panel.  There are some more advanced statistical techniques that would allow you to mail a larger number of intersecting tests.  These make it so that each cell is no longer independently projectable, but can still give you aggregate results.  I personally like having some experience at the cellular level, especially if there is a strong control already in place (in which case I will weight the testing cells toward the control package’s attributes), but this is a possibility.
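If you want to check the mail-quantity arithmetic for your own matrix, a short Python sketch follows.  The cell labels and the 25,000 panel size come from the example above; the variable names are just mine.

```python
from itertools import product

PANEL_SIZE = 25_000  # people per test panel, as in the example above

envelopes = ["A", "B", "C"]
copies = ["1", "2", "3"]
asks = ["$", "+", "%"]  # "$" is the existing ask string

# Every possible envelope/copy/ask combination (the full matrix).
full_matrix = ["".join(combo) for combo in product(envelopes, copies, asks)]

# The cells actually mailed under the L approach.
l_cells = ["A1$", "A2$", "B2$", "B3$", "C3$", "A2+", "B3%"]
assert set(l_cells) <= set(full_matrix)  # sanity check on the labels

full_quantity = len(full_matrix) * PANEL_SIZE
l_quantity = len(l_cells) * PANEL_SIZE

print(f"Full matrix: {len(full_matrix)} cells, {full_quantity:,} pieces")
print(f"L approach:  {len(l_cells)} cells, {l_quantity:,} pieces")
```

Swap in your own panel size and option lists to see how quickly the full matrix outgrows your list.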

But even 25,000 is too large for some.  In fact, one of the more frequent questions I hear is how to do testing when you have a small list.  That’s the topic for tomorrow; then, Friday, I’ll talk about cross-platform and cadence testing – testing that goes beyond the “here’s a communication; here’s another communication” type testing.

“Test one variable at a time” is a lie

It’s not an intentional lie and its heart is in the right place, but it’s wrong nonetheless.

The reason people will tell you to test only one variable at a time is that you want to be able to isolate why what happened happened.  So, for example, if you changed the teaser on an envelope and sent it to an equivalent audience at the same time with the same contents in the envelope, any increase in response rate can be credited to the envelope.

This is a fine way to test if there’s only one thing you want to learn at a time.  You can refine your program this way, getting better and better.  This is the direct marketing equivalent of kaizen – the practice of continual improvement popularized in manufacturing, but now applied to much strategic thinking.

But there are some significant problems with this:

  • You can’t test synergy between variables. Let’s say you have a subject line you’d like to test.  However, it may work better with a different version of your email; after all, you wrote the original subject line for this email – the new one may not fit as well.  Testing one thing at a time may not allow you to test the most coherent versions of each of your offers.
  • It can lead to small ball, where you only test things at the most granular level. In his book Fundraising When Money is Tight, Mal Warwick talks about testing teaser copy 25 different times with almost as many clients.  Of the tests, 21 (84%) showed no difference (and these were at quantities that would have shown a difference had there been one).  This is an OK learning if you can learn other things from the package as well, but if that’s all you learn, you’ve invested in testing without any return more than four out of five times.
  • It can’t make significant leaps forward. Let’s say you have a control piece in decline.  You know it needs to be replaced because of its response rate.  Or maybe, in a more positive outlook, you’ve accomplished the goal you were striving toward.  Either way, the way to get rid of this piece isn’t to test the envelope one year and the response device the next year – you have to test more than one variable at once.

All in all, this violates a rule you should have for yourself – to learn as much as possible whenever possible.  Think of it as if you were trying to reach the highest elevation on Earth.  If you had the rule of “go up from where you are until you can’t go up any more,” you will reach a peak higher than you are currently, but by no means the highest point possible.  Similarly, if you had the rule “climb to the highest point you can see, even if it means going down a bit,” you will be doing better and getting higher than you were, but this iterative process will not lead to you having to don an oxygen tank anytime soon.

So it is with testing.  Testing one variable at a time will get you closer and closer to your local maximum, but not the global maximum.
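To make the local-versus-global point concrete, here is a toy Python sketch.  The response rates are invented for illustration only (they come from no real mailing), and the function name is hypothetical; the setup is a case where the new envelope and new letter only pay off together.

```python
# Invented response rates: each change alone hurts, both together win.
rates = {
    ("old envelope", "old letter"): 0.020,  # control
    ("new envelope", "old letter"): 0.019,
    ("old envelope", "new letter"): 0.018,
    ("new envelope", "new letter"): 0.030,
}

FLIP_ENVELOPE = {"old envelope": "new envelope", "new envelope": "old envelope"}
FLIP_LETTER = {"old letter": "new letter", "new letter": "old letter"}


def one_variable_at_a_time(start, rates):
    """Change one element per test and keep it only if it beats the control."""
    current = start
    while True:
        envelope, letter = current
        single_changes = [
            (FLIP_ENVELOPE[envelope], letter),
            (envelope, FLIP_LETTER[letter]),
        ]
        best = max(single_changes, key=rates.get)
        if rates[best] <= rates[current]:
            return current  # no single change wins, so the control stays
        current = best


local_best = one_variable_at_a_time(("old envelope", "old letter"), rates)
global_best = max(rates, key=rates.get)
print("One variable at a time ends at:", local_best)
print("Best package overall:          ", global_best)
```

With these made-up numbers, the one-at-a-time tester never leaves the control, while the package that changes both elements would have won handily.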

But the basis of the argument for variable isolation is not untrue.  You still need to be able to figure out what works and what doesn’t.  The trick is sussing out what did what in your test.  That’s what we’ll cover tomorrow: how to layer multiple single tests to get results you can act on.

The basics of direct marketing testing

It’s testing week here at Direct to Donor and we’re going to start with some simple principles, as is our Monday pattern.  This is the first of many testing weeks, given the importance of the topic.

Unless it doesn’t work, in which case we might never speak of this again.

This is a great segue to the first rule of testing: that which works, works.  That which doesn’t work, doesn’t work.

I know, it sounds like I’m stating the obvious, but there’s an oft-forgotten conclusion from that, which is that if you aren’t willing either to roll out with it or to scrap it, you shouldn’t test it.

Two different much-beloved CEOs have expressed to me over the years that the single best predictor of whether a mail piece would succeed or not is whether they liked the piece.  If they liked it, it wouldn’t work; if they didn’t, it would.

Part of why they were and are much beloved is that it doesn’t matter whether they liked it or not – it was all about whether a tactic worked successfully.  The mantra that goes with this is:

Or, as I would put it, it doesn’t matter if the source of the quote is a damn dirty Commie if it’s a good quote.

That said, there are some things that are beyond what your organization will accept.  For example, an environmental organization shouldn’t use non-recycled paper from old-growth forests in its mailings.

If that’s the case, then, don’t test it.  A goal of testing is to find something that you will be able to use in some larger capacity in the future.  If that isn’t possible, it eliminates the need for the test.

That said, the list of sacred cows should be as small as possible.  You’ll hear me say test everything and I mean it – other than things that are untenable, experimentation is the best and truest way of learning.  Donor surveys are great, but they show what people think they would do and how they would react versus what they do do; it’s important not to confuse words and deeds.

There are things that are far more important to test than others.  That’s why it’s important to start with a hypothesis and to test the fundamentals first.  I love tests that nibble around the edges as much as the next person – if you can show me how to improve a piece with a different teaser and pick up 1.4% additional in response rate, I’m game.  But the most important tests that you run will be about fundamentals – who are the people that I should be talking to and what offer am I giving them.  Having a hypothesis will help with this, and the broader it is, the better.  A hypothesis like “I believe our lapsed donors will respond best to the means and message that brought them into the organization initially” works well because it lends itself to testing on various platforms with a variety of tactics and a success could mean large things for the organization.  One like “I believe a larger envelope will work better” is restricted to one piece at one time and thus has limited ripple effect.

Finally, please learn from my mistakes and don’t test something by rolling out with it.  I did this in my first year of nonprofit marketing.  We had three underperforming mail pieces and I decided to replace them with new packages I had dreamt up.  Thankfully, I was lucky – the success of one of my pieces paid for the abject failure of the other two.  If I hadn’t been lucky, I might be blogging on effective panhandling tips right now.  You don’t want to put your nonprofit in a position where hitting goals and achieving mission is based on your hunch.

This may be a bit of conventional wisdom.  However, tomorrow, there is one piece of conventional testing wisdom that needs to be taken out back and shot for the benefit of your testing program.
