JISCMail - PHD-DESIGN Archives

Subject:

Data mining and computer models

From:

Ken Friedman

Date:

Sun, 11 Nov 2012 22:50:09 +0000

Dear Terry,

Again, I’ve been thinking further about what happens in data mining and computer models for political campaigns. If you read Michael Scherer’s (2012) article again, you’ll see that the article also describes this as data mining. While analysis and data mining move the campaign beyond intuition as a stand-alone factor in human judgment, the metrics support skilled human judgment. These four excerpts describe processes very much like those I used in the magazine industry in the late 70s and early 80s:

—snip—

In late spring, the backroom number crunchers who powered Barack Obama’s campaign to victory noticed that George Clooney had an almost gravitational tug on West Coast females ages 40 to 49. The women were far and away the single demographic group most likely to hand over cash, for a chance to dine in Hollywood with Clooney — and Obama.

So as they did with all the other data collected, stored and analyzed in the two-year drive for re-election, Obama’s top campaign aides decided to put this insight to use. They sought out an East Coast celebrity who had similar appeal among the same demographic, aiming to replicate the millions of dollars produced by the Clooney contest. “We were blessed with an overflowing menu of options, but we chose Sarah Jessica Parker,” explains a senior campaign adviser. And so the next Dinner with Barack contest was born: a chance to eat at Parker’s West Village brownstone.

For the general public, there was no way to know that the idea for the Parker contest had come from a data-mining discovery about some supporters: affection for contests, small dinners and celebrity. But from the beginning, campaign manager Jim Messina had promised a totally different, metric-driven kind of campaign in which politics was the goal but political instincts might not be the means. “We are going to measure every single thing in this campaign,” he said after taking the job. He hired an analytics department five times as large as that of the 2008 operation, with an official “chief scientist” for the Chicago headquarters named Rayid Ghani, who in a previous life crunched huge data sets to, among other things, maximize the efficiency of supermarket sales promotions.

—snip—

The new megafile didn’t just tell the campaign how to find voters and get their attention; it also allowed the number crunchers to run tests predicting which types of people would be persuaded by certain kinds of appeals. Call lists in field offices, for instance, didn’t just list names and numbers; they also ranked names in order of their persuadability, with the campaign’s most important priorities first. About 75% of the determining factors were basics like age, sex, race, neighborhood and voting record. Consumer data about voters helped round out the picture. “We could [predict] people who were going to give online. We could model people who were going to give through mail. We could model volunteers,” said one of the senior advisers about the predictive profiles built by the data. “In the end, modeling became something way bigger for us in ’12 than in ’08 because it made our time more efficient.”

Early on, for example, the campaign discovered that people who had unsubscribed from the 2008 campaign e-mail lists were top targets, among the easiest to pull back into the fold with some personal attention. The strategists fashioned tests for specific demographic groups, trying out message scripts that they could then apply. They tested how much better a call from a local volunteer would do than a call from a volunteer from a non–swing state like California. As Messina had promised, assumptions were rarely left in place without numbers to back them up.

—snip—

A large portion of the cash raised online came through an intricate, metric-driven e-mail campaign in which dozens of fundraising appeals went out each day. Here again, data collection and analysis were paramount. Many of the e-mails sent to supporters were just tests, with different subject lines, senders and messages. Inside the campaign, there were office pools on which combination would raise the most money, and often the pools got it wrong. Michelle Obama’s e-mails performed best in the spring, and at times, campaign boss Messina performed better than Vice President Joe Biden. In many cases, the top performers raised 10 times as much money for the campaign as the underperformers.

Chicago discovered that people who signed up for the campaign’s Quick Donate program, which allowed repeat giving online or via text message without having to re-enter credit-card information, gave about four times as much as other donors. So the program was expanded and incentivized. By the end of October, Quick Donate had become a big part of the campaign’s messaging to supporters, and first-time donors were offered a free bumper sticker to sign up.

—snip—

The magic tricks that opened wallets were then repurposed to turn out votes. The analytics team used four streams of polling data to build a detailed picture of voters in key states. In the past month, said one official, the analytics team had polling data from about 29,000 people in Ohio alone – a whopping sample that composed nearly half of 1% of all voters there – allowing for deep dives into exactly where each demographic and regional group was trending at any given moment. This was a huge advantage: when polls started to slip after the first debate, they could check to see which voters were changing sides and which were not.

It was this database that helped steady campaign aides in October’s choppy waters, assuring them that most of the Ohioans in motion were not Obama backers but likely Romney supporters whom Romney had lost because of his September blunders. “We were much calmer than others,” said one of the officials. The polling and voter-contact data were processed and reprocessed nightly to account for every imaginable scenario. “We ran the election 66,000 times every night,” said a senior official, describing the computer simulations the campaign ran to figure out Obama’s odds of winning each swing state. “And every morning we got the spit-out — here are your chances of winning these states. And that is how we allocated resources.”

—snip—

The questions are quite similar. Which gifts will attract subscribers to a specific kind of magazine? Which publisher letter draws more subscribers in a specific group: the one-page letter, the two-page letter, or the four-page letter? Many factors remain the same, as well. It is easier to get renewals from lapsed subscribers than to get new subscribers. There was even an art to knowing how to use a list and when to stop using it – the reason that some magazines send us half a dozen varied repeat offers before stopping is that we are on a list that is still drawing a sufficiently large number of new subscribers every time the magazine uses it.

But the machine doesn’t create the models or tell the campaign managers what to do. The campaign experts create and test different models and then decide what to do based on the numbers that emerge from the test.
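The letter question above can be made concrete with a minimal sketch in Python. Everything in it is an assumption for illustration: the response counts are hypothetical, the function names are invented here, and a two-proportion z-test is only one common way to compare response rates, not necessarily what any magazine or campaign used. The point is the division of labor: the code reports the numbers, and a person still decides what to do with them.

```python
import math

# Hypothetical test-mailing results: (responses, pieces mailed).
# These figures are invented for illustration only.
RESULTS = {
    "one-page letter": (150, 10_000),
    "two-page letter": (180, 10_000),
    "four-page letter": (240, 10_000),
}


def response_rate(responses, mailed):
    """Share of mailed pieces that produced a subscription."""
    return responses / mailed


def two_proportion_z(resp_a, n_a, resp_b, n_b):
    """z statistic for the difference between two response rates."""
    p_a = resp_a / n_a
    p_b = resp_b / n_b
    # Pooled rate under the assumption that both letters perform the same.
    pooled = (resp_a + resp_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / std_err


def rank_appeals(results):
    """Appeal names ordered from highest to lowest response rate."""
    return sorted(
        results,
        key=lambda name: response_rate(*results[name]),
        reverse=True,
    )


if __name__ == "__main__":
    ranking = rank_appeals(RESULTS)
    best, worst = ranking[0], ranking[-1]
    z = two_proportion_z(*RESULTS[best], *RESULTS[worst])
    for name in ranking:
        print(f"{name}: {response_rate(*RESULTS[name]):.2%}")
    # A |z| above roughly 1.96 is the conventional 5% significance threshold.
    print(f"{best} vs {worst}: z = {z:.2f}")
```

A large z value only says the difference is unlikely to be chance. Whether a four-page letter is worth its extra printing and postage cost remains a judgment call.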

There are three key differences between this kind of political modeling and the magazine industry.

First, no magazine can afford this kind of depth and accuracy – the budgets are vastly different. You can get far better models when you test different approaches 66,000 times a night over nearly a year of campaigning than you can get by testing one or two subscriber appeals before deciding to launch or scrub a new magazine – or even by testing eight to ten new appeals a year for a successful magazine.

Second, the magazine industry only seeks subscribers. This means there are far fewer kinds of appeals and parameters to test. The factors affecting the relationship between a magazine and its subscribers are different from the more complex series of factors between a political leader and his or her constituencies – especially a national leader whose appeal and whose resources vary by state, by congressional district, by demographic group, and by other overlapping factors. This variety is what makes data mining and the “big data” approach so useful; the value lies there rather than in modeling complex feedback loops. The numbers still power experienced human judgment – and human relations are still the key factor.

Only a president with the support of appealing stars such as George Clooney, Sarah Jessica Parker, or Bruce Springsteen could engage and deploy them in the campaign. President Obama’s ability to deploy Bill Clinton as a campaign surrogate made a crucial difference – Clinton remains one of the most popular former presidents in history, and a political speaker whose skills compare to those of Roosevelt and Reagan. President Obama’s ability to delegate the highly popular and widely respected Michelle Obama was a massive factor in his success.

The late Tip O’Neill, former Speaker of the House of Representatives, used to say, “All politics is local.” The difference is that a local alderman knows every constituent. Half a century ago, an extremely skilled member of the House of Representatives might still master the “local” personalities and politics of his or her district. Today, that’s impossible, and it is entirely beyond the grasp of a senatorial or presidential candidate. Data mining enables a candidate with multiple constituencies and a skilled team to make the campaign local.

Computers crunch numbers. Computers can’t grasp or model local skills and they lack political savvy. Members of the campaign team need the skill and expertise to understand what it is they must model. They must know how to interpret the results. And they must apply what the models reveal to the live world of human interaction.

Third, of course, is another key difference between marketing magazines and political campaigns. It is a relatively little-known fact that magazines pretty much use up all subscription revenue in getting subscribers, then printing and delivering the physical magazine. The real profit in the magazine industry – and the factor that supports editorial and design work – is advertising. In this respect, publishers don’t sell magazines to subscribers. They sell subscribers to advertisers by selling the attention of key demographic groups. In a sense, the key personal relationship for a magazine is the relationship between the magazine’s advertising staff and its ad buyers. The roles of Dan (Dennis Quaid) and Carter (Topher Grace) – and their relations with potential advertisers – in the movie In Good Company give you a sense of these issues.

In contrast, politics is a network of complex, multidimensional human relations. Data mining allows us to model and predict those aspects of the relations that resemble a very complex magazine subscription. We can model ways to influence voter preferences, ways to raise funds, ways to get the right voters to the polls. But we don’t model them all at once. And at the end of the day, computers are not modeling complex, interactive socio-technical systems or “telling” the campaigners what to do. The computers mine data to model strictly defined likely behaviors with respect to other strictly defined sets of demographic criteria.

What is new and revolutionary is the scale on which we can mine data. Using big data turns the United States presidential campaign into 50 local campaigns by state, 436 local campaigns by congressional district, and over 3,000 campaigns by county. Beyond this, it helps politicians to organize several hundred “local” campaigns by specific demographic groups – married lesbian voters with children, white men with a PhD degree, black women with a PhD degree, union workers, union leaders, working scientists, Hispanic surname political leaders, movie stars inclined to support Democratic candidates, and so on.

I recognize and agree with the importance of computerized data mining in the recent campaign. Where we disagree is on whether the computers model complex looped socio-technical systems. In my view, they do not. They model different runs of carefully defined kinds of human behavior in carefully defined social groups, using the power of the computer to test the models across larger ranges and combinations of groups and demographics than has been possible in the past. From this, of course, both predictable and surprising campaign opportunities emerge, suggested by the outcomes of different runs. Analyzing, understanding, and applying the results of those models remains an art of political skills and judgment.
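As a rough illustration of what “different runs” might mean, here is a minimal Monte Carlo sketch in Python. It is not the campaign's method, which the article does not describe in detail. The state names, electoral-vote counts, win probabilities, and the SAFE_VOTES figure are all hypothetical, and treating states as independent draws is a strong simplifying assumption. Only the 270-vote threshold and the 66,000 run count come from outside the sketch.

```python
import random

# Hypothetical swing states: name -> (electoral votes, assumed win probability).
# In practice the probabilities would come from polling and voter-contact data.
STATES = {
    "State A": (18, 0.55),
    "State B": (29, 0.50),
    "State C": (13, 0.60),
    "State D": (9, 0.45),
}
SAFE_VOTES = 237    # hypothetical electoral votes assumed already secure
VOTES_TO_WIN = 270  # majority of the 538 electoral votes


def simulate_once(rng, states):
    """One simulated election: names of the swing states won in this run."""
    # Each state is drawn independently, which is a simplification.
    return [name for name, (_, p_win) in states.items() if rng.random() < p_win]


def run_simulations(n_runs, states=STATES, seed=0):
    """Repeat the simulated election and summarize how often each outcome occurs."""
    rng = random.Random(seed)  # fixed seed so repeated calls give the same result
    state_wins = {name: 0 for name in states}
    overall_wins = 0
    for _ in range(n_runs):
        won = simulate_once(rng, states)
        total = SAFE_VOTES + sum(states[name][0] for name in won)
        for name in won:
            state_wins[name] += 1
        if total >= VOTES_TO_WIN:
            overall_wins += 1
    return {
        "win_probability": overall_wins / n_runs,
        "state_win_share": {name: n / n_runs for name, n in state_wins.items()},
    }


if __name__ == "__main__":
    summary = run_simulations(66_000)
    print(f"Overall chance of winning: {summary['win_probability']:.1%}")
    for name, share in summary["state_win_share"].items():
        print(f"{name}: won in {share:.1%} of runs")
```

Each run answers a narrowly defined question under fixed assumptions. Deciding which assumptions to feed in, and where to send money and volunteers once the morning numbers arrive, is the part that stays with the campaign team.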

Yours,

Ken

References

Scherer, Michael. 2012. “Inside the Secret World of the Data Crunchers Who Helped Obama Win.” Time Magazine, November 7, 2012. URL:
http://swampland.time.com/2012/11/07/inside-the-secret-world-of-quants-and-data-crunchers-who-helped-obama-win/
