Political Data is Everywhere — But What Does It All Mean?

July 17, 2015 Christian Madsbjerg

By Farai Chideya

You may not have tuned seriously into the 2016 race, but it has already tuned into you. Imagine that a little buzzer goes off every time you generate a piece of data. You’re in the checkout line at the drugstore and hear a variety of chimes based on the products you purchased, which were tracked through your loyalty card, with the information then resold without your explicit knowledge. Perhaps you changed your Facebook profile to a rainbow — big buzz — or joined a public group of concerned parents. Or maybe you’ve had what data giant Experian calls a “life-event trigger,” so a bell sounds because you’ve filed a change-of-address notice for a move to a new house. Of course, that buzzer would continue to go off while you were asleep, as your friends pile up “likes” and “favorites” across social media, some of which can then be cross-referenced back to you. Then imagine that the political campaigns were collecting all those small buzzes. Because they are.

As the rock-em, sock-em of 2016 begins, campaigns are also lining up their data analysts. In 2012 the two major parties spent about $13 million on data mining (while the individual campaigns spent untold millions more), with larger outlays predicted this time around. And while we can’t opt out of being a part of political data any more than most people can opt out of having a credit score, we can be aware of how we’re being measured; how our social media behavior and contacts are being used; and how to pay attention to our own interests in a world rife with political messaging. “The more people understand how people are being marketed to through their stuff is important,” says Frank Pasquale, a University of Maryland law professor and the author of The Black Box Society: The Secret Algorithms that Control Money and Information. “There is a problem of context collapse,” he says. “Someone may be a shopper at Whole Foods [and consent to having the store track their data], but doesn’t want to be pegged as ‘an organic shopper’ by political organizations.” We should also question whether all this data collection works — or benefits the democratic process.

Particularly starting in 2008, the campaigns honed the use of microtargeting — reaching out to individual voters with specific appeals based on their demographics. The evolution continues. In the 2016 cycle, according to Tom Bonier of TargetSmart and Ken Strasma of HayStaqDNA, both firms that work with Democratic clients, the changes in data mining will be more incremental. A lot of the work today is being done on social media integration and ad buying in an age of fragmented audiences. Strasma, who served as the targeting director for the 2008 Obama campaign, expects the biggest changes to involve increased speed — updating messaging in real time as volunteers move door to door through a neighborhood, say — and sophistication in dealing with what he calls the “tsunami of data” from social media. “We’re teaching computers to perceive sarcasm,” he explains, in order to gauge opinions posted online. (Right now, for instance, at least as many liberals as conservatives may be posting “I just love Donald Trump.”)

For his part, Bonier sees “actual person-to-person digital targeting” coming online later this summer. “The voter will be matched to Facebook, Yahoo, Microsoft” — linking your real-world identity to your online logins. “It’s not to say this replaces broadcast television. It complements it. But I can’t buy a network news slot and just serve that to you, I have to serve everyone in your media market.” (There is one exception in the television world: D2, created by DirecTV and the Dish Network, has a modest set of subscribers but can directly target ads by customer.)

Put simply, the world of political data collection breaks down along two axes: partisanship and aggregation. One is about who your clients are; the other is about where your data goes even after a specific campaign ends. Some firms serve only Democrats or Republicans, or progressive or conservative advocacy groups. A small number of firms — ones led by the RNC and the Koch Brothers-funded i360 on the GOP side, and Catalist and NGP VAN on the Democratic side — act as power record-keepers, compiling voter files with information aggregated from many sources, and then selling or licensing that information to other targeted or smaller firms, which may in return share their data with the powerhouse firms. Other political consultants do not share data between campaigns or clients at all.

Despite sometimes breathless press coverage of the industry, it’s hard to prove exactly when data has been responsible for victories. For example, although the Obama campaign in 2012 was credited for the success of its well-organized big-data team, the president had a lead over Mitt Romney for almost the entire race. Even if data had an impact, which it almost certainly did, that does not mean it was responsible for the win. David Nickerson, a political science professor at the University of Notre Dame who was the “director of experiments” for Obama’s 2012 analytics department, emailed to say, “On whether we can quantify the value added of big data, the answer is ‘no.’” Nickerson, who has studied what he and an academic co-author Todd Rogers called “an arms race to leverage ever-growing volumes of data to create votes,” tells me, “The gains in persuasion are much larger for small campaigns (who ironically are least able to take advantage of Big Data), but could still account for 17,000 votes in big states like Florida.”

There has been a drive to add more and more types of information — adding up to a total of 3,000 to 4,000 data fields in some cases — to private political databases. But some experts believe the most useful information still comes from just a few key indicators. “When people read about political predictions based on whether you’re a cat owner or a dog owner, not only does the data generally not provide any utility in what we do, sometimes it doesn’t even exist,” says Bonier. “When you look at the models we build, the most predictive data points are very simple and benign: age, gender, ethnicity or the ethnic distribution of area they’re in; party registration.”

The reputation of political data is both harmed and helped by the “black box” nature of modeling. A firm that, for proprietary reasons, does not disclose its data modeling could claim it was responsible for a victory whether or not that was the case; and at the same time, if the modeling is less invasive than voters think, that’s hard to prove as well.

Focusing on the data can also mean the experts — and their politician clients, and the voters — lose sight of small-d democratic goals. For instance, campaigns may tend to push the buttons they think the voter wants pushed, and by doing so they may not be giving us a full or true picture of their policies. Says Bonier: “Campaigns are in a position to have more meaningful discussions with voters. We can talk to people where they are — for example, to parents about schools and universal pre-K. If you do it right, it’s a two-way conversation, and people are talking back.” But if you do it wrong, you’re just telling folks what you think they want to hear, or crossing into the “uncanny valley” of campaigning, where voters realize they’re being deliberately messaged in a slightly — or extremely — off-putting way.

Steve Adler co-founded the firm that built NGP VAN, but says he’s always considered himself part of the “radical middle”; in 2010 he founded the Republican-affiliated rVotes. He claims his firm provides a platform for campaigns to collect data from volunteers, polls, and fundraising and maximize their value. rVotes was used by Rep. Dave Brat, R-Va., a Tea Party-backed underdog who unseated then-House Majority Leader Eric Cantor in the 2014 GOP primary. Adler cheerfully describes what might be seen as intrusive uses of volunteer efforts. rVotes volunteers can earn rPoints, an internal reward currency, by performing many volunteer actions, such as looking at photos of Facebook friends and helping match them to rVotes’ registered voter list. “No money is being spent. The activist is being rewarded,” he says. As for volunteers who do door-knocking, they’re given a list of features to note if the owner isn’t home. “If they have a pickup truck with a gun rack, they’re likely R (Republican),” says Adler, “or a chime on their porch with glass beads and orbs, likely L (liberal).” This is all added to the rVotes database of the specific campaign.

Another of the newer firms is NationBuilder, which like rVotes does not share data between clients, but unlike rVotes and most other firms is party-agnostic. CEO Jim Gilliam co-founded Robert Greenwald’s Brave New Films before launching NationBuilder in 2009 with the goal of providing smaller campaigns and activist groups with data tools. (NationBuilder has received venture funding from sources including the Omidyar Network; Pierre Omidyar also founded The Intercept’s parent company, First Look Media.) Gilliam argues that firms that focus on massive data acquisition and predictive analytics aren’t acting in citizens’ best interests. “It’s about using behavioral data to manipulate people to do what you want to do. A/B testing [a key feature used by campaigns including Obama’s, where, for example, the campaign tested a variety of messaging combinations to see which resonated most before settling on one] is the exact opposite of leadership,” he says. “That’s a different approach than listening.”

Of course, partisan firms argue that they are not gathering information for information’s sake, but for democracy’s. John Hagner of Clarity Campaign Labs, a Democratic-affiliated shop, has been working in politics since his youth. “I do this to win because it matters,” he says, “and have no interest in working with firms that don’t represent progressives.”

Some experts believe that excessive reliance on data risks creating new barriers of exclusion. The paper by Nickerson and Rogers analyzes the work of the Democratic firm Catalist in Ohio in the 2004, 2008 and 2012 elections. A 2012 “heat map” of how many contacts the campaign made to Democratic-leaning voters shows low-to-no contact with likely Democrats who also had low predicted turnout. In that case, if those voters did not, in fact, appear at the polls, was that an accurate prediction, or a reflection that they were ignored by get-out-the-vote efforts? Two groups considered vulnerable to being written off as unpredictable or unlikely to vote are low-income individuals who move often, and those younger than 25. Nickerson, however, believes things may change. “As campaigns shift resources away from people who can be counted on to vote, they can devote more resources to engaging communities that are traditionally ignored. If anything,” he says, “microtargeting should help broaden the outreach efforts by campaigns.”

Voters who may be considered low-value targets for outreach may still offer valuable insights to leaders about the direction of the country. If data modeling pushes them — and their core issues — further from the campaign cycle, that could distance politicians from addressing key societal concerns. Data without that kind of context, says Christian Madsbjerg, partner at the global consultancy ReD Associates and author of the forthcoming Sensemaking, is a sort of false idol. “We end up with the kind of political programs that are technocratic — measuring everything. And that’s deadly to democracy,” he says. “The understanding of people’s lived experience is critical to democracy.”

As campaign 2016 unfolds, voters and technologists alike must beware: A data-driven election experience may well enhance the outreach to some voters, but diminish that to others. And those outside politics’ inner circle won’t even know how much data sways each election — or at what cost to our privacy.

This article originally appeared on The Intercept.