The Failure of Simple Narratives

Approximately Correct is not a political blog in any traditional sense. The mission is not to prognosticate elections, like FiveThirtyEight, nor to revel in the political circus, like Politico. And the common variety political writing seems antithetical to our goals. Today, political arguments tend to follow an anti-scientific pattern of choosing a perspective first and then selectively reaching for supporting evidence. It’s everything we should hope to avoid.

But, per our mission statement, this blog aims to address the intersection of scientific and technical developments with social issues. And social issues (the economy, the environment, healthcare, news curation, etc.) are necessarily political. Moreover, scientific practice requires dispassionate discourse and the ability to change one’s beliefs given new information. In this light, the abstention of scientists from political discourse seems irresponsible.

[An aside: Not all political issues are scientific or technical. The relative value of free speech vs the danger of hate speech may be an intrinsically subjective judgment. But many issues, such as global warming, explicitly exhibit scientific dimensions.]

Technical developments can necessitate policy shifts. Absent the capacity to warm the planet or the ability to detect such warming, one couldn’t justify strong reforms to energy policy. Additionally, absent scientific understanding of the likely effects of policies, one cannot argue effectively for or against them. So sober scientific analysis has a role to play not just in evaluating policies, but also in evaluating individual arguments.

Machine learning and data science interact with politics in a third important way. The political landscapes of entire nations are immense. Take last night’s presidential election for example. Roughly 120 million people voted in 3,007 counties, 435 congressional districts and 50 states. Hardly any citizens have visited every state. Not even the candidates could possibly visit every county. Thus, our sense of the nation’s pulse, and our narratives regarding the driving forces in the election are ultimately shaped by a mixture of second-hand accounts and data science (as by extensive polling).

Simplistic Narratives

Simplistic narratives and data science play off of each other. Narratives influence the questions that pollsters ask. And each poll result invites simplistic analysis. In the remainder of this post, without expressing my personal opinions, I’d like to give a dispassionate analysis of several popular stories that have risen to prominence during this election, sampled from across both the Democratic-Republican and establishment/anti-establishment divides. I choose these narratives neither because they are completely true nor because they are completely false. Each presents a seemingly simple thesis that belies more complex realities. To be as even-handed as possible, I’ve chosen one each from the Clinton-leaning and Trump-leaning narratives.

Hillary Lost Because of Sexism [pro Clinton]

The role of sexism in American politics is evident. Out of all 44 presidents-elect, all 44 have been men. Before Hillary, all of the major party candidates were men. While this history represents only a trend, common sense coupled with the tenor of the political rhetoric against Hillary Clinton suggests that gender played some role. Trump’s assertion that Clinton lacked the stamina to be president exhibits clear gendered innuendo.

Given the closeness of the race, one might reasonably conjecture that but for gender, Clinton would have won. The “but for X, Clinton would have won” argument, however, can apply equally to any of the other various reasons that voters deserted her. As examples, one might similarly argue that but for the email scandal she would have won, or that but for the revolving door between political leadership and enormous sums earned for paid speeches to Goldman Sachs, she would have won.

The System is Rigged [pro Trump]

Throughout the campaign a narrative emerged that the system is rigged. This narrative took many forms. Some, such as a faithless elector in Washington, suggested that the system had already decided the fate of the election in favor of Clinton. Many expressed this narrative as a justification for third-party voting or for not voting at all. Trump’s campaign consistently asserted that the entire political establishment is stacked against ordinary voters. He repeatedly accused Clinton in particular, and Congress more generally, of being owned by special interests. More disturbingly, he called into question the integrity of vote-counting and thus the legitimacy of the entire election.

The results of the election reveal that the most paranoid of conspiracy theories were false. But the sense among the electorate that politics is stacked against ordinary voters and in favor of special interests has some merit. The longevity of highly unpopular tax loopholes for financiers and corporations, together with monopolistic pricing conditions in healthcare, evidences the extent to which the political apparatus has compromised the interests of ordinary voters. And yet at the same time, the US’s political institutions also execute a number of large, popular programs that many ordinary voters depend upon. Examples include Medicare and Medicaid. Making matters more complex, it seems reasonable to surmise that many politicians support some policies in good faith and others out of political necessity or for financial gain. In short, while the rigged system argument has emerged as a political talking point, it obscures a wide range of beliefs, many of which are false and some of which seem true.

Some more claims [absent analysis]

These are but two of many simplistic stories that have been told to sum up the dynamics of this election. Some others are:

The Clinton Foundation constitutes an unprecedented conflict of interest [pro Trump]
Donald Trump’s support owes mainly to xenophobia [pro Clinton]
The American economy is actually doing just fine [pro Clinton]
Donald Trump is a great businessman [pro Trump]
Voters don’t trust Hillary Clinton [pro Trump]
The election demonstrates that democracy doesn’t work

Each story expresses a simple-sounding narrative that obscures a much more complicated reality. Does Clinton’s foundation raise legitimate concerns about special access? It seems likely. Does this amount to a greater conflict of interest than the Bush family’s entanglements with defense contractors? Dubious. Does the election reflect the failure of democracy as a system of governance? If you see Trump as an existential threat to the country, then, on the surface, this sounds like a reasonable claim. On the other hand, Hillary won the popular vote but lost the electoral college. And Bernie Sanders may have been handicapped in the primary owing to the anti-democratic institution of super-delegates. If anything, it seems likely that a fully democratic system would have elected either Sanders or Clinton.

Machine Learning and Storytelling

The temptation to sum up complicated phenomena with simple narratives is not unique to political discourse. Even in the sciences, we simplify the world in cartoonish terms to express intuition. Sometimes these narratives turn out to diverge from reality. At other times, they express reasonable intuitions, helping to generate hypotheses.

As an example, prior to the invention of tf-idf text processing, someone likely came up with a story, absent evidence, that words which appear in only a small fraction of documents might be more informative than common ones. This agrees with some intuition. But it’s only after running experiments and finding that text processing emphasizing rare words leads to improved accuracy on classification tasks that such a story gets taken seriously. Of course, many such stories turn out to be false. For each successful idea, many more fail. And it’s worth noting that sometimes a story could lead to a successful hypothesis and nevertheless turn out to be false.
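
To make the rare-word intuition concrete, here is a minimal sketch of one common tf-idf variant (term frequency times the log of inverse document frequency). The function name tf_idf, the whitespace tokenization, and the toy documents are assumptions made for illustration; practical implementations usually add smoothing and normalization.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {word: weight} dict per document (illustrative variant)."""
    # Naive tokenization: lowercase and split on whitespace (an assumption).
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: the number of documents that contain each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            # Term frequency scaled by how rare the word is across documents.
            word: (count / len(tokens)) * math.log(n_docs / df[word])
            for word, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, not data from any real experiment.
docs = [
    "the election was close",
    "the polls missed the election",
    "the classifier uses rare words",
]
weights = tf_idf(docs)
```

In this toy corpus, a word that appears in every document ("the") receives zero weight, while a word confined to a single document ("close") receives the largest weight in its document. Whether such a weighting actually improves a classifier is exactly the kind of claim that has to be checked by experiment.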

Both in machine learning, and when studying the composition and behavior of the electorate and the underlying dynamics of society, we hope to gain meaningful insight about the structure of large data-rich processes. But in conventional machine learning practice we can perform experiments.

In politics, on the other hand, our stories about both the sentiment of the electorate and underlying political realities often cannot be translated into verifiable claims and subjected to experiments absent consequences. We might hope that better polling could help to ground and refine these simplistic narratives. But the task of polling, especially with respect to elections, is much harder than practicing machine learning. Instead of working with massive numbers of examples with a comparatively reasonable number of features, pollsters must predict a single outcome with a massive number of features.

The Role of Scientists in Politics

Some of the difficulty of understanding both the dynamics of government and the dynamics of the electorate is intrinsic. And surely, some of the conflict reflects irreconcilable differences of values. But given the role of science and technology both in giving rise to political issues and in influencing our perceptions, we ought to consider what the role of scientists might be in political discourse.

As a starting suggestion, here are several ways scientists might be able to improve the state of political discourse:

Clearly indicate, with impartiality, which issues are matters of fact and which are matters of opinion or of unknown truth content
Provide impartial analysis of the logical strength of specific political arguments
Work to improve polling methods to better calibrate our perceptions with reality
Facilitate technology to enhance the transparency of government, helping to reduce mistrust owing to opacity
Study the ways that technology itself (as by filter bubbles) might create new problems

Surely there’s more to be done, and deeper conversations to be had. Contributions and suggestions on the topic are welcome.

Disclaimers

This post expresses my perspective alone and does not speak collectively for other present or future contributors.
I supported Bernie Sanders in the Democratic primary and voted for Hillary Clinton in the presidential election.

Zachary C. Lipton

Zachary Chase Lipton is an assistant professor at Carnegie Mellon University. He is interested in both core machine learning methodology and applications to healthcare and dialogue systems. He is also a visiting scientist at Amazon AI, and has worked with Amazon Core Machine Learning, Microsoft Research Redmond, & Microsoft Research Bangalore.