Understanding Election Data

I am not a political scientist, historian, nor a social scientist. I am a computer scientist and cognitive scientist. And, like most of the country, I was surprised by the election results.

One thing that computer scientists study is process, and I have been looking at the election process. As a cognitive scientist, I am also interested in patterns of behavior, motivations, and decision making.

I see many different theories attempting to explain what happened (some theories from friends, some from the media), and what to adjust (in oneself and the system) going forward. I have been thinking about these stories.

First, there is a discussion of the accuracy of the data. Was the data right or wrong? Of course, polling people asks them to introspect and report back who they will vote for. For a large segment of the population, this is a no-brainer. But there are some people who voted for Barack Hussein Obama and also voted for Donald J. Trump.

Was the polling data wrong for these people? Perhaps, but maybe the pollsters were just asking the wrong questions. We definitely need to make some changes to the poll-predict loop to get a better handle on what is happening. Is Nate Silver one of those elite scientists who are always wrong? Should we throw the science out with the bathwater? We should continue to try to understand how people make decisions. But perhaps thinking of the outcome as a “probability”, a real-valued number reported to one decimal place of accuracy, is very wrong-headed. What did it mean when Clinton had a 75.2% chance of winning? In the end, it meant nothing. But I think we can make better predictors.
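As a rough sketch of how a number like 75.2% is conventionally read (this is my own toy simulation, not anything taken from a pollster's model): if the forecast were perfectly calibrated and the same election could be rerun many times, the favorite would still lose in roughly a quarter of the reruns. The function name and the trial count below are arbitrary choices for illustration.

```python
import random

def simulate_upsets(win_probability, trials, seed=0):
    """Return the fraction of simulated elections the favorite loses.

    Assumes each simulated election is an independent coin flip that the
    favorite wins with probability `win_probability`. Real forecasts are
    far more involved; this only illustrates what the headline number says.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    upsets = sum(1 for _ in range(trials) if rng.random() >= win_probability)
    return upsets / trials

upset_rate = simulate_upsets(0.752, 100_000)
print(f"Favorite loses in about {upset_rate:.1%} of simulated elections")
```

Seen this way, a single upset cannot by itself show that the forecast was wrong, which is part of why one number with a decimal place is so hard to evaluate after one election.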

But how can we predict people that can vote for Obama one election, and Trump the next? How does race play into this question? Could Hillary have done something different? Why did nothing Trump did seem to matter? Should we get rid of the electoral college? Should we protest in the street? Should we help President Trump be successful on any program?

All good questions. In thinking about how I would answer these, I was reminded of a professor who claimed to have a model that predicted that Trump would win the election. In fact, he made his prediction even before March 2016. He calls his system the Primary Model because it is based on information from presidential primaries. Since the election, the professor (Helmut Norpoth of Stony Brook University) has cleaned up his website, and now has a nice page at primarymodel.com/2016-forecast-full.

I read his paper back in March. I dismissed it fairly quickly. It didn’t have any data. It wasn’t peer reviewed. It didn’t have any equations. The “Prediction Formula” for the model was expressed in a very odd manner, sort of half spreadsheet, half something else. I took a stab at turning it into a reproducible Python program in a Jupyter notebook.

[Image: screenshot from 2016-11-10 12-48-55]

Professor Norpoth claims that his “model” has correctly predicted the last 5 out of 5 elections, and is accurate for all those prior to 1996 (and for which we have primary data). I didn’t play with the code much. It wasn’t really a model, but an equation that separates winners from losers in past elections. There are many ad hoc values, but I did find it interesting that such a line could be drawn between winners and losers. (This is often the task for machine learning algorithms.)
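To make “drawing a line between winners and losers” concrete, here is a minimal sketch using a perceptron, one of the simplest machine learning algorithms for this task. To be clear, this is not Professor Norpoth's formula, and the numbers are made up for illustration: `feature_a` and `feature_b` are hypothetical, centered stand-ins for whatever inputs a real model would use (say, primary performance and time in office).

```python
# Toy illustration only: these points are invented, not real election data.

def train_perceptron(points, labels, epochs=100):
    """Learn (w1, w2, b) so that w1*a + w2*b_feat + b > 0 means 'winner'."""
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        errors = 0
        for (feature_a, feature_b), label in zip(points, labels):
            predicted = 1 if (w1 * feature_a + w2 * feature_b + b) > 0 else -1
            if predicted != label:
                # Nudge the line toward the misclassified point.
                w1 += label * feature_a
                w2 += label * feature_b
                b += label
                errors += 1
        if errors == 0:  # every point is on the correct side of the line
            break
    return w1, w2, b

def classify(weights, point):
    """Return 1 (winner) or -1 (loser) for a (feature_a, feature_b) point."""
    w1, w2, b = weights
    feature_a, feature_b = point
    return 1 if (w1 * feature_a + w2 * feature_b + b) > 0 else -1

# Hypothetical past "elections": 1 = winner, -1 = loser.
points = [(2.0, 1.0), (1.5, -1.0), (1.0, 0.5),
          (-1.0, 1.0), (-2.0, -0.5), (-1.5, 0.0)]
labels = [1, 1, 1, -1, -1, -1]

weights = train_perceptron(points, labels)
predictions = [classify(weights, p) for p in points]
print("weights:", weights)
print("all past outcomes separated:", predictions == labels)
```

A line that separates a handful of past outcomes is easy to find when the data allows it. Whether that line says anything about the next election is a separate question, which is why the ad hoc values gave me pause.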

But I believe that there is something for Nate Silver to learn from this model: it takes into account the past election cycle. There is a built-in “backlash” from the previous election cycle. Van Jones called it a “whitelash” this time around. But the details of who is running don’t matter: not their race, their gender, or how tall they are. It just matters who won the last election.

This is what some would call an emergent pattern. Each voter appears to be making independent choices each election cycle, presumably based on policy, or some criteria. But that can’t be the whole story if there are higher-level patterns. We need to take such long-term data into account. We don’t understand emergent phenomena very well. Consider “life” and “consciousness”. Both are emergent phenomena. For election predictions, we need to acknowledge that this pattern exists and needs to be added to the models.

Where to go from here? This information makes me feel somewhat better. Trump really could do just about anything, and it wouldn’t have mattered at one level. Clinton was swimming against the tide. Did race play a part? Most certainly yes, but we need to understand the long-term election voting patterns too.

Should we help President Trump where we agree with him? Should we let him tweak Obamacare and let him call it Trumpcare? I say “yes” because we can still help people. And, it doesn’t really matter if we help him (and the Republicans) be successful. He’ll be swept away in the next backlash.

I agree with Michael Moore on a good many things. But sometimes he is just wrong. Consider his 5 Things To Do Now. Here are mine:

  1. We need to work with the Democratic Party. This isn’t anyone’s fault.
  2. Don’t fire the pundits, researchers, etc. Science works by making mistakes, and making better models. Moore sounds anti-scientific sometimes.
  3. Don’t fight like the Republicans fought Obama. That doesn’t hurt anyone except the people. Take the high road.
  4. Acknowledge that there will always be a backlash.
  5. You can say that Clinton won the popular vote, if only to prevent anyone from saying “mandate”.

This doesn’t mean that nothing matters, nor that we should do nothing. We can still make a difference! Tim Burke has written on how to organize to make a difference. To address these emergent swings, the next time we elect a Democratic President, we should immediately start working on which candidate will be the “backlash” candidate. It might be a Democrat, or it might be a Republican. We just need to make sure that it isn’t another Trump.
