Did Nate Silver get the 2008 election right?
A more interesting question than whether he got 2016 wrong.
The biggest and most respected name in forecasting is Nate Silver, formerly of FiveThirtyEight and now on his own Substack, the Silver Bulletin. Silver, through his blog and his presence on Twitter, is pumping out a ton of material about the upcoming election and what the latest polls are telling us. And that means that it’s time for a now-familiar ritual. Any time Silver posts something that someone on Twitter disagrees with, his haters will post the following image:

[Image: FiveThirtyEight’s final 2016 forecast, giving Clinton a 71.4% chance of winning and Trump a 28.6% chance.]
He got 2016 wrong, so his forecasts aren’t worth shit. Checkmate.
The next step in the ritual is that, whenever anyone criticizes Silver like this, the contingent of smart and reasonable people will scoff at the rank innumeracy of the unwashed masses. Silver gave Trump a 28.6% chance of winning, but that’s obviously consistent with a Trump win. Things that have a 28.6% chance of happening happen 28.6% of the time, don’t you know! Silver’s critics really need to get a middle-school-level education in probability.
My take is that everyone participating in this ritual is wrong, but my sympathies lie somewhat more with the unwashed masses than the smart and reasonable people. Silver really does have a pretty good forecasting record, in a sense, but I think it’s also pretty clear that he got 2016 wrong.
For those who don’t think Silver got 2016 wrong, it’s worth disentangling their line of defense a bit. Silver wasn’t wrong in 2016, the thought goes, because he didn’t actually predict that Clinton would win. He predicted that Clinton had a 71.4% chance of winning, which is not the same thing. If he’d predicted that Clinton would win, that would be one thing. But he did no such thing. He just gave a probability.
By way of analogy: I’m about to draw a card off the top of a well-shuffled 52-card deck. You ask me if the card I’m about to draw will be a non-face card (A through 10). I reply that there’s a 10/13 (76.9%) chance that it will be a non-face card. I then deal out the jack of diamonds. Was I wrong? No! I didn’t say that a face card wouldn’t be dealt. I just told you a probability, and I wasn’t wrong about the probability. I did the math right. When Silver said that Hillary had a 71.4% chance of winning the election, he was doing just what I was doing in this case. And just like I didn’t give the wrong answer when I said “76.9%,” Silver didn’t give the wrong answer in 2016.
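(If you don’t trust the arithmetic, you can check it by simulation. This sketch is mine, not anything of Silver’s; it just builds a deck, draws the top card off a fresh shuffle many times, and counts how often the draw is a non-face card.)

```python
import random

# Build a standard 52-card deck. Non-face cards are A through 10:
# 40 of the 52 cards, i.e. 10/13, or about 76.9%.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
deck = [(rank, suit) for rank in ranks for suit in "SHDC"]

draws = 100_000
hits = 0
for _ in range(draws):
    random.shuffle(deck)          # a fresh, well-shuffled deck each time
    top_rank, _suit = deck[0]     # the card off the top
    if top_rank not in ("J", "Q", "K"):
        hits += 1

print(f"non-face rate over {draws:,} draws: {hits / draws:.3f}")  # ~0.769
```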
But the problem with this defense is that, in the case of the cards, I didn’t actually answer the question you asked. You asked what would happen. But I didn’t tell you what would happen. I gave you a probability, which is not the same thing as a prediction. If I’d predicted that the top card would be a non-face card (with 76.9% confidence in that prediction), then I’d have been wrong. But I didn’t do that. I deflected, gave a non-answer, and thereby insulated myself against the possibility of being wrong. No matter what came off the top of the deck, I’d have been right, because no matter what came off the top of the deck, I did the math right.
Similarly for Silver: he wasn’t wrong in 2016 because he didn’t actually predict that Clinton would win. We asked him who would win, and he didn’t know the answer (because no one knew the answer; knowledge of the future is really hard!), so he answered a different question, one that he could answer by doing math on a set of polling data. He got that question right (maybe[1]), but that wasn’t the question we asked. We asked for a prediction, and he gave us a number.
Now all of that is, more or less, intellectually defensible. Giving a probability rather than a prediction is a coherent thing to do (maybe). I just ask that Silver’s defenders be consistent about this.
That brings me to the title of the piece. In 2008, Silver got famous by correctly predicting the outcome of the presidential election in 49 states, missing only Indiana, and by correctly predicting the outcome of every Senate race. But he didn’t actually predict the correct results of those elections! As we just established, Silver isn’t in the business of predictions; he’s in the business of probabilities. He “predicted the correct result” only in the sense that he gave the winning candidate a greater-than-50% chance to win in (almost) every case where they did win. But if that counts as making a correct prediction, then Silver incorrectly predicted that Clinton would win in 2016. I’m willing to take Silver and his defenders seriously when they argue that he didn’t predict that Clinton would win in 2016. But if we accept that logic, then Nate Silver has never made a correct prediction in his life, because he’s just not in the business of making predictions. He’s doing something else. Perhaps he’s doing that other thing really well. But that other thing is not predicting.
Or maybe it is. This is why I tend to side with the unwashed masses whenever this stupid debate happens. Silver and his defenders are willing, even eager, to say that he made a correct prediction when his greater-than-50% probabilities happen. And if Clinton had won, with a map that looks just like the one in the image at the top of the piece, I guarantee that Silver et al. would loudly and proudly claim that he’d correctly predicted 2016. If you crow about your accuracy when your greater-than-50%s turn out to be true, but then mumble about probability when your greater-than-50%s turn out to be false, you’re full of shit.[2]
So next time you see the ritual enacted, and Silver or his defenders insist that he didn’t get 2016 wrong, you should ask whether he got 2008 right. Has he gotten anything right? Has Nate Silver ever actually made a prediction? I’m not sure he has.
[1] There are a huge number of differences between the case of the deck of cards and the case of elections, and those differences matter. When we’re talking about decks of cards, we can just count the cards in the deck, count the cards that don’t have paint on them, and take the ratio. There’s no comparable calculation in the case of election forecasting. I won’t get into the weeds on different theories of probability and what they say about these different cases (although I might do so in a follow-up if there’s any interest). Suffice it to say that probabilistic election forecasting is extremely different from evaluating the odds in a game of poker.
[2] Silver’s considered response to this is that he’s a good forecaster not because he gets his greater-than-50%s correct most of the time, but because he’s well-calibrated. But while appeals to calibration are common, calibration is a really weird and bad measure of probabilistic accuracy, for reasons I laid out here. And even if we take the appeals to calibration seriously, Silver should be pushing back on the claim that he gets any particular forecast correct, because calibration is an aggregate measure that can’t apply to single cases. There’s a sense in which he’s right in the aggregate, but no sense in which he’s right in any particular case.
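For readers who haven’t seen one, here’s a toy sketch of what a calibration check actually computes, with made-up forecasts standing in for Silver’s record. The thing to notice is that the comparison only exists for a bucket of forecasts taken together; a single forecast has no calibration to speak of.

```python
from collections import defaultdict

# Hypothetical (stated probability, did it happen?) pairs: invented
# data, not Silver's actual forecasts.
forecasts = [
    (0.7, True), (0.7, False), (0.7, True), (0.7, True),   # "70%" calls
    (0.3, False), (0.3, False), (0.3, True),               # "30%" calls
    (0.9, True), (0.9, True),                              # "90%" calls
]

# Group forecasts by stated probability, then compare each group's
# stated probability to the empirical frequency of the outcomes.
buckets = defaultdict(list)
for stated_p, happened in forecasts:
    buckets[stated_p].append(happened)

for stated_p in sorted(buckets):
    outcomes = buckets[stated_p]
    freq = sum(outcomes) / len(outcomes)
    print(f"stated {stated_p:.0%}: happened {freq:.0%} of {len(outcomes)} cases")
```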
Comments

I can see both sides of this, but I have a few thoughts.
I think the debate reveals something deeper about people. Predictions aren’t prophecy, just more or less educated guesswork, and we all make predictions daily. But we rarely examine the assumptions behind our predictions unless we turn out to be catastrophically wrong (we get into a traffic accident, lose our job, end up in jail), and sometimes not even then. Even when we do examine them, it’s only in hindsight. Silver does that examination for a living, and before knowing the outcome.
This is really the essence of luck – getting our predictions wrong. People who go more or less all in – in poker, traffic, businesses – when they have a 28% chance of losing, have a lot of “bad luck”. People and businesses that are too cautious miss out on “good luck”. But people and businesses that are well calibrated, and properly read tricky situations, will do better in the long run. (Of course, in order to break out of average, you need to make some bets that look to others like long shots, but good long shots are based on unique insight, not lucky gambles.)
Which brings me to my next point: What is news/media (and Nate Silver’s role in it) *for*? What are we paying for? Many people follow the news mostly for the entertainment value and to have something to talk about. They like to read true crime and true horror stories, and to talk about silly celebrities (including business and political celebrities, and prophets like Silver). To them, Silver was wrong. He failed to prophesy Trump’s victory.
But there’s more value in news/information that helps you make good decisions – to decide whether to move production to China, to rent or buy, to invest in Pharma companies or chip manufacturers, to take a short cut or stay on the highway, to plant your tomatoes or wait another week or two. To people who use news to make decisions, Silver was right. If you made good sets of decisions based on a 28% chance that Trump would win, hedging and managing risk properly, you’d have beaten the market. (Of course, people were also wrong about what a Trump win would mean, and there was a lot of bad decision making going around, but not because of Silver.)
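(To make “beaten the market” concrete, here’s a back-of-the-envelope expected-value sketch. The 28% is Silver’s number, rounded; the 18% market-implied probability is purely hypothetical, and the arithmetic ignores the bookmaker’s cut.)

```python
# Hypothetical numbers: your model (like Silver's) says 28%, while the
# betting market implies 18%. Invented for illustration only.
model_p = 0.28                    # model's chance of a Trump win
market_p = 0.18                   # hypothetical market-implied chance
decimal_odds = 1 / market_p       # ~5.56 total payout per $1 staked
ev = model_p * decimal_odds - 1   # expected payout minus the $1 stake
print(f"EV per $1 staked: ${ev:+.2f}")  # ~ +$0.56
```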
So you're right that Silver is a bad prophet, but I think he might be a good advisor.
I think this is missing the most important thing to be said in defense of Silver. He gave Trump a much higher chance of winning than pretty much everybody else who was putting numbers to these things. Sam Wang, for example, really was massively discredited by 2016. He gave Clinton a >99% chance of winning. And the reason Silver’s probability was so much lower was genuine insight. Roughly, he appreciated how various states were correlated with each other, so he recognized the possibility of a systemic error (e.g., underestimating Trump support in the Midwest) that Wang’s model implicitly ruled out (by treating the probabilities of various states going Trump as independent, so that the probability of *all* of them going Trump was vanishingly small). It’s stuff like that that makes me think 2016, in retrospect, looks comparatively pretty good for Silver. Most people aren’t in the same game as him, making a lot of forecasts and putting numbers on them. The people who are did worse.
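To make the correlation point concrete, here’s a toy Monte Carlo with all numbers invented (it matches neither Silver’s nor Wang’s actual models). Under both setups the underdog has the same 25% chance in each of three swing states; the only difference is whether a shared polling error is allowed, and that alone moves the chance of a clean sweep from about 1.6% to about 10%.

```python
import random

N = 200_000  # number of simulated elections per model

def sweep_independent():
    # Independence assumption: each state is its own draw, so
    # P(sweep) = 0.25 ** 3, roughly 1.6%.
    return all(random.random() < 0.25 for _ in range(3))

def sweep_correlated():
    # Shared-error assumption: with 20% probability the polls are
    # systematically off and the underdog wins each state with
    # probability 0.8; otherwise each state's chance is 0.1125.
    # The marginal per-state chance is still 0.2*0.8 + 0.8*0.1125 = 0.25.
    p = 0.8 if random.random() < 0.20 else 0.1125
    return all(random.random() < p for _ in range(3))

p_ind = sum(sweep_independent() for _ in range(N)) / N
p_cor = sum(sweep_correlated() for _ in range(N)) / N
print(f"independent states: P(sweep) ~ {p_ind:.3f}")  # ~0.016
print(f"correlated states:  P(sweep) ~ {p_cor:.3f}")  # ~0.104
```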