Tuesday, October 12, 2010

Nate Silver's Election Forecasts

Last night I gave you my current forecast of the 2011 House, based on a model I'm following and on the generic Congressional ballot poll results reported on RealClearPolitics.

The numerical technique used in that model is similar to the one used by Nate Silver of what once was FiveThirtyEight.com.  Since the liberal local New York Paper (LLNYP) now owns Silver's website, you can find his forecasts here.  His predictions are very detailed and are reported as probabilities; e.g., he reports that Pearce has a 70% chance of winning NM-2.  Local knowledge tells us to discount the polls Silver's forecast relies on here, a bias he calls a house effect and adjusts for when he has previous knowledge of it.

Silver explains his approach in detail on pages linked from that LLNYP page, but it's truly rocket science: hundreds of thousands of Monte Carlo simulations of each of the 435 House districts, with election outcomes drawn at random from probability distributions based on the polls in those races.  He makes individual adjustments for the reliability of the pollster, for incumbency, and for historical race-by-race factors from Charlie Cook.  There's a little too much man-behind-the-curtain tinkering for my taste, but there's no denying the care he's taking.
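To make the idea concrete, here is a minimal sketch of that kind of simulation.  It is not Silver's model: the function name simulate_house, the three districts, their poll margins, and the 4-point standard deviations are all made up for illustration, and it leaves out his adjustments for pollster reliability, incumbency, and the Cook ratings.  Each district's result is drawn independently from a normal distribution centered on its poll margin.

```python
import random

def simulate_house(poll_margins, poll_sds, n_sims=10_000, seed=0):
    """Return, for each simulation, the number of seats won by party A.

    poll_margins: polled lead of party A in each district, in points
                  (hypothetical numbers, not real polls)
    poll_sds:     standard deviation of each district's draw, in points
    """
    rng = random.Random(seed)
    seat_counts = []
    for _ in range(n_sims):
        # Draw every district independently around its poll margin.
        seats = sum(
            1
            for margin, sd in zip(poll_margins, poll_sds)
            if rng.gauss(margin, sd) > 0
        )
        seat_counts.append(seats)
    return seat_counts

# Hypothetical three-district "House"; a majority is 2 seats.
margins = [5.0, -2.0, 0.5]
sds = [4.0, 4.0, 4.0]
counts = simulate_house(margins, sds)
majority_prob = sum(c >= 2 for c in counts) / len(counts)
print(f"Party A wins a majority in {majority_prob:.0%} of simulations")
```

The fraction of simulations in which a party takes a seat, or a majority of seats, is what gets reported as a probability.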

There is one factor missing from his and all of these analyses, including mine: correlation.  All of the random draws from distributions in his national Monte Carlo simulation are done without regard for the correlation of results across the nation.  Thus the draws in NM-1 and NM-2 have a 50-50 chance of being on opposite sides of their mean predicted results.  Since the standard deviations are pretty large -- they are basically the poll margins of error -- a given simulation can have NM-1 at its poll result plus 4% and NM-2 at its poll result minus 4%.  Such disparities between neighboring districts are much less likely in the election than in the simulation.  The simulation has to allow for them anyway, because those deviations come from sampling error in the poll results, not from the population being polled.
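A small sketch shows how much correlation changes that 50-50 figure.  Everything here is an assumption for illustration: the function name opposite_side_fraction, the 4-point standard deviation, and above all the correlation of 0.75, which is an arbitrary value and not an estimate for NM-1 and NM-2.  Each district's error is built from a shared component plus a local one, so that the two errors have correlation rho.

```python
import math
import random

def opposite_side_fraction(rho, sd=4.0, n_sims=20_000, seed=1):
    """Fraction of simulations in which two districts' errors have
    opposite signs, given a correlation rho between the errors."""
    rng = random.Random(seed)
    opposite = 0
    for _ in range(n_sims):
        shared = rng.gauss(0.0, 1.0)  # swing common to both districts
        # Weights sqrt(rho) and sqrt(1 - rho) give each error variance
        # sd**2 and give the pair a correlation of rho.
        err1 = sd * (math.sqrt(rho) * shared
                     + math.sqrt(1 - rho) * rng.gauss(0.0, 1.0))
        err2 = sd * (math.sqrt(rho) * shared
                     + math.sqrt(1 - rho) * rng.gauss(0.0, 1.0))
        if err1 * err2 < 0:
            opposite += 1
    return opposite / n_sims

independent = opposite_side_fraction(rho=0.0)   # draws ignore each other
correlated = opposite_side_fraction(rho=0.75)   # assumed, not measured
print(f"independent: {independent:.2f}, correlated: {correlated:.2f}")
```

With independent draws the two districts land on opposite sides of their means about half the time; with a strong shared component they do so much less often, which widens the spread of possible seat totals even though each district's own distribution is unchanged.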

The polling data is much better in the Senate races, with larger samples and more polls by better polling organizations, all of which reduce sampling error.  You can access Silver's Senate predictions by hovering your mouse pointer over each state in the map here.  Similar maps for the House and the governors' races can be found through the links in his "Forecast Center" at the right of his main page.