Wednesday, September 5, 2012

Why are NFL preseason win projections so bad?



I have been unable to avoid looking at predictions for the upcoming NFL season. Having spent some time picking apart ESPN’s Future Power Rankings back in May, I find these predictions have again piqued my interest, so I will pick them apart too. When will they ever learn that I will go away if they pick the Browns to make the playoffs? (I’m not even asking for a Super Bowl, just to make the playoffs and maybe also to have Pittsburgh not do well.)

Projecting win totals is definitely hard. Brian Burke, the very talented founder of AdvancedNFLStats.com, has had a significant amount of fun deconstructing Football Outsiders’ projections over the years. Among other things, he has shown that the results could be improved either by picking all teams to finish 8-8 (he refers to this as the CoMA strategy) or by picking all teams to regress somewhat, projecting 6 wins plus ¼ of the prior year’s win total, so that a 10-6 team becomes 6 + 10/4 = 8.5 wins (he refers to this as the ‘Koko the Monkey’ strategy; fans of Seinfeld approve).
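Both baselines are simple enough to write down directly. Here is a minimal Python sketch of the two rules as described above; the function names are my own and are not Burke's.

```python
def project_all_8_8(prior_wins):
    """Ignore last year entirely and project every team at 8 wins."""
    return 8.0


def project_regressed(prior_wins):
    """Regress toward the middle: 6 wins plus a quarter of last year's total."""
    return 6.0 + prior_wins / 4.0


# Print both baseline projections for a few prior-year win totals.
for prior in (2, 8, 10, 14):
    print(prior, project_all_8_8(prior), project_regressed(prior))
```

Note that an 8-8 team is projected at 8 wins under both rules; the two only disagree about teams that were far from .500 the year before.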


Most of the projections I rounded up for this year take an even simpler approach. They just pick teams to achieve basically the same result as the year before. The data are below, but the projections range from an R^2 of 0.49 (Mike Greenberg of ESPN’s Mike and Mike in the Morning) to 0.71 (Accuscore) when regressed on the prior year’s win total; the other projections are sourced from Mike Golic (the other Mike in the morning), Gregg Easterbrook (TMQ) and Peter King (Sports Illustrated). To state this clearly, between 50% and 70% of the variation in these projected win totals can be explained by wins in the previous year.
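For anyone who wants to run the same check on a set of projections, the R^2 here is just the squared correlation between prior-year wins and projected wins. A rough sketch in plain Python follows; the helper name `r_squared` is my own, and the numbers are made up purely to show the mechanics, not real teams or real projections.

```python
def r_squared(x, y):
    """Squared Pearson correlation between x and y, which equals the R^2
    of a one-variable linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Illustrative, invented numbers only (NOT real data).
prior_wins = [4, 6, 8, 10, 12]
projected_wins = [5, 6, 9, 9, 11]
print(r_squared(prior_wins, projected_wins))
```

A forecaster who simply copies last year's record would score an R^2 of 1.0 on this measure, and one who ignores last year completely would score close to 0.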






 
From 1994-2011 – starting in 1994 because that is when the NFL’s modern salary cap and free agency rules went into effect – a team’s record the prior year has explained 9% of the variation in their present year record. 9% is significantly less than 49% (or 71%). 



When we look at a histogram of the projected changes compared to that of the actual changes for the period, the actual looks almost completely flat within the -3 to +3 range and much larger in the ‘+4 or more’ and ‘-4 or more’ tails.






 
So why do these totals miss by so much? I have three explanations, all of which I believe are affecting the results to varying degrees:

1.    There are unpredictable factors at work here – A 16-game NFL season is a very small sample, so a team’s win total does not always reflect its performance accurately. Bill Barnwell has gone over several of these factors in his previews this year. A team’s turnover margin, record in close games and injuries can vary significantly year to year. Furthermore, there is no persistence in these metrics. The cohort of teams worse than .250 in close games has roughly the same expected record in those games next year as the cohort of teams better than .750 – teams with great quarterbacks being the exception. Injuries and turnover margin show similar variation. All of these factors can result in win totals that diverge significantly from Pythagorean wins – the expected win total based on points for and points against.

2.    Experts face disproportionate risk from non-conformity – Let’s say, hypothetically, that you are an NFL expert and your sources have reported that John Skelton looks amazing. He is Peyton Manning, Tom Brady and Roger Staubach all rolled into one (really? Staubach?) and somehow no one else has this scoop. What do you do with this information? If you put it in your report you will get laughed at. If you project Arizona to win 12 games and make the Super Bowl, you are suddenly putting your own job on the line when (see above) there are other factors besides your inside info that will impact the result. If you project Arizona to have 5 wins and they end up with 12, no one will notice, because the other experts have them at 5 wins too. There has been significant research done (see here, here and here) on this topic by some of the leading non-Harvard business schools as it relates to analysts making buy/sell recommendations on stocks. Like sell-side analysts, NFL experts may face backlash from the teams they follow if they break from the consensus on the negative side. Predicting Cleveland to stink is fine; they do stink. Predicting the Steelers to stink may have a serious effect on your ability to work in Pittsburgh.

3.    NCAA Tournament Pools – OK, this isn’t really a factor, but stick with me. When you pick your brackets for the NCAA tournament you know that a 12 seed will beat a 5 seed. Let’s go further and say that you have been told that exactly one 12 will upset a 5. What do you do? If you pick one upset (out of the four 12/5 matchups) there is a 75% chance you picked the wrong one, causing you to lose both the projected upset and the real upset. By picking an upset you have a 75% chance of getting two of four games right and a 25% chance of getting all four – an expected value of 2.5 wins. By picking the 5 seed in all matchups you have a 100% chance of getting three games right. Applying this logic to the NFL, these guys know that roughly six of the 12 playoff teams will not make the playoffs the following year. They can try to pick which six will not, or they can step back and realize that each of last year’s playoff teams is still a better bet to return than any given non-playoff team. Even though prior year wins are not a strong predictor of present year wins, they are probably the best thing an analyst has to go on given that the other factors (see number one) are unknown ex ante.
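The bracket arithmetic in the third explanation is easy to check. The sketch below assumes, as in the hypothetical, that exactly one of the four 12/5 games is an upset and that you have no idea which one; the variable names are mine.

```python
from fractions import Fraction

GAMES = 4  # four 12-vs-5 matchups, exactly one of which is an upset

# Strategy A: pick one 12 seed at random to win.
# Right upset -> all four games correct; wrong upset -> two games missed.
p_right_upset = Fraction(1, GAMES)
ev_pick_one_upset = p_right_upset * GAMES + (1 - p_right_upset) * (GAMES - 2)

# Strategy B: pick every 5 seed; only the one real upset is missed.
ev_pick_all_favorites = Fraction(GAMES - 1)

print(float(ev_pick_one_upset), float(ev_pick_all_favorites))
```

Picking the favorite everywhere comes out ahead by half a game on average, even though you are guaranteed to get one game wrong.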

The NFL experts, it seems, either don’t know or won’t say anything more than most fans already know from looking at last year’s record. The lesson, as always, is that Cleveland still has a chance for a magical season.

UPDATE - 9/7/12
Since I published this, Bill Simmons and Bill Barnwell at Grantland have put out their forecasts for the 2012 season. Since I like their writing, I figured I would run the numbers on their submissions. Both have a far more realistic projection in terms of reducing the impact of prior year record. Only 25% of the variation in Simmons' forecast and 26% of the variation in Barnwell's can be attributed to the prior year win total. The lowest of the original group of projections was Mike Greenberg at 49%. So congrats to Bill and Bill on what I would consider to be much more realistic predictions. For the reasons listed above, however, that may just make them higher variance (more likely to be very right or very wrong).
