Sports + Numbers: Power rankings make my head hurt

ESPN.com produces some excellent analysis. Several of the writers working there can slice and dice data in ways that reveal thought-provoking, insightful conclusions. I have a weakness, however, for those delightfully qualitative pieces known as power rankings.

Even though I know that I should know better, I read them every week. I literally cannot stop myself from clicking on that link (figuratively, of course – I could quit if I really wanted to). Imagine, then, just how little control I had when I saw the NFL Future Power Rankings link on the lower right side of my screen. These rankings purported to give a quantitative look at the prospects of each team in the 2015 season.

I started to suspect something might be amiss when I looked briefly at the methodology: ESPN polled some of its analysts and produced a weighted average of the results. Each category came with instructions that seemed to make sense. Don’t consider players over 27 except at QB; look at the 2012 draft class, the team’s picks in the 2013-15 drafts and their track record in the draft.

In all, the five categories were Roster (excluding QBs), QBs, Draft, Front Office and Coaching Staff. The four analysts – Trent Dilfer, Mel Kiper (he of this track record), Gary Horton and Matt Williamson – voted 1 to 10 for each category and that was the extent of it.

The results of the survey looked all too familiar. There was Green Bay at the top, followed closely by New England, the New York Giants, San Francisco, Pittsburgh and Detroit. Not until seventh was there a team, Philadelphia, that failed to make the playoffs. Three of the four conference championship game participants made the top four. It did not appear to be significantly different from simply ordering teams based on their 2011 record. I decided to take a look at whether current-year results are really that highly correlated with results four years later.

What’s the correlation, Kenneth?

Now that these rankings had me going (before I even saw my Cleveland Browns ranked dead last), I set about analyzing them. The Future Power Ranking (FPR) number that they produced, a weighted average of 1 to 10 ratings in five categories converted to a 0 to 100 scale, had an average of 61.1 and a standard deviation of 13.7. To make this comparable with winning percentage – the statistic I would consider the parallel to power rankings – we can rescale it to a mean of .500 and a standard deviation of .205, the actual mean and standard deviation of winning percentage in 2011. Thus a team with an FPR of 74.8 would have a predicted 2015 winning percentage of .705, both one standard deviation above the mean.
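For anyone who wants to follow along, the rescaling is just a z-score conversion. Here is a minimal Python sketch using the four summary numbers quoted above; the function and constant names are my own labels for illustration, not anything ESPN published.

```python
# Summary statistics quoted in the post.
FPR_MEAN = 61.1       # average Future Power Ranking score
FPR_SD = 13.7         # standard deviation of FPR scores
WIN_PCT_MEAN = 0.500  # mean 2011 winning percentage
WIN_PCT_SD = 0.205    # standard deviation of 2011 winning percentage


def implied_win_pct(fpr):
    """Map an FPR score onto the winning-percentage scale via its z-score."""
    z = (fpr - FPR_MEAN) / FPR_SD
    return WIN_PCT_MEAN + z * WIN_PCT_SD


# A team one standard deviation above the FPR mean.
print(round(implied_win_pct(74.8), 3))  # 0.705
```

Note that nothing stops this linear mapping from producing a value below 0 or above 1 for an extreme FPR score; that is a limitation of the simple rescaling, not something the sketch guards against.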

The figure above shows the implied winning percentages based on FPR on the vertical axis and 2011 actual winning percentage on the horizontal axis. The r-squared, as suspected, is fairly high at 0.65 with a correlation of 0.80. This means that roughly two-thirds of the variation in FPR-implied winning percentage is explained simply by looking at a team’s performance in 2011.
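If you want to check numbers like these yourself, the correlation and r-squared for two lists of winning percentages take only a few lines. This is a sketch of the standard Pearson formula in plain Python; the function names are mine, and the two short lists at the bottom are made-up toy values, not the actual team data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def r_squared(xs, ys):
    """For a one-variable linear fit, r-squared is the correlation squared."""
    return pearson_r(xs, ys) ** 2


# Toy values only (NOT real team data): actual vs. implied winning percentage.
actual_2011 = [0.250, 0.500, 0.625, 0.813]
implied_2015 = [0.310, 0.480, 0.600, 0.700]
print(pearson_r(actual_2011, implied_2015), r_squared(actual_2011, implied_2015))
```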

This histogram demonstrates the implied change in performance from 2011 to 2015 in terms of wins. The FPR has most of the teams (53%) within +/- one game of their 2011 record. Now all we have to do is see if this is justified.
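The conversion behind a histogram like this can be sketched as follows. I am assuming a 16-game regular season and treating "within +/- one game" as inclusive of exactly one game; the function names and the sample list are hypothetical, for illustration only.

```python
GAMES_PER_SEASON = 16  # NFL regular-season length in 2011


def implied_win_change(implied_pct, actual_pct, games=GAMES_PER_SEASON):
    """Change in wins implied by moving from actual_pct to implied_pct."""
    return (implied_pct - actual_pct) * games


def share_within(changes, band=1.0):
    """Fraction of teams whose change in wins is within +/- band (inclusive)."""
    return sum(1 for c in changes if abs(c) <= band) / len(changes)


# A team that went 8-8 (.500) with an implied .750 is projected four wins better.
print(implied_win_change(0.750, 0.500))  # 4.0

# Made-up changes in wins for four teams (NOT the real data).
print(share_within([0.5, -1.0, 2.0, -3.5]))  # 0.5
```

The same helper works for the "four or more games" comparison later on: one minus the share within a band just under four games gives the share of big movers.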

Salary cap-era persistence of performance

The salary cap, put in place for the 1994 season, should have a significant effect on the persistence of team performance. Teams that assemble outstanding collections of players will see some of them poached as they reach free agency and the team cannot afford to keep them all. Also, I didn’t want to copy records from all the way back to 1920, so the salary cap era will be our set. This data set will offer 14 different four-year periods from 1994 to 2007 with paired year 0 (e.g., 2007) and year 4 (e.g., 2011) results.
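Building those year 0 / year 4 pairs is mostly bookkeeping. Here is one way it could be done in Python, assuming the records sit in a dictionary keyed by (team, season); that layout, the function name and the "Team A" values are my own placeholders rather than the spreadsheet I actually used.

```python
def paired_seasons(win_pct, lag=4):
    """Pair each team's winning pct in year 0 with its pct `lag` years later.

    win_pct maps (team, season) -> winning percentage. Team-seasons with no
    matching later season (e.g. a team that did not exist yet) are skipped.
    Returns a list of (year0, team, pct_year0, pct_year_lag) tuples.
    """
    pairs = []
    for (team, year), pct in sorted(win_pct.items()):
        later = win_pct.get((team, year + lag))
        if later is not None:
            pairs.append((year, team, pct, later))
    return pairs


# Start years 1994 through 2007 give 14 four-year windows ending 1998-2011.
start_years = list(range(1994, 2008))
print(len(start_years))  # 14

# Placeholder records for a fictional team (NOT real results).
records = {("Team A", 1994): 0.500, ("Team A", 1998): 0.625, ("Team A", 2000): 0.250}
print(paired_seasons(records))  # [(1994, 'Team A', 0.5, 0.625)]
```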

The figure above shows the winning percentage in year 0 along the horizontal axis and the winning percentage in year 4 along the vertical axis. The r-squared is 0.01. This relationship is effectively meaningless, and the graph highlights that fact, as the values are all over the map.

This histogram demonstrates the misalignment between the actual changes over four years (in green) and the changes implied by the ESPN Future Power Ranking (in blue). Of particular note is that the actual data puts 42% of teams changing by four or more games in year 4 as compared to year 0 while the FPR-implied winning percentage has only 9% of teams changing by that much.

Looking at a three-year trailing average of correlation over time, there seems to be a small trend towards higher correlation between winning percentage in year 0 and year 4. There are, however, reasons to believe that the trend may be reversed in the coming years based on changes in the 2011 collective bargaining agreement, and it is still well short of the correlation implied by the FPR.
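A three-year trailing average just smooths the year-by-year correlations by averaging each start year with the two before it. A minimal sketch, with a function name of my own choosing and placeholder inputs rather than the real yearly correlations:

```python
def trailing_average(values, window=3):
    """Average each value with the (window - 1) values before it.

    The first (window - 1) positions have no full window, so the result is
    shorter than the input by (window - 1) entries.
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Placeholder inputs (NOT the real correlations), one per start year.
print(trailing_average([1, 2, 3, 4]))  # [2.0, 3.0]
```

With 14 start years, a three-year window leaves 12 smoothed points, which is a thin basis for calling anything a trend.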

Here we go Brownies, here we go!

ESPN offered an interesting idea with the NFL Future Power Rankings. The data suggests, however, that their experts are reading too much into current year performance. They need to take the Future Power Rankings back to the drawing board and reevaluate how to predict future performance so that I have something to give me a little more hope that Cleveland might finally be on the right track.

Specifically, they don’t seem willing (able?) to project which teams will have significant swings in the future. The FPR settles for defining the future elite as the current elite and misses the opportunity to provide insightful reasons for which teams will have big changes.

Thursday, May 31, 2012
