Tuesday, June 23, 2009

Regression to the Mean and What It Means for Agents

There’s a force more powerful than the Steelers defense or a monster slam from Shaquille O’Neal. It explains everything in sports from the sophomore jinx to unlikely postseason heroes to why slumps occur after hot starts.

“Regression to the mean” profoundly impacts sports statistics, yet you’ll never hear it mentioned on a game broadcast. The idea is that extreme results posted over a small sample tend to be followed by results closer to average: as the sample size for a statistic increases, the amount the statistic varies within a group decreases. In other words, the number of outrageously good and bad percentages will shrink as the season progresses and players and teams see more game action. Their numbers approach the mean, which is just another word for average.
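To see the sample-size effect in action, here is a minimal simulation sketch. It assumes every simulated back has identical true ability, and that each carry's gain follows a normal distribution with a mean of 4.2 yards and a standard deviation of 6 yards. Both numbers are illustrative assumptions rather than measured NFL values, and real gains are more skewed than a normal curve. The function name `simulate_averages` is likewise made up for this example.

```python
import random
import statistics

def simulate_averages(num_backs, carries, rng, mean_gain=4.2, sd_gain=6.0):
    """Return each simulated back's yards-per-carry average over `carries` attempts.

    mean_gain and sd_gain are assumed, illustrative values.
    """
    averages = []
    for _ in range(num_backs):
        gains = [rng.gauss(mean_gain, sd_gain) for _ in range(carries)]
        averages.append(sum(gains) / carries)
    return averages

rng = random.Random(2008)  # fixed seed so the run is repeatable

small = simulate_averages(200, 30, rng)    # 30 carries each
large = simulate_averages(200, 300, rng)   # 300 carries each

spread_small = statistics.pstdev(small)
spread_large = statistics.pstdev(large)

print(f"30 carries:  best average {max(small):.1f}, spread {spread_small:.2f}")
print(f"300 carries: best average {max(large):.1f}, spread {spread_large:.2f}")
```

Even though every simulated back is equally talented, the 30-carry group should show a much wider spread and a far gaudier best average than the 300-carry group. The only thing that changed is the number of carries.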

Take the amazing 8.9 yards-per-carry average posted by Cowboys running back Felix Jones in 2008. He went down for the season after just 30 rushing attempts. Had he not gotten hurt, his yards-per-carry average would almost certainly have dropped sharply. Not because defenses would focus on stopping him – Dallas had more dangerous offensive players – but due to regression to the mean.

Among NFL running backs who had at least 100 attempts during the 2008 season, none managed even 6 yards per carry. But many backs had 30-carry stretches when they approached Jones’ figure. The Cowboys rookie just happened to post his average over the course of a shortened season, before regression to the mean could rear its ugly head.

In fact, in the regular season’s final two games, the Giants’ Derrick Ward averaged 9.7 yards per carry on exactly 30 carries. For the season, Ward’s 5.6 average topped all backs with at least 100 rushing attempts. While 5.6 is an impressive figure, it’s a big drop-off from 8 or 9 yards. All caused by regression to the mean.

As an agent, it pays to understand this concept. Should one of your clients jump out to a hot start, teams will tend to overvalue him. But he’s likely to see his statistics fall off. It works the other way too. When your player struggles early in the year, his numbers should improve, provided he continues to see game action.

Regression to the mean explains team performance as well. In 2007, four NFL teams won at least 13 games: the Patriots (16-0), Cowboys (13-3), Packers (13-3) and Colts (13-3). They combined for a 55-9 record and a sizzling .859 winning percentage. In 2008, they had a combined 38-26 record and a .594 winning percentage. And three of the four teams missed the playoffs! Injuries and other negative factors hit these teams hard, but so did regression to the mean.
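For anyone who wants to check the math, the combined figures above can be reproduced with a few lines of Python. This is only a quick arithmetic sketch. The variable names are made up for the example, and the following season is entered as the combined 38-26 total quoted above rather than team by team.

```python
# 2007 regular-season records (wins, losses) for the four 13-win teams
records_2007 = {
    "Patriots": (16, 0),
    "Cowboys": (13, 3),
    "Packers": (13, 3),
    "Colts": (13, 3),
}

wins_2007 = sum(w for w, _ in records_2007.values())
losses_2007 = sum(l for _, l in records_2007.values())
pct_2007 = wins_2007 / (wins_2007 + losses_2007)

# Combined record for the same four teams the following season
wins_next, losses_next = 38, 26
pct_next = wins_next / (wins_next + losses_next)

print(f"2007: {wins_2007}-{losses_2007}, winning percentage {pct_2007:.3f}")
print(f"Next season: {wins_next}-{losses_next}, winning percentage {pct_next:.3f}")
print(f"Drop: {pct_2007 - pct_next:.3f}")
```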

In baseball, every postseason brings unlikely heroes. Why does this happen? The short postseason creates a small sample of games where average players can put up great stats before regression to the mean brings them down to earth. The same concept explains why some stars struggle in the postseason. That has little to do with choking – as some may claim – and everything to do with small sample sizes.