Why Your Favorite NFL Team Might Win (or Lose) Five More Games Next Year
This rarely discussed, hidden factor is impacting NFL teams more than you think
by Jason Pauley
The Oakland Raiders have been below average for a long time. Their average record over the last 10 years has been 6–10. But in 2016 the Raiders went 12–4 and tied for the third-best record in the NFL. This 12-win season was bracketed by a seven-win season in 2015 and a six-win season in 2017. Over three years they won 7, 12, and 6 games. This five-win increase and six-win decrease occurred with the same QB (Derek Carr), the same two top WRs (Amari Cooper and Michael Crabtree), the same Pro Bowl linebacker (Khalil Mack) and the same head coach (Jack Del Rio). I’m using the Raiders to demonstrate the inconsistency of NFL teams from year to year, but I could have used almost any team. Here are some stats from the last 20 years to show how frequent these types of swings are in the NFL:
- 22% of all 640 seasons (32 teams X 20 years) resulted in a win/loss shift of 5+ games
- Every team in the last 20 years has had at least one season with a win/loss shift of 5+ games
- Teams average a 5-game win/loss shift every 4.5 years
- An average of one team per season will experience a win/loss shift of 8+ games. That’s the equivalent of a baseball team winning or losing 81 more games than they did in the previous season. A win-shift that makes up 50% of the season’s schedule just doesn’t happen with a large sample size like baseball’s 162-game schedule, but on average it happens every year in the NFL.
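To make the term concrete, here is a minimal Python sketch of how a win/loss shift could be computed, using the Raiders win totals quoted above. The function name `win_shifts` is my own label for illustration and is not taken from the original analysis.

```python
def win_shifts(wins_by_season):
    """Return the absolute change in wins between consecutive seasons."""
    return [abs(later - earlier)
            for earlier, later in zip(wins_by_season, wins_by_season[1:])]

# Raiders win totals for 2015, 2016 and 2017, as quoted in the text.
raiders = [7, 12, 6]
shifts = win_shifts(raiders)
print(shifts)  # [5, 6]

# An 8-game shift is half of a 16-game schedule; the same share of a
# 162-game baseball schedule is 81 games.
baseball_equivalent = 8 / 16 * 162
print(baseball_equivalent)  # 81.0

# If 22% of team-seasons show a 5+ game shift, a team sees one about
# every 1 / 0.22 seasons on average.
years_between_big_shifts = round(1 / 0.22, 1)
print(years_between_big_shifts)  # 4.5
```

The last calculation shows that the 22% and 4.5-year figures in the list are two views of the same number.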
Why did the Raiders have such a wild up-and-down swing in wins over the span of three years? It would be foolish to ignore the heavy influence that schedule strength plays on a team’s record each year. Although I mentioned some key players that were constant over those three years, the Raiders went through a number of changes to their personnel, as all teams do every year. Injuries (or lack of injuries) are always one of the biggest factors contributing to a team’s record each year. There are countless variables. However, I think there is a very important factor at play, one that is rarely acknowledged, but a factor that I believe is one of the biggest influencers on a team’s record. This article is about that hidden factor.
Experimenting in a vacuum
I began my analysis by flipping a coin. I flipped it 16 times to mirror an NFL season: heads for a win, tails for a loss. My coin went 10–6, and more than likely that team would make the NFL Playoffs (84% of 10–6 teams have made the playoffs since 1978). But everything we know about a coin tells us that, when looked at through the lens of wins and losses, it has a .500 winning percentage. A coin doesn’t change players, miss last-second field goals, change coaches or experience an ever-changing strength of schedule each year. If we flip the coin enough times, the results will approach 50% heads and 50% tails. But 16 flips, the length of an NFL season, is a very small sample. It’s not much of an anomaly for a coin to end up heads or tails 10 or 11 times, instead of eight. In terms of probability, there is literally nothing more average than a coin flip. This is why I chose to use a coin flip to illustrate the random outcomes of a 16-game NFL season.
When I started this analysis, I used an actual coin that landed on heads 10 times and tails six times. I knew that with enough seasons, the most frequent outcome would be 8–8, but I didn’t know what percentage of seasons would be 8–8. I was also curious to know how often a perfectly average 8–8 team would win or lose enough games to have one of the best or worst records in the league (about 12 to 14 wins or losses). Is it even possible for a perfectly average team to win 14 games due to the pure randomness of a 16-game schedule? To find out, I flipped a coin 16,000 times to simulate 1,000 NFL seasons. Okay…I didn’t actually flip a coin as I did at the beginning of my experiment. I simulated 1,000 seasons (16,000 coin flips) using Excel. Bill Parcells once famously said “You are what your record says you are”, a quote often referenced in NFL circles, including in this post by Brian Burke, where he uses a statistically sound approach to the coin-flip analogy. But the results from my experiment in the chart below show that this isn’t true.
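The experiment was run in Excel, but the same idea can be sketched in a few lines of Python. This is an illustrative re-creation under my own assumptions (the function name, the constants and the seed are all hypothetical), so its counts will not match the original run exactly.

```python
import random
from collections import Counter

GAMES_PER_SEASON = 16
SEASONS = 1_000

def simulate_seasons(seasons=SEASONS, games=GAMES_PER_SEASON, seed=None):
    """Simulate win totals for a team that wins each game with probability 0.5."""
    rng = random.Random(seed)
    return [sum(rng.random() < 0.5 for _ in range(games))
            for _ in range(seasons)]

# The seed is arbitrary; it only makes the run repeatable.
win_totals = simulate_seasons(seed=2018)
distribution = Counter(win_totals)

# Print the share of seasons that ended with each possible record.
for wins in range(GAMES_PER_SEASON + 1):
    share = distribution[wins] / SEASONS
    print(f"{wins:2d}-{GAMES_PER_SEASON - wins:<2d} {share:6.1%}")
```

Changing the seed gives a slightly different distribution each time, which is itself a small demonstration of the point: even 1,000 seasons leave some noise in the percentages.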
As expected, a perfectly average team will have an 8–8 record more frequently than any other record. However, they are 8–8 only 21% of the time. Almost 80% of the time their record will be either better or worse than what their true record should be. Why is this important?
It’s important because big decisions are made all the time in sports and business based on results that might simply be influenced by random outcomes due to a small sample size.
Historically in the NFL, six-win teams have changed head coaches 31% of the time. 10-win teams changed coaches only 6% of the time. Think about that…a team is five times more likely to change their coach with six wins as opposed to 10 wins, yet both are equally likely occurrences for an average 8–8 team playing within the limitations of a 16-game schedule. My simulation shows that an average team has about a 46% probability of ending up with a win total in either of these two ranges: 0 to 6 wins or 10 to 16 wins. These two outcomes combine to be a much higher probability than going 8–8. When impacted only by random outcomes, an average theoretical 8–8 team is much less likely to be 8–8 than they are to be faced with the more extreme outcomes of either making the playoffs or firing their coach.
The very average coin-flip team went 13–3 six times; that record would be the best in the NFL 42% of the time. Seventeen times, the coin-flip team went 3–13. Random outcomes won’t make an 8–8 team go 13–3 or 3–13 very often, but it should happen to our perfectly average team once every 37 years. However, with 32 teams in the NFL, we don’t have to wait 37 years for random outcomes to cause a team to win five games above or below their true level. We are seeing this nearly every year; we just don’t know which teams are impacted to this degree by random outcomes as opposed to other reasons.
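The simulated percentages can be cross-checked against the exact binomial distribution for 16 fair coin flips. The sketch below is my own addition, not part of the original Excel analysis. It gives roughly 19.6% for an exact 8–8 record and roughly 45% for finishing with 0 to 6 or 10 to 16 wins, close to the simulated 21% and 46%. Some gap is expected, because a single run of 1,000 seasons still carries sampling noise. The last calculation assumes 32 independent coin-flip teams, which is a simplification: real teams play each other and differ in ability.

```python
from math import comb

GAMES = 16
TEAMS = 32

def prob_wins(wins, games=GAMES):
    """Exact probability that a 50/50 team wins exactly `wins` of `games`."""
    return comb(games, wins) / 2 ** games

# Probability of finishing exactly 8-8.
p_8_8 = prob_wins(8)

# Probability of finishing with 0 to 6 wins or 10 to 16 wins.
p_extreme = (sum(prob_wins(w) for w in range(0, 7))
             + sum(prob_wins(w) for w in range(10, 17)))

# Probability of finishing exactly five games off: 13-3 or 3-13.
p_five_off = prob_wins(13) + prob_wins(3)

# Chance that at least one of 32 independent average teams does so
# in a given season (independence is an assumption made here).
p_any_team = 1 - (1 - p_five_off) ** TEAMS

print(f"Exactly 8-8:            {p_8_8:.1%}")
print(f"0-6 or 10-16 wins:      {p_extreme:.1%}")
print(f"13-3 or 3-13:           {p_five_off:.1%}")
print(f"Any of {TEAMS} teams:        {p_any_team:.1%}")
```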
The table below shows more detail from the coin-flip simulation. I included the actual historical playoff rates and coaching change rates for all possible records (excluding ties). Playoff rates and coaching change rates were added to show how likely the coin-flip team is to end up in the playoffs or fire their coach, both of which would be unlikely outcomes for an actual 8–8 team.
Coin flip simulation of 1,000 seasons for a perfectly average team with no influence from schedule strength or any other factors
A real-life example
Now, I’m going to move from an experimental example to reality. What I like about my coin-flip simulation is that it illustrates randomness without any additional internal or external variables. The only variable is sample size. But in real life, there are many variables that will have a significant influence on a team’s record beyond solely their ability. Below I’m listing real records of real teams in two groups: Group One is Teams A through D, and Group Two is Teams E through H. I think you would agree that the teams in Group One have bad records and the teams in Group Two have good records, and that you’d much rather be associated with Group Two.
- Team A: 1-15
- Team B: 4-12
- Team C: 4-12
- Team D: 6-10
Team A’s record is historically bad. Only twelve teams in the NFL (1.1%) have had a season with only 1 or 0 wins in the 16-game schedule era (16-game era began in 1978). Teams with B and C’s records are probably in rebuilding mode and are in the bottom three or four teams in the league. Team D isn’t very good, but maybe with some luck and a few tweaks, they can be expected to rise to the middle of the pack soon.
- Team E: 11-5
- Team F: 11-5
- Team G: 10-6
- Team H: 10-6
All the teams in Group Two have records that are likely good enough for the playoffs. 11–5 teams in the NFL go to the playoffs 98% of the time (see notes for detail)* and 10–6 teams go to the playoffs 84% of the time.
So, who are these teams? Well — these are not NFL teams. What I have shown you are real 16-game sample sizes of a few Major League Baseball teams from the 2017 season. Below, I’m showing those numbers again, but this time you will see the teams and their results for the entire year.
- Team A: 1-15 Dodgers, 104-58, best record in baseball, National League Champions
- Team B: 4-12 Astros, 101-61, 2nd best record in AL, World Series Champions
- Team C: 4-12 Diamondbacks, 93-69, 3rd best record in NL, Playoffs
- Team D: 6-10 Red Sox, 93-69, 3rd best record in AL, Playoffs
- Team E: 11-5 White Sox, 67-95, 2nd worst record in AL, 35 games out of 1st place
- Team F: 11-5 Braves, 72-90, 6th worst record in NL, 25 games out of 1st place
- Team G: 10-6 Phillies, 66-96, 2nd worst record in NL, 31 games out of 1st place
- Team H: 10-6 Giants, 64-98, worst record in NL, 40 games out of 1st place
Expanding on the 16-game data to show you the full 162-game season demonstrates the impact that a small sample size can have on a team’s record. With the benefit of having the additional information, you would much rather be a fan, player or manager of Group One.
If baseball had the same number of games as the NFL, and these examples happened to be the 16 games played, the Phillies and White Sox would be going to the playoffs. The Astros and Dodgers might be firing their managers instead of facing off in the World Series.
For additional context and more examples, below is a chart showing the eight best and eight worst MLB teams in 2017. Their overall winning percentage, their best 16 games, and their worst 16 games are displayed on the chart. One of the more interesting results from this chart is that the best 16 games of the teams in the bottom of the league are all better than the worst 16 games of the teams at the top of the league.
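The same effect can be reproduced with a short simulation. The sketch below is my own illustration, not the method behind the chart. It assumes every game is an independent coin flip weighted by the team's full-season winning percentage, and it looks only at consecutive 16-game stretches (the article does not say how its 16-game samples were chosen). The function names and the seed are hypothetical.

```python
import random

def best_and_worst_window(results, window=16):
    """Return (most wins, fewest wins) over all consecutive `window`-game stretches."""
    wins_in_window = sum(results[:window])
    best = worst = wins_in_window
    # Slide the window one game at a time: add the new game, drop the oldest.
    for i in range(window, len(results)):
        wins_in_window += results[i] - results[i - window]
        best = max(best, wins_in_window)
        worst = min(worst, wins_in_window)
    return best, worst

def simulate_season(win_prob, games=162, rng=random):
    """Simulate one season as a list of 1 (win) and 0 (loss)."""
    return [1 if rng.random() < win_prob else 0 for _ in range(games)]

rng = random.Random(2017)  # arbitrary seed so the run is repeatable
for wins, losses in [(104, 58), (64, 98)]:
    season = simulate_season(wins / (wins + losses), rng=rng)
    best, worst = best_and_worst_window(season)
    print(f"True {wins}-{losses} team: best 16 games {best}-{16 - best}, "
          f"worst 16 games {worst}-{16 - worst}")
```

The two winning percentages are the best and worst 2017 records listed above. Running this with different seeds gives a feel for how far a 16-game stretch can drift from a team's full-season level.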
Win-Loss records are not predictive in the NFL
Unlike many other sports, the NFL has very little consistency in a team’s record from year to year. There are a few exceptions, such as the Patriots, who are almost always near the top, and the Browns, who are almost always near the bottom. But generally, you can’t look at a team’s record and assume that they will have a similar record the next year. In Chart A below, I have plotted all the records in the NFL since 2010 to find out how one season correlates to the next.
It is not important to know the detail about every data point on the chart, but I have pointed out a few outliers which are highlighted in orange. Take the 2012 Texans as an example (bottom right of Chart A). They had 12 wins in 2012 (year n on the horizontal axis) and only 2 wins in 2013 (year n+1 on the vertical axis).
The important takeaway from this chart is the R-Squared number. The stronger the linear relationship between two sets of data, the closer R-Squared will be to 1.00. The R-Squared in my win comparison analysis is only 0.103. In other words, only 10.3% of the variation in NFL teams’ win totals in one year can be explained by their win totals the previous year.
If wins from one year to the next were more strongly correlated, the dots would be gathered much closer to the orange trend line. For comparison, Chart B uses fake data to show what the chart might look like if there were a high correlation from one year to the next.
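For readers who want to see how an R-Squared like this is calculated, here is a minimal sketch. With a single predictor, R-Squared equals the square of the Pearson correlation between the two columns. The data is made up for illustration and is not the NFL data behind Chart A: one scenario uses pure coin-flip teams, the other gives each hypothetical team a fixed ability that carries over from one year to the next.

```python
import random

def r_squared(xs, ys):
    """Square of the Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov ** 2 / (var_x * var_y)

def season_wins(win_prob, rng, games=16):
    """Wins in one simulated season with a fixed per-game win probability."""
    return sum(rng.random() < win_prob for _ in range(games))

rng = random.Random(0)  # arbitrary seed so the run is repeatable
pairs = 2_000           # hypothetical number of (year n, year n+1) pairs

# Scenario 1: every team is a coin flip in both years.
coin_n = [season_wins(0.5, rng) for _ in range(pairs)]
coin_n1 = [season_wins(0.5, rng) for _ in range(pairs)]

# Scenario 2: each team keeps one fixed, made-up ability across both years.
abilities = [rng.uniform(0.2, 0.8) for _ in range(pairs)]
fixed_n = [season_wins(p, rng) for p in abilities]
fixed_n1 = [season_wins(p, rng) for p in abilities]

r2_coin = r_squared(coin_n, coin_n1)
r2_fixed = r_squared(fixed_n, fixed_n1)
print(f"Coin-flip teams:     R-Squared = {r2_coin:.3f}")
print(f"Fixed-ability teams: R-Squared = {r2_fixed:.3f}")
```

Even in the second scenario, where ability never changes, the 16-game sample keeps R-Squared well below 1.00, because each season's win total still includes a large dose of random variation.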
With the NFL’s 16-game schedule, teams are constantly experiencing records that are way above or below what their theoretical record should be based on their ability. We just don’t know which teams, what years and by how many games their records are impacted by the random nature of a 16-game sample size. Here are a few interesting recent year-to-year win patterns that highlight potential randomness in the NFL:
Are the Jaguars for real in 2017? Maybe this increase in wins is similar to the coin-flip team that inevitably experiences one of those unlikely random seasons at the far right of the win chart. Certainly, the fact that they were tied for the easiest schedule in the NFL had something to do with the spike in wins. It’s probably a combination of both scenarios and more.
What’s going on with the Panthers? Over the past five years, the Panthers’ average absolute annual win shift is 6.2 (the average team is at 2.7). No team is even close to this level of inconsistency (Dallas is the next highest at 5.0). They’ve been steady at the two most important roles since 2012 with Cam Newton at QB and Ron Rivera as the coach. What impact does a small sample size have on the Panthers? Are they a good team having some bad luck or are they a bad team having some good luck?
Are the Saints an 11-win team or are they a 7-win team? Perhaps the answer is neither. Maybe they are a 9-win team. If you remember, our coin-flip team only had the correct theoretical record 21% of the time. Being two wins off in either direction is only slightly less common than a team having its true record.
Throughout NFL history you will find patterns like the ones above. These patterns suggest, among other causes, random outcomes influenced by sample size.
There are countless moving parts and pieces impacting an NFL team. The many internal and external variables will influence a team’s record from year to year. They also make it hard to attribute the portion of a team’s record that is due to the random nature of a 16-game schedule. But what we do know from this analysis is that random variability in the NFL is extremely influential. My coin-flip experiment shows that, in a vacuum, a team’s actual record is usually inconsistent with their theoretical ability-based record; this is true with no variables other than a 16-game season. The baseball example shows that no team over the course of a 162-game season will have an overall season winning percentage even remotely close to their best or worst 16-game sample within the season. Even the very best teams can look like one of the worst teams over a very small number of games.
We should be mindful of sample size in sports or any field that is heavily dependent on data. Variability caused by random outcomes will often influence our strategies. Perhaps we change a course of action that shouldn’t be changed. Maybe we stick to a strategy that isn’t ideal because random outcomes have pushed our data to a more favorable result. In the small sample size of an NFL season, teams are rarely what their record says they are, yet decisions are often tied to these results.
Notes and References:
Links to the data sources:
Pro Football Statistics and History | Pro-Football-Reference.com
MLB Stats, Scores, History, & Records | Baseball-Reference.com
*Additional detail on the statement that 11–5 teams in the NFL go to the playoffs 98% of the time: since 1978 there have been 97 teams with an 11–5 record, and all but two of them went to the playoffs. In 1985 the Denver Broncos went 11–5 but finished second in their division and lost the Wild Card tiebreaker to two other 11–5 teams. In 2008 the Patriots tied for the best record in their division but lost the division title on a tiebreaker. They also missed the Wild Card, which went to a 12–4 team and to another 11–5 team that won the tiebreaker.
Follow me on Twitter for more data visualizations and analysis — mostly sports.