This American Data Set, Act III: Weather and the NFL

September 15, 2015 by Paulo Nascimento

[Editor’s note: As a bunch of data geeks, we always enjoy getting our hands dirty exploring interesting data. This is the third of a three-part series on data sets with a story to tell; check out part one and part two. Also, you can find the source data here.]

When looking at how well NFL teams perform, we often talk about everything from offensive formations to coaching and personnel to properly inflated footballs. But what about external factors beyond a team’s control – are there ever any scenarios where the cards are stacked?

Of the possible external factors that might tip the scale, one candidate tops the list: the weather. Because weather can be so wildly unpredictable, though, it’s necessary to look at trends over time to come up with any meaningful analysis. Fortunately, with the NFL’s history, there are 54 years’ worth of weather data we can crunch.

Let’s see whether (!) team performance can be meaningfully affected by a change in environmental factors.

Winning Isn’t Everything

Before digging into the data, first a brief note about what counts as “performance”. The most important metric in football, of course, is winning. And winning games is not strictly dependent on how many points your team scores – after all, scores like 43-8, 3-0, and 28-24 all give you a “W”.

The factors that go into winning are more conceptual than statistical – how well your coach adapts to the opponent’s game plan, how your players perform, whether your kicker hits that game-tying 50-yarder, etc. And so, measuring any correlation between winning games and a set of statistics like weather is pretty much pointless.

But we can measure a correlation between points scored and weather and see if different conditions affect how the game is played. Weather can be quantified, so I can get some numerical answer from this, statistically significant or otherwise.

And naturally, I excluded indoor games from the analysis.
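For readers who want to try this kind of check themselves, here is a minimal sketch in Python of correlating total points with one weather variable after dropping indoor games. The analysis in this post was done in Stata, so this is an illustration rather than the actual workflow; the field names (`total_points`, `wind_mph`, `indoor`) and the handful of games are made up.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up games for illustration only; these are NOT the article's data.
games = [
    {"total_points": 51, "wind_mph": 3,  "indoor": False},
    {"total_points": 44, "wind_mph": 8,  "indoor": False},
    {"total_points": 38, "wind_mph": 14, "indoor": False},
    {"total_points": 30, "wind_mph": 21, "indoor": False},
    {"total_points": 47, "wind_mph": 0,  "indoor": True},  # dropped below
]

# Keep outdoor games only, then correlate wind speed with total points.
outdoor = [g for g in games if not g["indoor"]]
r_wind = pearson(
    [g["wind_mph"] for g in outdoor],
    [g["total_points"] for g in outdoor],
)
print(f"r(wind, total points) = {r_wind:.3f}")
```

A value near -1 or +1 means a strong linear relationship and a value near 0 means a weak one. With real game data, expect something much closer to 0 than this toy sample produces.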

Show Me the Data

Starting out, I should briefly identify a trend that has led to the average points-per-game (PPG) increasing over the years:

https://plot.ly/~pnascimento/14/average-total-points-scored-per-game-from-1960-to-2013/

Because the number of teams, and with it the number of games played each season, has increased over the years, it’s necessary to average points on a per-game basis to balance this out. I created this graph so we can account for any possible bias from changing league rules when we look at how weather has affected performance.
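The per-season averaging behind a chart like this can be sketched in a few lines of Python. This is a sketch under assumptions, not the code used for the graph: the `(season, total points)` layout and the sample values are made up for illustration.

```python
from collections import defaultdict

# Made-up (season, total points scored in one game) pairs; NOT real data.
games = [
    (1960, 38), (1960, 44),
    (2013, 45), (2013, 51), (2013, 48),
]

def average_ppg_by_season(rows):
    """Return {season: mean total points per game}, ordered by season."""
    totals = defaultdict(lambda: [0, 0])  # season -> [points sum, game count]
    for season, points in rows:
        totals[season][0] += points
        totals[season][1] += 1
    return {season: s / n for season, (s, n) in sorted(totals.items())}

ppg = average_ppg_by_season(games)
print(ppg)
```

Dividing by the number of games in each season is what keeps a season with more teams, and therefore more games, from looking higher-scoring than it really was.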

After I downloaded the data, I cleaned it in Flex.io – sorted the dates, deleted some extraneous columns – then exported to Stata and started running some regressions. Here are my results:

And the correlation data:

[Table: correlation of total score with wind, humidity and temperature]

Note the positive coefficient for temperature, and the negative coefficients for humidity, wind chill, and wind speed. We see the same signs reflected in the Pearson correlation table. This matches general intuition about how different weather conditions affect play.

Also note the exceedingly low coefficient for temperature. To expect one additional point scored, we need to see a 52-degree increase in temperature. Surprisingly, the temperature coefficient was not statistically significant when total score was regressed on temperature, humidity, and wind speed simultaneously. It was also negative in that regression, as opposed to positive when temperature was the only regressor.
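To make the arithmetic concrete: a regression coefficient is "points per degree", so "52 degrees per extra point" is just its reciprocal. The Python sketch below shows that conversion along with a plain one-variable least-squares slope. The temperature and score values are made up, and `ols_slope` is a hand-rolled helper for illustration, not the Stata routine used for the results in this post.

```python
def ols_slope(xs, ys):
    """Slope b of the least-squares line y = a + b*x (one regressor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def degrees_for_extra_points(coefficient, extra_points=1.0):
    """Temperature change needed for `extra_points` more expected points."""
    return extra_points / coefficient

# From the text: about a 52-degree rise per additional point,
# which implies a coefficient of roughly 1/52 points per degree.
implied_coefficient = 1 / 52
print(f"implied coefficient: {implied_coefficient:.4f} points per degree")

# Made-up temperatures (F) and total scores, for illustration only.
temps = [30, 50, 70, 90]
points = [42.0, 42.4, 42.8, 43.2]
slope = ols_slope(temps, points)
print(f"toy slope: {slope:.3f} points per degree")
print(f"degrees per extra point: {degrees_for_extra_points(slope):.0f}")
```

Note that this only covers the isolated, one-variable case. When temperature shares a regression with correlated variables such as humidity and wind speed, its coefficient can shrink or change sign, which is the behavior described above.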

The Results

Overall, here’s what this analysis tells us:

  • Wind speed affects total PPG the most, followed by wind chill; temperature and humidity bring up the rear
  • Temperature is the only metric that affects points scored positively – all others have a negative effect
  • However, temperature becomes insignificant when all metrics are considered together

I was also surprised to learn that temperature had a negative effect and a higher coefficient when a game took place indoors (which I identified by running the regression only on games where wind speed had a missing value). One possible explanation for this is the need for more air conditioning to be pumped into a dome to offset a higher external temperature.

Finally, when I separated the data and regressed on outdoor games only, temperature became irrelevant. This fits with our previous regression that showed temperature is a negligible factor when we regressed on humidity and wind at the same time.
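The indoor/outdoor split described above can be sketched in Python as follows. It assumes, as in the analysis, that a missing wind speed marks an indoor game; the field names and the sample rows are hypothetical.

```python
# Made-up rows for illustration only; wind_mph is None when no wind
# reading was recorded, which is taken here as a proxy for "indoor".
games = [
    {"game_id": 1, "temp_f": 35, "wind_mph": 12,   "total_points": 37},
    {"game_id": 2, "temp_f": 72, "wind_mph": None, "total_points": 48},
    {"game_id": 3, "temp_f": 61, "wind_mph": 5,    "total_points": 44},
    {"game_id": 4, "temp_f": 88, "wind_mph": None, "total_points": 41},
]

def is_indoor(game):
    """Assumption: a missing wind reading means the game was played indoors."""
    return game["wind_mph"] is None

indoor = [g for g in games if is_indoor(g)]
outdoor = [g for g in games if not is_indoor(g)]

print("indoor games: ", [g["game_id"] for g in indoor])
print("outdoor games:", [g["game_id"] for g in outdoor])
```

Each subset can then be regressed separately. One caveat with this proxy: an outdoor game whose wind reading simply went unrecorded would be misfiled as indoor, so it is worth spot-checking against stadium information if that is available.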

One Factor Among Many

Weather has been known to affect the outcomes of NFL games (the Ice Bowl being a notable example), but this often comes down to coaching, and the effects on team performance are pretty small. Sure, the coefficients are all less than 0.5 – but they are all significant when isolated, which means weather does have a role to play.

Whether weather has any effects on PSI levels in footballs… that may be another story.