This American Data Set, Act III: Weather and the NFL
[Editor’s note: As a bunch of data geeks, we always enjoy getting our hands dirty exploring interesting data. This is the third of a three-part series on data sets with a story to tell; check out part one and part two. Also, you can find the source data here.]
When looking at how well NFL teams perform, we often talk about everything from offensive formations to coaching and personnel to properly inflated footballs. But what about external factors beyond a team’s control – are there ever any scenarios where the cards are stacked?
Of the possible external factors that might tip the scale, one candidate tops the list: the weather. Because weather can be so wildly unpredictable, though, it’s necessary to look at trends over time to come up with any meaningful analysis. Fortunately, with the NFL’s history, there are 54 years’ worth of weather data we can crunch.
Let’s see whether (!) team performance can be meaningfully affected by a change in environmental factors.
Before digging into the data, first a brief note about what counts as “performance”. The most important metric in football, of course, is winning. And winning games is not strictly dependent on how many points your team scores – after all, scores like 43-8, 3-0, and 28-24 all give you a “W”.
The factors that go into winning are more conceptual than statistical – how well your coach adapts to the opponent’s game plan, how your players perform, whether your kicker hits that game-tying 50-yarder, etc. And so, measuring any correlation between winning games and a set of statistics like weather is pretty much pointless.
But we can measure a correlation between points scored and weather and see if different conditions affect how the game is played. Weather can be quantified, so I can get some numerical answer from this, statistically significant or otherwise.
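For readers who want to try this kind of measurement themselves, here is a minimal sketch of a Pearson correlation in plain Python. The function name and the numbers are made up purely for illustration; they are not the source data behind this post, and this is not necessarily how the original analysis was run.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy values, NOT real NFL data.
wind_speed = [0, 5, 10, 15, 20]
total_points = [48, 45, 44, 40, 37]

r = pearson_r(wind_speed, total_points)
print(f"Pearson r between wind speed and points: {r:.3f}")
```

A value near -1 or +1 means the two columns move together closely; a value near 0 means little linear relationship.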
And naturally, I excluded indoor games from the analysis.
Starting out, I should briefly identify a trend that has led to the average points-per-game (PPG) increasing over the years:
Because the number of teams has increased over the years, it’s necessary to average the PPG metric to balance this out. And so, I created this graph to account for any possible bias from the changing league rules when we look at how weather has affected performance.
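As a rough sketch of what averaging PPG by season could look like, the snippet below groups games by season and divides combined points by the number of games played. It assumes "PPG" means the combined score of both teams in a game, and the record layout and values are hypothetical.

```python
from collections import defaultdict


def average_ppg_by_season(games):
    """Return {season: mean combined points per game}.

    `games` is an iterable of (season, home_points, away_points) tuples.
    This layout is an assumption made for the example.
    """
    totals = defaultdict(lambda: [0, 0])  # season -> [points, game count]
    for season, home_points, away_points in games:
        totals[season][0] += home_points + away_points
        totals[season][1] += 1
    return {season: pts / n for season, (pts, n) in sorted(totals.items())}


# Hypothetical toy records, NOT real NFL results.
toy_games = [
    (1970, 14, 10),
    (1970, 21, 17),
    (2010, 27, 24),
    (2010, 31, 20),
    (2010, 17, 13),
]

ppg = average_ppg_by_season(toy_games)
for season, value in ppg.items():
    print(season, round(value, 1))
```

Dividing by the number of games is what keeps a season with more teams (and therefore more games) from looking higher-scoring just because more football was played.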
And the correlation data:
Note the positive coefficients for temperature, and negative coefficients for humidity, wind chill, and wind speed. We see the signs for these values reflected in the Pearson coefficient table. This resonates with general intuition on the effects of different weather phenomena.
Also note the exceedingly low coefficient for temperature: to expect one additional point scored, we would need to see a 52-degree increase in temperature. Surprisingly, the temperature coefficient was not statistically significant when regressed together with humidity and wind speed. Its sign also flipped to negative, whereas it was positive when temperature was regressed on its own.
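To make the "52 degrees per point" figure concrete, here is the arithmetic behind it as a tiny sketch. The only input taken from the text is the number 52; the variable and function names are illustrative.

```python
# The text says roughly 52 degrees of warming are needed per extra point,
# so the implied regression slope is the reciprocal of that figure.
degrees_per_point = 52
implied_slope = 1 / degrees_per_point  # points per degree, roughly 0.019


def expected_point_change(temp_change_degrees):
    """Expected change in points for a given change in temperature."""
    return implied_slope * temp_change_degrees


print(f"Implied slope: {implied_slope:.4f} points per degree")
print(f"A 10-degree swing: {expected_point_change(10):.2f} points")
```

In other words, even a 10-degree difference between two games corresponds to only about a fifth of a point under this coefficient.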
Overall, here’s what this analysis tells us:
- Wind speed affects total PPG the most, followed by wind chill; temperature and humidity bring up the rear
- Temperature is the only metric that affects points scored positively; all others have a negative effect
- However, temperature becomes insignificant when all metrics are considered together
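A coefficient that is positive on its own but negative alongside other variables often shows up when the predictors are correlated with each other. Whether that is what happened in this data set is not something the analysis above establishes, so treat the following as a sketch of the general mechanism only. The data is synthetic and built so that warmer games are calmer; all names are hypothetical.

```python
def simple_slope(xs, ys):
    """Least-squares slope of ys regressed on xs alone."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def two_predictor_slopes(x1, x2, ys):
    """Least-squares slopes of ys regressed on x1 and x2 together."""
    n = len(ys)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(ys) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in ys]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("predictors are perfectly collinear")
    # Solve the 2x2 centered normal equations with Cramer's rule.
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


# Synthetic data, NOT real NFL data: wind falls as temperature rises,
# and points are built with a small NEGATIVE temperature effect.
temp = [20, 30, 40, 50, 60, 70, 80, 90]
wind_wiggle = [2, -1, 3, -2, 1, -3, 2, -2]
wind = [30 - 0.3 * t + e for t, e in zip(temp, wind_wiggle)]
points = [50 - 1.0 * w - 0.05 * t for t, w in zip(temp, wind)]

temp_alone = simple_slope(temp, points)
temp_joint, wind_joint = two_predictor_slopes(temp, wind, points)
print(f"temperature alone:     {temp_alone:+.3f}")
print(f"temperature with wind: {temp_joint:+.3f}")
print(f"wind with temperature: {wind_joint:+.3f}")
```

On its own, temperature soaks up credit for the calmer conditions that come with it, so its slope looks positive. Once wind is in the model, the slope on temperature reverts to the small negative effect the toy data was built with.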
I was also surprised to learn that temperature had a negative effect and a higher coefficient when a game took place indoors (which I identified by regressing only on games where wind speed had a missing value). One possible explanation for this is that a dome needs more air conditioning to offset a higher outside temperature.
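The "missing wind speed means indoors" rule can be sketched as a simple split. The dictionary keys below are hypothetical, and the rows are placeholders rather than real games.

```python
def split_by_venue(games):
    """Split games into (indoor, outdoor) using a missing wind speed as the
    indoor marker. A wind speed of 0 is a real reading, so it counts as
    outdoor; only None is treated as missing.
    """
    indoor = [g for g in games if g["wind_speed"] is None]
    outdoor = [g for g in games if g["wind_speed"] is not None]
    return indoor, outdoor


# Placeholder rows, NOT real NFL data.
toy_games = [
    {"temperature": 72, "wind_speed": None, "points": 51},
    {"temperature": 35, "wind_speed": 12, "points": 37},
    {"temperature": 60, "wind_speed": 0, "points": 44},
]

indoor_games, outdoor_games = split_by_venue(toy_games)
print(len(indoor_games), "indoor,", len(outdoor_games), "outdoor")
```

Checking for `None` explicitly, rather than relying on truthiness, avoids misfiling a calm outdoor day as a dome game.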
Finally, when I separated the data and regressed on outdoor games only, temperature became irrelevant. This fits with our previous regression that showed temperature is a negligible factor when we regressed on humidity and wind at the same time.
Weather has been known to affect the outcomes of NFL games (a notable one being the Ice Bowl), but this often comes down to coaching, and the effects on team performance are pretty small. Sure, the coefficients are all less than 0.5, but they are all significant when isolated, which means weather does have a role to play.
Whether weather has any effects on PSI levels in footballs… that may be another story.