Project summary


Conclusion

We achieved some success using nflverse data for machine learning, which shows that the approach is feasible. Going one step further to build a more interesting model is not out of the question, but it would require more effort than this POC allowed.

Supporting Evidence

Experiment one:

Run using AWS SageMaker Studio

Objective:

Predict the outcome of play calls (e.g. pass, rush, punt, field goal) based on the down, distance, and field position, with yards_gained and points_gained as the target columns. The data would be at the most granular level of a single play.

Steps:

Start with a simple model, such as logistic regression, to see if we can predict the yards gained based on the play call in a given situation (e.g. 4th down and 1 yard to go, 3rd down and 10 yards to go). We'll also use an AutoML ensemble model to see if we can get better results.
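A minimal sketch of the kind of baseline described above, using scikit-learn on synthetic play-level data. The feature names and the binarized target here are illustrative assumptions, not the actual nflverse schema or the notebook's code:

```python
# Baseline sketch: logistic regression over down/distance/field position.
# Synthetic rows stand in for nflverse play-by-play data; all column
# names and the toy target are assumptions for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
down = rng.integers(1, 5, size=n)        # 1st-4th down
distance = rng.integers(1, 20, size=n)   # yards to go
yardline = rng.integers(1, 100, size=n)  # field position
X = np.column_stack([down, distance, yardline])

# Toy target: binarized "gained positive yards", loosely tied to distance.
y = (rng.normal(1.0 - 0.15 * distance) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
```

On real play-by-play data the same shape of pipeline applies: select the situational columns, encode the play call, and fit a linear baseline before reaching for AutoML.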

Results:

Although the data looked good, a full sweep of AutoML ensemble models did not produce any results better than a random guess. We were not able to predict the yards gained from team stats and a given play call, and could not think of another objective target.

Analysis:

In retrospect, it's possible that we curated the data too well and that, as in experiment 2, we should have curated less and let the network figure out how the different features contributed to the outcome. We could have spent more time creating better data and finding a better model, but we decided to move on to the next experiment to see if there was a quick hit using a classification model.

Experiment two:

Experiment 2 notebook

Objective:

Predict simple win/loss based on the features we identified in the feature engineering phase.

Steps:

Aggregate the data to the game level and see if we can predict wins and losses. We use seasons 2016-2021 to predict the 2022 season.

  • Initially we had a multi-class classification model with 3 classes (1 for wins, 0 for ties, and -1 for losses), but we found that the model was not learning the -1 class. We had also curated the data too much: given a defensive stat and an offensive stat, we might take the difference to produce one feature that was neither defense nor offense. This produced a model with very high recall but only 65% precision.

  • We then removed a lot of the data curation and let the model figure out how to use the features. This produced a model with 87% precision and 85% recall.
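The over-curation described above can be shown with a toy pandas example (the column names are hypothetical, not the engineered nflverse features): collapsing an offensive and a defensive stat into their difference discards information the model could otherwise weight itself.

```python
# Toy illustration of the curation trade-off described above: one
# differenced feature vs. the two raw columns. Column names are
# hypothetical placeholders for the engineered game-level stats.
import pandas as pd

games = pd.DataFrame({
    "off_yards_per_play": [5.8, 4.9, 6.2],
    "def_yards_per_play_allowed": [5.1, 5.5, 4.8],
})

# Over-curated: a single margin feature, neither offense nor defense.
curated = (games["off_yards_per_play"]
           - games["def_yards_per_play_allowed"]).rename("yards_margin")

# Less curated: keep both columns and let the model learn the weights.
raw = games[["off_yards_per_play", "def_yards_per_play_allowed"]]
```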

Results:

We were able to predict wins and losses with 85%+ accuracy using a simple neural network binary classification model (1 for wins and ties, and 0 for losses).
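As a sketch of what such a binary classifier might look like, here is a small scikit-learn MLPClassifier standing in for whatever network the notebook actually used; the game-level features and the label rule are synthetic placeholders:

```python
# Sketch of a small neural-network binary classifier for win (1) vs.
# loss (0). Synthetic per-game aggregates stand in for the engineered
# nflverse features; names and the label rule are assumptions.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n_games = 600
off_epa = rng.normal(0.0, 0.1, size=n_games)  # offensive EPA/play (hypothetical)
def_epa = rng.normal(0.0, 0.1, size=n_games)  # defensive EPA/play allowed
turnovers = rng.integers(0, 5, size=n_games)
X = np.column_stack([off_epa, def_epa, turnovers])

# Toy label: better EPA margin and fewer turnovers -> more likely a win.
y = (off_epa - def_epa - 0.05 * turnovers
     + rng.normal(0, 0.05, n_games) > 0).astype(int)

X_scaled = StandardScaler().fit_transform(X)
net = MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=2000, random_state=0)
net.fit(X_scaled, y)
train_accuracy = net.score(X_scaled, y)
```

In the real experiment the split was temporal (train on 2016-2021, evaluate on 2022), which matters more than the particular network architecture.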

Analysis:

The validation loss, accuracy, F1 score, confusion matrix, and ROC scores all look good. It's likely that the nflverse features that produced the best model are actually masking other, more fundamental properties that are not available in the data. Whether it's worth trying to find better data is an open issue.
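The metrics named above can all be computed with scikit-learn utilities; a small worked example on made-up labels and scores (not the actual experiment outputs):

```python
# Worked example of the evaluation metrics listed above, on made-up
# predictions (not the real model's outputs).
from sklearn.metrics import (accuracy_score, f1_score,
                             confusion_matrix, roc_auc_score)

y_true  = [1, 0, 1, 1, 0, 1, 0, 0]                    # actual win/loss
y_pred  = [1, 0, 1, 0, 0, 1, 1, 0]                    # hard 0/1 predictions
y_score = [0.9, 0.1, 0.8, 0.4, 0.2, 0.7, 0.6, 0.3]    # predicted P(win)

acc = accuracy_score(y_true, y_pred)   # 0.75
f1 = f1_score(y_true, y_pred)          # 0.75
cm = confusion_matrix(y_true, y_pred)  # [[3, 1], [1, 3]]
auc = roc_auc_score(y_true, y_score)   # 0.9375
```

Note that ROC/AUC needs the predicted probabilities (`y_score`), while the other metrics operate on the thresholded predictions.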
