Project structure
The project is structured as follows:
source for EDA jobs in the root of the src/ directory:
nfl_00_load_nflverse_data.py - downloads the nflverse data into local storage
nfl_01_build_nfl_database.py - builds the nfl database from the nflverse data
nfl_02_prepare_weekly_stats.py - merges metrics from all datasets into a single stats dataset
nfl_03_perform_feature_selection.py - performs feature selection on the nfl data
nfl_04_merge_game_features.py - merges our features with the core nfl play actions dataset
nfl_main.py - orchestrates the jobs above
config.py - configuration file for the project
3 notebooks in the root of the notebooks/ directory:
nfl_load_nflverse_data_demo.ipynb - demos manually running the load and build jobs (nfl_00 - 01)
nfl_perform_feature_selection_demo.ipynb - demos manually running the weekly stats, feature selection, and feature merge jobs (nfl_02 - 04)
nfl_win_loss_classification_experiment2.ipynb - experiment 2 notebook
ML models in the root of the models/ directory
documentation in the doc/ directory
schemas in the schemas/ directory - used during loading to validate the incoming data from nflverse
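The numeric prefixes on the job files in src/ (nfl_00 through nfl_04) suggest the order in which the jobs are meant to run. The sketch below is not the actual nfl_main.py; it only illustrates, under that assumption, how the run order could be derived from the file names listed above. The helpers job_step and ordered_jobs are hypothetical names introduced for this example.

```python
import re
from pathlib import Path
from typing import Iterable, List, Optional

# Matches numbered job files such as "nfl_02_prepare_weekly_stats.py".
# Assumption: the two-digit prefix encodes the intended run order.
_JOB_PATTERN = re.compile(r"^nfl_(\d{2})_\w+\.py$")


def job_step(filename: str) -> Optional[int]:
    """Return the numeric step of a job file, or None if it is not a numbered job."""
    match = _JOB_PATTERN.match(Path(filename).name)
    return int(match.group(1)) if match else None


def ordered_jobs(filenames: Iterable[str]) -> List[str]:
    """Keep only numbered job files and sort them by their step number."""
    numbered = [(job_step(name), name) for name in filenames]
    return [name for step, name in sorted(
        (step, name) for step, name in numbered if step is not None
    )]


if __name__ == "__main__":
    # File names taken from the src/ listing above, deliberately shuffled.
    src_files = [
        "nfl_main.py",
        "nfl_03_perform_feature_selection.py",
        "config.py",
        "nfl_00_load_nflverse_data.py",
        "nfl_04_merge_game_features.py",
        "nfl_01_build_nfl_database.py",
        "nfl_02_prepare_weekly_stats.py",
    ]
    for job in ordered_jobs(src_files):
        print(job)
```

Files without a numeric prefix, such as nfl_main.py and config.py, are skipped, so only the five jobs are returned. How the real orchestrator invokes each job (import, subprocess, or otherwise) is not stated here and is left out of the sketch.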