
Data documentation

Open tanho63 opened this issue 4 years ago • 2 comments

It would be great if we had consistent data documentation available within the package. At the moment, the following data functions are missing in-package dictionaries:

  • [x] load_trades (#75 - thank you, @pranavrajaram)
  • [x] load_injuries (#75 - thank you, @pranavrajaram)
  • [x] load_depth_charts (#75 - thank you, @pranavrajaram)
  • [x] load_espn_qbr (#74 - thank you, @pranavrajaram)
  • [x] load_combine (#75 - thank you, @pranavrajaram)
  • [ ] load_player_stats - kicking
  • [ ] load_player_stats - defense (#192 - @mpcen)
  • [ ] load_pfr_advstats revisit (pass, rush, rec, def)
  • [ ] load_participation

Missing fields in existing dictionaries, as per the comments below:

  • [x] schedules (#94)
  • [x] snap counts (#94)

tanho63 • Aug 09 '21 15:08

Data docs are an ongoing battle! Along with the missing dictionaries listed above, here are some fields missing from existing dictionaries:

R> load_schedules() |>
+++   dict_check(dictionary_schedules)
    old                | new                         
                       - "alt_game_id" [1]           
[1] "away_coach"       | "away_coach"  [2]           
[2] "away_moneyline"   -                             
[3] "away_qb_id"       -                             
[4] "away_qb_name"     -                             
[5] "away_rest"        -                             
[6] "away_score"       | "away_score"  [3]           
[7] "away_spread_odds" -                             
[8] "away_team"        | "away_team"   [4]           
[9] "div_game"         -                             
... ...                  ...           and 3 more ...

     old                | new                         
[14] "gametime"         | "gametime"   [9]            
[15] "gsis"             | "gsis"       [10]           
[16] "home_coach"       | "home_coach" [11]           
[17] "home_moneyline"   -                             
[18] "home_qb_id"       -                             
[19] "home_qb_name"     -                             
[20] "home_rest"        -                             
[21] "home_score"       | "home_score" [12]           
[22] "home_spread_odds" -                             
[23] "home_team"        | "home_team"  [13]           
 ... ...                  ...          and 10 more ...

     old           | new                         
[34] "season"      | "season"      [18]          
[35] "spread_line" | "spread_line" [19]          
[36] "stadium"     | "stadium"     [20]          
[37] "stadium_id"  -                             
[38] "surface"     | "surface"     [21]          
[39] "temp"        | "temp"        [22]          
[40] "total"       | "total"       [23]          
[41] "total_line"  | "total_line"  [24]          
[42] "under_odds"  -                             
[43] "week"        | "week"        [25]          
 ... ...             ...           and 2 more ...
R> load_snap_counts() |>
+++   dict_check(dictionary_snap_counts)
     old             | new                           
 [1] "defense_pct"   | "defense_pct"   [1]           
 [2] "defense_snaps" | "defense_snaps" [2]           
 [3] "game_id"       | "game_id"       [3]           
 [4] "game_type"     -                               
 [5] "offense_pct"   | "offense_pct"   [4]           
 [6] "offense_snaps" | "offense_snaps" [5]           
 [7] "opponent"      -                               
 [8] "pfr_game_id"   | "pfr_game_id"   [6]           
 [9] "pfr_player_id" | "pfr_player_id" [7]           
[10] "player"        | "player"        [8]           
 ... ...               ...             and 6 more ...

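Here, old refers to the actual dataframe, while new refers to the current dictionary. A field that appears only under old (marked with -) exists in the data but is missing from the dictionary; a field that appears only under new, such as alt_game_id, is in the dictionary but not in the data. We want to make sure that the dictionary (new) matches the data (old).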

To rerun these checks, see data-raw/dictionary_check.R.
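
For anyone who wants the gist without opening that script, the comparison can be sketched in base R roughly as below. This is only an illustration, not the package's dict_check(): the helper name find_dictionary_gaps(), the toy data, and the assumption that a dictionary stores its field names in a column called field are all hypothetical.

```r
# Toy stand-in for a loaded dataset (hypothetical values, not real nflverse data).
schedules <- data.frame(
  game_id = "example_game",
  away_team = "AAA",
  away_rest = 7L,
  stringsAsFactors = FALSE
)

# Toy stand-in for a dictionary; assumes field names live in a `field` column.
dictionary <- data.frame(
  field = c("game_id", "alt_game_id", "away_team"),
  description = c("Game identifier", "Alternate game identifier", "Away team"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: report fields that are in the data but not in the
# dictionary, and fields that are in the dictionary but not in the data.
find_dictionary_gaps <- function(data, dictionary) {
  list(
    undocumented = setdiff(names(data), dictionary$field),
    stale = setdiff(dictionary$field, names(data))
  )
}

gaps <- find_dictionary_gaps(schedules, dictionary)
print(gaps)
```

In this sketch, undocumented corresponds to the lines shown only under old in the output above, and stale corresponds to the lines shown only under new.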

tanho63 • Mar 17 '22 02:03

Not listed, but https://github.com/nflverse/nflreadr/pull/192 takes care of playerstats_def.

mpcen • Aug 04 '23 15:08