
Scrape Team-level "Batting Against" stats from Baseball-Reference

Open TK2575 opened this issue 2 years ago • 6 comments

  • This PR adds a function, related tests and documentation for capturing team-level "batting against" stats from Baseball-Reference.
  • The page this function scrapes includes both team-level and player-level tables, though the scope of this PR is only team-level.
  • I attempted to follow the same implementation pattern as other functions that use the BRef session singleton.
  • I find this data useful for evaluating pitching performance in the same terms as batting performance.
  • Looking at other functions, I don't see that this repo takes a position on reporting team names in a consistent format, so this function simply reports the full team name as shown on the source page (the link text only, without the href URL).
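
As a rough illustration of the team-name handling described above, here is a minimal sketch that reads a table and keeps only the visible link text for each team. It is not the code in this PR: the `TeamRowParser` class, the sample markup, the team names, and the column labels are all hypothetical, and the real Baseball-Reference page may be structured differently. It uses only the standard library so it can run without network access.

```python
from html.parser import HTMLParser


class TeamRowParser(HTMLParser):
    """Collect the text of each table cell, ignoring tag attributes such as href."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # cells of the row currently being parsed
        self._cell = None  # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text inside nested tags (such as <a>) arrives here without attributes.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical sample markup (assumption): made-up teams, paths, and values.
SAMPLE_HTML = """
<table>
  <tr><th>Tm</th><th>PA</th></tr>
  <tr><td><a href="/teams/AAA/2022.shtml">Example City Aces</a></td><td>6000</td></tr>
  <tr><td><a href="/teams/BBB/2022.shtml">Sample Town Bears</a></td><td>6100</td></tr>
</table>
"""

parser = TeamRowParser()
parser.feed(SAMPLE_HTML)

# First row is the header; the rest become one dict per team.
header, *body = parser.rows
records = [dict(zip(header, row)) for row in body]
```

In this sketch each record carries the full team name exactly as displayed, and the link target never enters the output, which mirrors the behavior the last bullet describes.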

TK2575 avatar Feb 27 '23 03:02 TK2575

Can anyone help debug or re-run the failed check on this PR? It looks like python -m scripts.statcast_timing took longer than 30 seconds, which caused the failure. I don't know how relevant this is, but I haven't been able to reproduce a local run longer than 18 seconds. Moreover, my changes shouldn't use or affect any statcast calls, which is why I was hoping to see how a second run performs. However, I don't see the re-run option in any of the places the GitHub docs describe, which probably means I don't have permission to do so.

TK2575 avatar Feb 27 '23 03:02 TK2575

Yeah, that one just kind of trips randomly sometimes when scraping takes longer than usual. Just manually reran.
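
For context on why a check like this can fail intermittently, here is a minimal sketch of a wall-clock budget check. It is not the repo's `scripts.statcast_timing` module: `run_within_budget` and `stand_in_task` are hypothetical names, and the stand-in task does no scraping. The 30-second budget is the only figure taken from this thread.

```python
import time


def run_within_budget(task, budget_seconds):
    """Run task() and return (result, elapsed_seconds, within_budget)."""
    start = time.perf_counter()
    result = task()
    elapsed = time.perf_counter() - start
    return result, elapsed, elapsed <= budget_seconds


def stand_in_task():
    # Placeholder (assumption) for the real network-bound work.
    return sum(range(1000))


result, elapsed, ok = run_within_budget(stand_in_task, budget_seconds=30.0)
```

Because the measured time includes whatever the remote site and the CI runner add on a given day, the same code can pass locally and still exceed the budget in CI, which is consistent with a re-run being enough to clear it.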

tjburch avatar Feb 27 '23 04:02 tjburch

I'd have a few other years tested

I'm glad you mentioned this - trying some older seasons revealed data patterns that differ from what the newer seasons led me to expect. I'll make the appropriate adjustments.

TK2575 avatar Mar 10 '23 04:03 TK2575

Thank you @tjburch for your thorough review - changes pushed and ready for re-review.

TK2575 avatar Mar 10 '23 05:03 TK2575

@tjburch looks like the timing test needs a re-run again

TK2575 avatar Mar 10 '23 13:03 TK2575

Switched to draft until I address the team name format diff

TK2575 avatar Mar 28 '23 13:03 TK2575