Fix League Data View Query Issues
Overview
This issue addresses two critical query generation problems in the league data views system that impact performance and data accuracy:
- From-Table Optimization Missing Scoring Format Hash: Fantasy scoring columns can't be used as from-tables due to missing scoring format hash conditions
- Duplicate Rows in Year Offset Queries: Queries with year_offset ranges and splits produce duplicate rows
Problem Details
1. From-Table Optimization Scoring Format Hash
Issue: The from-table optimization feature currently only supports player_fantasy_points_from_plays due to missing scoring format hash conditions. When scoring format columns like player_fantasy_points_from_seasonlogs are used as the from-table, the required scoring_format_hash join conditions are not properly applied to the base query.
Affected Columns:
-
player_fantasy_points_from_seasonlogs -
player_fantasy_points_per_game_from_seasonlogs -
player_fantasy_points_rank_from_seasonlogs -
player_fantasy_points_position_rank_from_seasonlogs - All
scoring_format_player_seasonlogsbased columns - All
scoring_format_player_careerlogsbased columns - All
league_format_player_seasonlogsbased columns
Root Cause: The scoring_format_player_seasonlogs_join function includes additional_conditions that sets the scoring format hash condition during joins. When these columns are used as the from-table, this join logic isn't applied to the base query setup.
2. Duplicate Rows in Year Offset Queries
Issue: Queries with year_offset ranges and splits are producing duplicate rows. When using year_offset: [1, 3] with splits: ["year"], the generated query returns multiple identical rows for the same player-year combination.
Affected Cases:
- Queries with
year_offsetranges andsplits - Test case:
year-offset-range-with-where-filters.json
Root Cause: The GROUP BY clause or JOIN conditions in year offset queries are not properly eliminating duplicates when aggregating across offset years.
Current State
- From-table optimization whitelist: Only
player_fantasy_points_from_plays✅ - All other columns fall back to default table setup
- Infrastructure exists but needs condition bridging
Technical Requirements
From-Table Optimization Fix
- Apply
additional_conditionslogic when setting up from-table base queries - Ensure
scoring_format_hashparameters flow through to from-table setup - Handle transition from join conditions to from-table base conditions
- Maintain compatibility with existing join scenarios
Duplicate Rows Fix
- Investigate GROUP BY clause in year offset queries
- Review JOIN conditions for offset year aggregations
- Ensure proper deduplication when multiple years are involved
- Fix test case
year-offset-range-with-where-filters.json
Related Files
-
libs-server/get-data-view-results.mjs(from-table optimization logic) -
libs-server/data-views-column-definitions/player-scoring-format-logs-column-definitions.mjs(join functions) -
libs-server/apply-play-by-play-column-params-to-query.mjs(year offset logic) -
test/data-view-queries/year-offset-range-with-where-filters.json(duplicate rows test)
Success Criteria
- [ ] Fantasy points columns work correctly as from-tables with proper scoring format filtering
- [ ] Query performance improves for common fantasy points sorts
- [ ] No duplicate rows in year offset queries with splits
- [ ] All existing functionality remains intact
- [ ] Systematic testing validates each fix
Subtasks
Fix From-Table Scoring Format Hash
- [ ] Implement condition bridging for from-table scenarios
- [ ] Test with scoring format columns
- [ ] Expand whitelist systematically
Fix Duplicate Rows in Year Offset Queries
- [ ] Debug GROUP BY and JOIN logic
- [ ] Test with
year-offset-range-with-where-filters.json - [ ] Ensure correct aggregation across offset years