cube icon indicating copy to clipboard operation
cube copied to clipboard

MIN / MAX Time aggregation function represented as numeric

Open pauldheinrichs opened this issue 1 year ago • 1 comments

Describe the bug I've noticed when attempting to utilize a time aggregation in a few different ways i'm getting odd behavrious in each scenario.

Scenario A)

type: min
sql: time
  • GraphQL: represented as numeric that breaks the query
  • SQL API: Represented as numeric that always results in null
  • Pre aggregations work... but not in any sort of useful functionality as i can't query the measure

Okay - that's fine let's try something else

Scenario B)

  • Let's do a bit of coercion to make things work
type: time
sql: MIN(time)

  • PreAggregations: no longer are applicable when requesting the measure in my query.
  • Graphql: Represented as string that also returns unqueryable data
  • SQL API: represented as string (this actually works)

Tested:

  • ON redshift driver
  • On Druid driver

No difference

Expected behavior

I would anticipate in scenario 1 to be able to query the metric in scenario 2 i would expect the pre-aggregation to apply Screenshots If applicable, add screenshots to help explain your problem. scenario A) image

Scenario B) image image

Minimally reproducible Cube Schema In case your bug report is data modelling related please put your minimally reproducible Cube Schema here. You can use selects without tables in order to achieve that as follows.

cube(`test`, {
  sql: `
    SELECT 1 as userid, '2021-01-01' as signupdate
    UNION ALL
    SELECT 2 as userid, '2021-01-02' as signupdate
    UNION ALL
    SELECT 3 as userid, '2021-01-03' as signupdate
    UNION ALL
    SELECT 4 as userid, '2021-01-04' as signupdate
  `,
  description: `table`,
  // datasource: `default`,

  measures: {
    // does not work with pre-aggregations
    min_time_broken_pre_aggs: {
      sql: `MIN(signupdate)`,
      type: `time`,
    },

    // is a numeric value unqueryable
    min_time_unqueryable: {
      sql: `signupdate`,
      type: `min`,
    },
  },
 
  dimensions: {
    userid: {
      description: `id generated when user is created`,
      sql: `userid`,
      type: `number`,
      primary_key: true,
    },

    signupdate: {
      description: `date when the user is created`,
      sql: `signupdate`,
      type: `time`,
    },
  },

  preAggregations: {
    test_rollup: {
      type: 'rollup',
      measureReferences: [min_time_broken_pre_aggs],
      dimensionReferences: [userid],
      timeDimensionReference: signupdate,
      granularity: 'day',
      partitionGranularity: 'month',
      refreshKey: {
        every: '1 hour',
        incremental: true,
        updateWindow: '1 hour'
      }
    },

    test_rollup_2: {
      type: 'rollup',
      measureReferences: [min_time_unqueryable],
      dimensionReferences: [userid],
      timeDimensionReference: signupdate,
      granularity: 'day',
      partitionGranularity: 'month',
      refreshKey: {
        every: '1 hour',
        incremental: true,
        updateWindow: '1 hour'
      }
    }
  }
});

Version: [0.35.12]

Additional context I've noticed the same behaviour is represented by max as welll

error logs in scenario 1 where min is represented by numeric

│ 2024-04-16 13:19:05,191 WARN  [cubesql::compile::engine::df::scan] Unable to parse value as f64: invalid float literal                                                                                       │
│ 2024-04-16 13:19:05,191 WARN  [cubesql::compile::engine::df::scan] Unable to parse value as f64: invalid float literal

pauldheinrichs avatar Apr 15 '24 14:04 pauldheinrichs

Hi @pauldheinrichs 👋

Thanks for a very elaborate report!

This is actually the intended behavior, meaning this is how it's currently implemented: according to docs, min and similar functions are only intended to work with numeric values.

I see that you started a PR, thanks for that!

igorlukanin avatar Oct 08 '24 22:10 igorlukanin