Automatic Casting to Float64 in Where clause Filters for Numeric Dimensions (Causing Clustering problems)
Problem
Hi, is there any way to make sure that dimensions with type: number get converted to int instead of float or is there any way to disable automatic casting?
Related Cube.js schema
org_id: {
sql: org_id,
type: number,
},
time_period_id: {
sql: time_period_id,
type: number,
},
These dimensions generate SQL like this:
Related Cube.js generated SQL
AND (fact.org_id = CAST(4 AS FLOAT64)) AND (fact.time_period_id = CAST(108 AS FLOAT64))
Since my columns are in integer it breaks my clustering optimization when wrong casting is happened.
Thanks
@HassanAtWecrunch Which data source type do you use?
it breaks my clustering optimization
Could you please elaborate here?
I use Numeric data type for my numerical columns. BigQuery does not allow columns to cluster on if their data type is something other than
BIGNUMERIC, BOOL, DATE, DATETIME, GEOGRAPHY, INT64, NUMERIC, RANGE, STRING, and TIMESTAMP
What does cluster breaking mean:
BigQuery charges by billable bytes processed during query. Clustering helps to reduce that.
When i write
fact.org_id = 4 i can see my billable bytes being reduced.
If i write
fact.org_id = CAST(4 AS FLOAT64) the billable bytes DO NOT reduce.
Thanks for explaining! I see that this behavior comes from this part of the code base but I'm not sure if removing the cast is the right thing to do (it might break something else): https://github.com/cube-js/cube/blob/6abbdb3ff12ca47c5b2f82f8b4e3d2572bd158fd/packages/cubejs-schema-compiler/src/adapter/BigqueryQuery.ts#L24-L32
Understood. Thanks for the help i@igorlukanin
Hi @igorlukanin i tried to override cast parameter function like this, With the help of material available and AI tools But this does not seem to have any affect. Your help is appreciated. Thanks
Code:
drivers/custom-bigquery-driver.js
// 1) Use Cube's existing BigQuery driver as the base
const { BigQueryDriver } = require('@cubejs-backend/bigquery-driver');
const {
BigqueryQuery
} = require('@cubejs-backend/schema-compiler/dist/src/adapter/BigqueryQuery');
const {
BaseFilter
} = require('@cubejs-backend/schema-compiler/dist/src/adapter/BaseFilter');
// 3) Custom Filter: override castParameter() only
class CustomBigqueryFilter extends BaseFilter {
likeIgnoreCase(column, not, param, type) {
const p = (!type || type === 'contains' || type === 'ends') ? '%' : '';
const s = (!type || type === 'contains' || type === 'starts') ? '%' : '';
return `LOWER(${column})${not ? ' NOT' : ''} LIKE CONCAT('${p}', LOWER(${this.allocateParam(param)}), '${s}')`;
}
castParameter() {
if (this.definition().type === 'boolean') {
return 'CAST(? AS BOOL)';
}
else if (this.measure || this.definition().type === 'number') {
// TODO here can be measure type of string actually
return 'CAST(? AS NUMERIC)';
}
return '?';
}
castToString(sql) {
return `CAST(${sql} as STRING)`;
}
}
// 4) Custom Query: use our CustomBigqueryFilter
class CustomBigqueryQuery extends BigqueryQuery {
// Cube calls newFilter(...) to create filter instances
newFilter(filter) {
return new CustomBigqueryFilter(this, filter);
}
}
class CustomBigqueryDriver extends BigQueryDriver {
static dialectClass() {
return CustomBigqueryQuery;
}
}
module.exports = CustomBigqueryDriver;
### cube.js
const CustomBigqueryDriver = require('./drivers/custom-bigquery-driver');
// Cube.js configuration options: https://cube.dev/docs/config
module.exports = {
driverFactory: () => {
return new CustomBigqueryDriver({});
},
// ..