[cubestore] corrupt data error: File **.parquet doesn't exist in remote file system
Problem
We keep seeing this error message on a daily basis in our Cube Store logs:
...
2025-01-30 15:17:46.715 | 2025-01-30T04:17:46.715Z ERROR [cubestore::queryplanner::query_executor] <pid:1> Error Query (126.880301ms):
2025-01-30 15:17:46.715 | 2025-01-30T04:17:46.715Z INFO [cubestore::metastore] <pid:1> Deactivating table prod_pre_aggregations.gl_income_rollup20240101_kla32oge_di0ibpro_1jkd7n6 (#2427) due to corrupt data error: File 9165-ncfolk1k.parquet doesn't exist in remote file system
...
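To see which partitions are affected and how often, the deactivation lines can be grouped by table. Below is a minimal sketch, assuming the log format shown above stays stable; `parseDeactivation` and `summarize` are hypothetical helpers written for this report, not part of Cube or Cube Store.

```javascript
// Hypothetical triage helpers (not a Cube API): extract the table name, table id,
// and missing file from a Cube Store "Deactivating table" log line.
const DEACTIVATION_RE =
  /Deactivating table (\S+) \(#(\d+)\) due to corrupt data error: File (\S+\.parquet) doesn't exist in remote file system/;

function parseDeactivation(line) {
  const match = DEACTIVATION_RE.exec(line);
  if (!match) return null; // not a deactivation line
  const [, table, tableId, file] = match;
  return { table, tableId: Number(tableId), file };
}

// Group the missing files by table across many log lines.
function summarize(lines) {
  const byTable = new Map();
  for (const line of lines) {
    const hit = parseDeactivation(line);
    if (!hit) continue;
    if (!byTable.has(hit.table)) byTable.set(hit.table, []);
    byTable.get(hit.table).push(hit.file);
  }
  return byTable;
}
```

Feeding a day of logs through `summarize` should show whether the same partitions keep losing files or whether the missing files are spread across tables.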
Related Cube.js schema
cube(`GeneralLedger_Income`, {
  sql: `${incomeSQL}
    where ${incomeSQLFilter}
  `,
  extends: GeneralLedger_BaseIncome,
  sqlAlias: 'glIncome',
  preAggregations: {
    rollup: {
      measures: [CUBE.incomeAmount, CUBE.count, CUBE.managementFeeIncomeAmount, CUBE.minDate],
      dimensions: [CUBE.transactionTaxCategoryId, CUBE.generalLedgerManagement, CUBE.accountOwner, CUBE.accountBookType],
      timeDimension: CUBE.createdAt,
      granularity: `day`,
      indexes: {
        idx: {
          columns: [CUBE.accountOwner, CUBE.generalLedgerManagement],
        },
      },
      refresh_key: {
        every: `30 6 * * *`,
        timezone: `Australia/Sydney`,
        incremental: true,
        update_window: `60 days`,
      },
      partition_granularity: `month`,
      build_range_start: {
        sql: `SELECT '2020-08-11'::timestamp AT TIME ZONE 'utc'`,
      },
      build_range_end: {
        sql: `SELECT NOW()`,
      },
    },
  },
  measures: {},
  dimensions: {},
  segments: {},
  dataSource: `generalLedger`,
});
Related Query
2025-01-30 15:17:46.716 | 2025-01-30T04:17:46.715Z ERROR [cubestore::queryplanner::query_executor] <pid:1> Error Query Physical Plan (126.880301ms): GlobalLimit, n: 3000
2025-01-30 15:17:46.716 | Scan p_m__fee_tax_categories__rollup, source: CubeTable(index: p_m__fee_tax_categories_rollup_idx_h4fwcpsc_4l4cgk5j_1jjaiga:1346:[2648, 2684]:sort_on[p_m__fee_tax_categories__tax_category_id]), fields: *
2025-01-30 15:17:46.716 | Scan groups_by_ma__g_b_m_rollup, source: CubeTable(index: groups_by_ma_g_b_m_rollup_mg_idx_xde2o1oi_sdfcdeov_1jpls5j:16773:[32798]:sort_on[groups_by_ma__management_ailorn]), fields: [groups_by_ma__management_ailorn]
2025-01-30 15:17:46.716 | Scan prod_pre_aggregations.gl_income_rollup20240601_y1vcbtr3_tvbyjwvf_1jplegd, source: CubeTable(index: gl_income_rollup_idx_y1vcbtr3_tvbyjwvf_1jplegd:16738:[32734, 32749]:sort_on[gl_income__account_owner, gl_income__general_ledger_management]), fields: [gl_income__account_book_type, gl_income__account_owner, gl_income__general_ledger_management, gl_income__transaction_tax_category_id, gl_income__created_at_day, gl_income__count, gl_income__income_amount, gl_income__management_fee_income_amount]
2025-01-30 15:17:46.716 | Scan prod_pre_aggregations.gl_income_rollup20240501_agnnisxe_is1zsk0r_1jplegd, source: CubeTable(index: gl_income_rollup_idx_agnnisxe_is1zsk0r_1jplegd:16736:[32732, 32748]:sort_on[gl_income__account_owner, gl_income__general_ledger_management]), fields: [gl_income__account_book_type, gl_income__account_owner, gl_income__general_ledger_management, gl_income__transaction_tax_category_id, gl_income__created_at_day, gl_income__count, gl_income__income_amount, gl_income__management_fee_income_amount]
2025-01-30 15:17:46.716 | Scan prod_pre_aggregations.gl_income_rollup20240401_4rr0etsz_50d11yiv_1jplesf, source: CubeTable(index: gl_income_rollup_idx_4rr0etsz_50d11yiv_1jplesf:16740:[32740, 32751]:sort_on[gl_income__account_owner, gl_income__general_ledger_management]), fields: [gl_income__account_book_type, gl_income__account_owner, gl_income__general_ledger_management, gl_income__transaction_tax_category_id, gl_income__created_at_day, gl_income__count, gl_income__income_amount, gl_income__management_fee_income_amount]
2025-01-30 15:17:46.715 | Scan prod_pre_aggregations.gl_income_rollup20240301_msq2aw0n_ykwidr0u_1jkd5ic, source: CubeTable(index: gl_income_rollup_idx_msq2aw0n_ykwidr0u_1jkd5ic:4619:[9073, 9096]:sort_on[gl_income__account_owner, gl_income__general_ledger_management]), fields: [gl_income__account_book_type, gl_income__account_owner, gl_income__general_ledger_management, gl_income__transaction_tax_category_id, gl_income__created_at_day, gl_income__count, gl_income__income_amount, gl_income__management_fee_income_amount]
2025-01-30 15:17:46.715 | Scan prod_pre_aggregations.gl_income_rollup20240201_tjjj3bbi_aujhlb1e_1jkd63j, source: CubeTable(index: gl_income_rollup_idx_tjjj3bbi_aujhlb1e_1jkd63j:4629:[9091, 9112]:sort_on[gl_income__account_owner, gl_income__general_ledger_management]), fields: [gl_income__account_book_type, gl_income__account_owner, gl_income__general_ledger_management, gl_income__transaction_tax_category_id, gl_income__created_at_day, gl_income__count, gl_income__income_amount, gl_income__management_fee_income_amount]
2025-01-30 15:17:46.715 | Scan prod_pre_aggregations.gl_income_rollup20240101_kla32oge_di0ibpro_1jkd7n6, source: CubeTable(index: gl_income_rollup_idx_kla32oge_di0ibpro_1jkd7n6:4657:[9143, 9165]:sort_on[gl_income__account_owner, gl_income__general_ledger_management]), fields: [gl_income__account_book_type, gl_income__account_owner, gl_income__general_ledger_management, gl_income__transaction_tax_category_id, gl_income__created_at_day, gl_income__count, gl_income__income_amount, gl_income__management_fee_income_amount]
2025-01-30 15:17:46.715 | Scan prod_pre_aggregations.gl_income_rollup20231201_eevk4axq_angupusd_1jpls6k, source: CubeTable(index: gl_income_rollup_idx_eevk4axq_angupusd_1jpls6k:16781:[32810, 32822]:sort_on[gl_income__account_owner, gl_income__general_ledger_management]), fields: [gl_income__account_book_type, gl_income__account_owner, gl_income__general_ledger_management, gl_income__transaction_tax_category_id, gl_income__created_at_day, gl_income__count, gl_income__income_amount, gl_income__management_fee_income_amount]
2025-01-30 15:17:46.715 | Scan prod_pre_aggregations.gl_income_rollup20231101_m0i0v5sg_na1ecvvu_1jpls6k, source: CubeTable(index: gl_income_rollup_idx_m0i0v5sg_na1ecvvu_1jpls6k:16779:[32808, 32823]:sort_on[gl_income__account_owner, gl_income__general_ledger_management]), fields: [gl_income__account_book_type, gl_income__account_owner, gl_income__general_ledger_management, gl_income__transaction_tax_category_id, gl_income__created_at_day, gl_income__count, gl_income__income_amount, gl_income__management_fee_income_amount]
2025-01-30 15:17:46.715 | Scan prod_pre_aggregations.gl_income_rollup20231001_y4xyjrqr_jg1k5lim_1jplslq, source: CubeTable(index: gl_income_rollup_idx_y4xyjrqr_jg1k5lim_1jplslq:16783:[32814, 32828]:sort_on[gl_income__account_owner, gl_income__general_ledger_management]), fields: [gl_income__account_book_type, gl_income__account_owner, gl_income__general_ledger_management, gl_income__transaction_tax_category_id, gl_income__created_at_day, gl_income__count, gl_income__income_amount, gl_income__management_fee_income_amount]
2025-01-30 15:17:46.715 | Scan prod_pre_aggregations.gl_income_rollup20230901_ans5yvsu_05ydwpmt_1jpls7t, source: CubeTable(index: gl_income_rollup_idx_ans5yvsu_05ydwpmt_1jpls7t:16777:[32806, 32818]:sort_on[gl_income__account_owner, gl_income__general_ledger_management]), fields: [gl_income__account_book_type, gl_income__account_owner, gl_income__general_ledger_management, gl_income__transaction_tax_category_id, gl_income__created_at_day, gl_income__count, gl_income__income_amount, gl_income__management_fee_income_amount]
2025-01-30 15:17:46.715 | Scan prod_pre_aggregations.gl_income_rollup20230801_30cmavvy_zpgdclsh_1jplt4k, source: CubeTable(index: gl_income_rollup_idx_30cmavvy_zpgdclsh_1jplt4k:16787:[32826, 32833]:sort_on[gl_income__account_owner, gl_income__general_ledger_management]), fields: [gl_income__account_book_type, gl_income__account_owner, gl_income__general_ledger_management, gl_income__transaction_tax_category_id, gl_income__created_at_day, gl_income__count, gl_income__income_amount, gl_income__management_fee_income_amount]
2025-01-30 15:17:46.715 | Scan prod_pre_aggregations.gl_income_rollup20230701_zgkopi5w_wflmgv1y_1jpls7t, source: CubeTable(index: gl_income_rollup_idx_zgkopi5w_wflmgv1y_1jpls7t:16775:[32804, 32816]:sort_on[gl_income__account_owner, gl_income__general_ledger_management]), fields: [gl_income__account_book_type, gl_income__account_owner, gl_income__general_ledger_management, gl_income__transaction_tax_category_id, gl_income__created_at_day, gl_income__count, gl_income__income_amount, gl_income__management_fee_income_amount]
2025-01-30 15:17:46.715 | Union
2025-01-30 15:17:46.715 | Scan p_m__teams__p_m_teams_rollup, source: CubeTable(index: p_m__teams_p_m_teams_rollup_mg_idx_lpogm2te_gablnqqs_1jpls5i:16771:[32796, 32813]), fields: [p_m__teams__legal_entity_ailorn, p_m__teams__management_ailorn, p_m__teams__organisation_id, p_m__teams__property_type]
2025-01-30 15:17:46.715 | Filter
2025-01-30 15:17:46.715 | Join on: [#p_m__teams__p_m_teams_rollup.p_m__teams__legal_entity_ailorn = #gl_income__rollup.gl_income__account_owner, #p_m__teams__p_m_teams_rollup.p_m__teams__management_ailorn = #gl_income__rollup.gl_income__general_ledger_management]
2025-01-30 15:17:46.715 | Filter
2025-01-30 15:17:46.715 | Join on: [#p_m__teams__p_m_teams_rollup.p_m__teams__management_ailorn = #groups_by_ma__g_b_m_rollup.groups_by_ma__management_ailorn]
2025-01-30 15:17:46.715 | Join on: [#gl_income__rollup.gl_income__transaction_tax_category_id = #p_m__fee_tax_categories__rollup.p_m__fee_tax_categories__tax_category_id]
2025-01-30 15:17:46.715 | ClusterSend, indices: [[16771], [16775, 16787, 16777, 16783, 16779, 16781, 4657, 4629, 4619, 16740, 16736, 16738], [16773], [1346]]
2025-01-30 15:17:46.715 | Aggregate
2025-01-30 15:17:46.715 | Projection, [gl_income__transaction_tax_category_id, p_m__fee_tax_categories__fee_tax_category_name, gl_income__income_amount, gl_income__management_fee_income_amount, gl_income__count]
2025-01-30 15:17:46.715 | Sort
2025-01-30 15:17:46.715 | Limit
Another cube also hits this issue constantly, and the client query ends up timing out.
error: Error while querying queueId="398" queueSize="0" duration="97" queryKey="[\"SELECT `chat__m_r_t_by_agency__median_response_time` `chat__m_r_t_by_agency__median_response_time` FROM prod_pre_aggregations.chat__m_r_t_by_agency_rollup_bgmxzqjp_hvtni333_1kjespr AS `chat__m_r_t_by_agency__rollup` GROUP BY 1 ORDER BY 1 ASC LIMIT 100\",[]]" queuePrefix="SQL_QUERY_EXT_STANDALONE" requestId="3878f8db-9a0e-4af7-a8e2-482ab8fb0bd0-span-1" timeInQueue="0" error="Error: Internal: Execution error: CorruptData: File 356210-6fxdlqjv.parquet doesn't exist in remote file system at WebSocket.<anonymous> (/cube/node_modules/@cubejs-backend/cubestore-driver/src/WebSocketConnection.ts:132:32) at WebSocket.emit (node:events:518:28) at WebSocket.emit (node:domain:489:12) at Receiver.receiverOnMessage (/cube/node_modules/ws/lib/websocket.js:1070:20) at Receiver.emit (node:events:518:28) at Receiver.emit (node:domain:489:12) at Receiver.dataMessage (/cube/node_modules/ws/lib/receiver.js:502:14) at Receiver.getData (/cube/node_modules/ws/lib/receiver.js:435:17) at Receiver.startLoop (/cube/node_modules/ws/lib/receiver.js:143:22) at Receiver._write (/cube/node_modules/ws/lib/receiver.js:78:10)" level="error"
Cube model:
cube('Chat_MRTByAgency', {
  sql: `
    with response_time_by_organisation as (
      select
        split_part( organisation_ailorn, ':', '4') as organisation_id,
        median_response_time_last_thirty_days
      from data_export.organisation_median_response_time_report
    )
    select
      organisation_id,
      median_response_time_last_thirty_days
    from response_time_by_organisation
    where ${SECURITY_CONTEXT.organisationId.filter('organisation_id')}`,
  preAggregations: {
    rollup: {
      dimensions: [
        CUBE.medianResponseTime,
        CUBE.organisationId
      ],
      indexes: {
        idx: {
          columns: [CUBE.organisationId],
        }
      },
      refresh_key: {
        every: "1 hour",
      }
    },
  },
  dimensions: {
    organisationId: {
      sql: `organisation_id`,
      type: `string`
    },
    medianResponseTime: {
      sql: `median_response_time_last_thirty_days`,
      type: `number`
    }
  },
  dataSource: `chat`
});
I feel like this kind of out-of-date cache issue will always introduce small windows during which our client queries break.
@igorlukanin this could be a bug in how pre-aggregations are updated; ideally, a refresh should cause zero downtime. Could you please provide some insight?
Cube version: 1.3.26