
Calling `rewrite_position_delete_files` rewrites into the same number of files

bk-mz opened this issue 1 year ago • 10 comments

Apache Iceberg version

1.4.3 (latest release)

Query engine

Spark

Please describe the bug 🐞

Hey folks, we're using rewrite_position_delete_files to compact delete files.

It keeps rewriting data but does not actually compact anything; it just rewrites roughly the same amount of data into the same number of files.

CALL glue.system.rewrite_position_delete_files(table => 'table_name', where => 'data_load_ts < current_timestamp() - INTERVAL 1 HOURS', options => map('partial-progress.enabled', 'true', 'rewrite-all', 'true', 'max-concurrent-file-group-rewrites', '50'))
+----------------------------+------------------------+---------------------+-----------------+
|rewritten_delete_files_count|added_delete_files_count|rewritten_bytes_count|added_bytes_count|
+----------------------------+------------------------+---------------------+-----------------+
|5474                        |5232                    |83456097             |82859000         |
+----------------------------+------------------------+---------------------+-----------------+


CALL glue.system.rewrite_position_delete_files(table => 'table_name', where => 'data_load_ts < current_timestamp() - INTERVAL 1 HOURS', options => map('partial-progress.enabled', 'true', 'rewrite-all', 'true', 'max-concurrent-file-group-rewrites', '50'))
+----------------------------+------------------------+---------------------+-----------------+
|rewritten_delete_files_count|added_delete_files_count|rewritten_bytes_count|added_bytes_count|
+----------------------------+------------------------+---------------------+-----------------+
|5431                        |5265                    |83739802             |83200333         |
+----------------------------+------------------------+---------------------+-----------------+

CALL glue.system.rewrite_position_delete_files(table => 'table_name', where => 'data_load_ts < current_timestamp() - INTERVAL 1 HOURS', options => map('partial-progress.enabled', 'true', 'rewrite-all', 'true', 'max-concurrent-file-group-rewrites', '50'))
+----------------------------+------------------------+---------------------+-----------------+
|rewritten_delete_files_count|added_delete_files_count|rewritten_bytes_count|added_bytes_count|
+----------------------------+------------------------+---------------------+-----------------+
|5443                        |5244                    |83643303             |83241939         |
+----------------------------+------------------------+---------------------+-----------------+

In fact, I think it has created odd partitions which contain only small delete files. I suspect the job keeps rewriting those small files over and over, ending up with the same small files.

Normal partition on S3: data_load_ts_hour=2024-02-29-06/
Odd partition: data_load_ts_hour=474425/

There are a lot of these odd partitions. Their names are integers increasing incrementally from 474425 to 474754, and I think each run creates a new one.


The odd partition contains only delete parquet files.

Can you check and confirm whether this is an issue? For now we have disabled rewrite_position_delete_files entirely because the behavior is very odd.

Thanks!

bk-mz avatar Feb 29 '24 07:02 bk-mz

Have you tried the option 'rewrite-all', 'true'?

manuzhang avatar Feb 29 '24 07:02 manuzhang

Yes,

CALL glue.system.rewrite_position_delete_files(table => 'table_name', where => 'data_load_ts < current_timestamp() - INTERVAL 1 HOURS', options => map('partial-progress.enabled', 'true', 'rewrite-all', 'true', 'max-concurrent-file-group-rewrites', '50'))

bk-mz avatar Feb 29 '24 09:02 bk-mz

Hey @bk-mz, if you are trying to compact the files with positional deletes and remove them, you need to run compaction on the data files themselves, like this:

1. First run rewrite_position_delete_files
2. Then run rewrite_data_files with delete-file-threshold set to 1:

CALL catalog_name.system.rewrite_data_files(table => 'db.sample', options => map('delete-file-threshold','1'))
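
Put together, a minimal sketch of the full sequence (the table name and option values are illustrative):

-- step 1: compact the position delete files themselves
CALL catalog_name.system.rewrite_position_delete_files(table => 'db.sample', options => map('rewrite-all', 'true'))
-- step 2: rewrite every data file that has at least one delete file attached,
-- folding the position deletes back into the data files
CALL catalog_name.system.rewrite_data_files(table => 'db.sample', options => map('delete-file-threshold', '1'))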

amitgilad3 avatar Feb 29 '24 18:02 amitgilad3

I investigated a little.

So it seems that Iceberg keeps partitions mapped to some form of id, i.e. the 2024-02-29-06 partition is translated to 474425. Apparently, running both rewrite_data_files and rewrite_position_delete_files has caused Iceberg to leak those internal partition values to the filesystem.

spark-sql ()> SELECT * FROM database.table.partitions;
{"data_load_ts_hour":474111}	0	31581863	67	5518171238	0	0	0	0	2024-03-01 11:39:04.284	2980406515838442447
{"data_load_ts_hour":474110}	0	27528941	59	4744718083	0	0	0	0	2024-03-01 11:39:04.284	2980406515838442447
{"data_load_ts_hour":474113}	0	35247584	75	6106815135	0	0	0	0	2024-03-01 11:39:04.284	2980406515838442447
{"data_load_ts_hour":474112}	0	35767820	76	6203474378	0	0	0	0	2024-03-01 11:39:04.284	2980406515838442447
{"data_load_ts_hour":474115}	0	33848781	73	5714870794	0	0	0	0	2024-03-01 11:39:04.284	2980406515838442447
{"data_load_ts_hour":474114}	0	33251894	72	5706434958	0	0	0	0	2024-03-01 11:39:04.284	2980406515838442447
{"data_load_ts_hour":474117}	0	26825760	56	4575503869	0	0	0	0	2024-03-01 11:39:04.284	2980406515838442447
{"data_load_ts_hour":474116}	0	29780249	64	5100337983	0	0	0	0	2024-03-01 11:39:04.284	2980406515838442447
{"data_load_ts_hour":474109}	0	19755026	43	3250584769	0	0	0	0	2024-03-01 11:39:04.284	2980406515838442447
{"data_load_ts_hour":474108}	0	11820983	24	1801821967	0	0	0	0	2024-03-01 11:39:04.284	2980406515838442447
{"data_load_ts_hour":474127}	0	3751415	8	546119138	0	0	0	0	2024-03-01 11:39:04.284	2980406515838442447
{"data_load_ts_hour":474126}	0	4094247	8	583096432	0	0	0	0	2024-03-01 11:39:04.284	2980406515838442447
{"data_load_ts_hour":474129}	0	4341823	8	647139274	0	0	0	0	2024-03-01 11:39:04.284	2980406515838442447
{"data_load_ts_hour":474128}	0	4645898	8	661700686	0	0	0	0	2024-03-01 11:39:04.284	2980406515838442447
{"data_load_ts_hour":474131}	0	7696352	16	1157927863	0	0	0	0	2024-03-01 11:39:04.284	2980406515838442447
{"data_load_ts_hour":474130}	0	5359994	11	782552958	0	0	0	0	2024-03-01 11:39:04.284	2980406515838442447

I wasn't able to reproduce the issue.

For merging position delete files, I switched to multi-stage rewrite_data_files runs, varying the where clauses and delete-file-threshold.

For fresh partitions, which are more likely to still receive updates, I run rewrite_data_files with delete-file-threshold = 10; for older partitions, rewrite_data_files with delete-file-threshold = 1.

The latter merges all delete files into the base files, while the former only rewrites base files that have at least 10 delete files associated with them.
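
A minimal sketch of what that multi-stage setup might look like (the table name, the 7-day cutoff and the where clauses are illustrative, not my exact jobs):

-- fresh partitions: only rewrite data files that already have >= 10 delete files attached
CALL glue.system.rewrite_data_files(table => 'table_name', where => 'data_load_ts >= current_timestamp() - INTERVAL 7 DAYS', options => map('delete-file-threshold', '10'))
-- older partitions: rewrite any data file with at least one delete file, folding all deletes in
CALL glue.system.rewrite_data_files(table => 'table_name', where => 'data_load_ts < current_timestamp() - INTERVAL 7 DAYS', options => map('delete-file-threshold', '1'))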

Can anybody clarify this weird mapping of Iceberg partitions, e.g. {"data_load_ts_hour":474117}?

bk-mz avatar Mar 01 '24 11:03 bk-mz

To give a bit more context: at some point, Iceberg created a mirror of the existing partitions into which it has put tons of small files.

Example: we have a "phantom" partition 474970 that corresponds to Fri Mar 08 2024 10:00:00 GMT+0000. Both folders are present on S3, but the odd one, 474970, contains 2.5k files totalling 250 MB.

So we see both the partition with the hour as plain text and one with a number like 474970.
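
These integers appear to be the raw values of Iceberg's hours() partition transform, i.e. hours since 1970-01-01 00:00:00 UTC, so the mapping can be checked directly in Spark SQL (a minimal sketch; the constant is the phantom partition number from above):

-- 474970 hours after the Unix epoch; with a UTC session timezone this prints 2024-03-08 10:00:00
SELECT timestamp_seconds(474970 * 3600);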

rewrite_position_delete_files keeps processing those folders and rewriting files onto themselves; with each rewrite, data is appended to the "phantom" odd partitions, leading to table degradation.

bk-mz avatar Mar 11 '24 14:03 bk-mz

So the big question is: what could break in Iceberg such that it creates those "phantom" odd partitions?

bk-mz avatar Mar 11 '24 14:03 bk-mz

The paths aren't as important as the metadata: do the files have incorrect partition tuples (the values actually used for filtering)? The position delete writer here is most likely not converting the tuple into the helper string we usually use with hour transforms. My guess is that this is just a bug in the paths, though, and that the tuples are probably fine.

RussellSpitzer avatar Mar 11 '24 15:03 RussellSpitzer

"do the files have incorrect partition tuples (the values actually used for filtering)?" How do I check this?
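
One way to check, assuming the standard metadata tables: the partition tuple Iceberg actually uses for filtering is exposed as the partition column of the files metadata table, so it can be compared against the directory in the file path, e.g.:

-- content = 1 marks position delete files; compare the partition tuple to the folder in file_path
SELECT partition, file_path FROM db.table.files WHERE content = 1;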

Something is odd, because I keep seeing bloated small-file partitions. For each batch I run 3 consecutive compactions:

CALL glue.system.rewrite_position_delete_files(table => 'table', where => "data_load_ts between TIMESTAMP '2024-02-10 20:39:47.669' and TIMESTAMP '2024-03-11 19:39:47.669'", options => map('partial-progress.enabled', 'true', 'min-file-size-bytes', '26843545', 'max-file-size-bytes', '134217728', 'min-input-files', '10', 'max-concurrent-file-group-rewrites', '500'))
+----------------------------+------------------------+---------------------+-----------------+
|rewritten_delete_files_count|added_delete_files_count|rewritten_bytes_count|added_bytes_count|
+----------------------------+------------------------+---------------------+-----------------+
|693                         |689                     |10353631             |10322951         |
+----------------------------+------------------------+---------------------+-----------------+

CALL glue.system.rewrite_data_files(table => 'table', where => "data_load_ts between TIMESTAMP '2024-03-04 20:41:16.346' and TIMESTAMP '2024-03-11 19:41:16.346'", options => map('partial-progress.enabled', 'true', 'min-file-size-bytes', '53687091', 'max-file-size-bytes', '268435456', 'min-input-files', '20', 'max-concurrent-file-group-rewrites', '500'))
+--------------------------+----------------------+---------------------+-----------------------+
|rewritten_data_files_count|added_data_files_count|rewritten_bytes_count|failed_data_files_count|
+--------------------------+----------------------+---------------------+-----------------------+
|60                        |6                     |131249114            |0                      |
+--------------------------+----------------------+---------------------+-----------------------+

CALL glue.system.rewrite_data_files(table => 'table', where => "data_load_ts <= TIMESTAMP '2024-03-04 20:41:42.204'", options => map('partial-progress.enabled', 'true', 'min-file-size-bytes', '53687091', 'max-file-size-bytes', '268435456', 'min-input-files', '20', 'max-concurrent-file-group-rewrites', '1000'))
+--------------------------+----------------------+---------------------+-----------------------+
|rewritten_data_files_count|added_data_files_count|rewritten_bytes_count|failed_data_files_count|
+--------------------------+----------------------+---------------------+-----------------------+
|20                        |2                     |655906               |0                      |
+--------------------------+----------------------+---------------------+-----------------------+

(the where clause changes for each batch).

Then this is what I see in the compaction logs:

with data as (
select
    committed_at,
    snapshot_id,
    summary.`changed-partition-count` as changed_partition_count,
    if(summary.`added-data-files` is null, "compact_delete_files", "compact_base_files") as op,
    if(summary.`added-data-files` is null, summary.`added-position-delete-files`, summary.`added-data-files`) as added_files,
    if(summary.`added-data-files` is null, summary.`removed-position-delete-files`, summary.`deleted-data-files`) as removed_files
from table.snapshots
where operation = "replace")
select
    committed_at,
    snapshot_id,
    changed_partition_count,
    op,
    concat(removed_files, "->", added_files) as change,
    removed_files / added_files as compact_ratio
from data
limit 100;

Results:

2024-03-10 20:36:07.15	8797169961871736418	23	compact_delete_files	165->157	1.0509554140127388
2024-03-10 20:36:08.41	7845087824966754379	23	compact_delete_files	263->249	1.0562248995983936
2024-03-10 20:36:09.532	7903647840988554268	23	compact_delete_files	634->616	1.0292207792207793
2024-03-10 20:36:10.764	483626480370270807	23	compact_delete_files	835->814	1.0257985257985258
2024-03-10 20:36:12.728	5477859694032525431	23	compact_delete_files	441->426	1.0352112676056338
2024-03-10 20:36:16.581	1255334267473732600	23	compact_delete_files	803->784	1.024234693877551
2024-03-10 20:36:18.947	7227297553321373728	23	compact_delete_files	505->478	1.0564853556485356
2024-03-10 20:36:21.98	1281338329182940375	23	compact_delete_files	582->566	1.028268551236749
2024-03-10 20:36:23.08	6410347697455606449	23	compact_delete_files	353->317	1.113564668769716
2024-03-10 20:36:24.808	7588875997900067709	20	compact_delete_files	599->588	1.0187074829931972
2024-03-10 20:40:05.974	198476720206951232	2	compact_base_files	40->3	13.333333333333334
2024-03-10 20:40:08.515	7210798375839628837	2	compact_base_files	42->4	10.5
2024-03-10 20:40:10.876	5692616137794405663	2	compact_base_files	41->4	10.25
2024-03-10 20:40:13.437	6290596725370482099	2	compact_base_files	40->4	10.0
2024-03-10 20:40:16.507	217541509135133596	2	compact_base_files	42->4	10.5
2024-03-10 20:40:20.011	4802230468835188293	1	compact_base_files	20->2	10.0
2024-03-10 20:41:04.273	6246580026217410227	23	compact_delete_files	166->155	1.070967741935484
2024-03-10 20:41:05.719	3297695269624318303	23	compact_delete_files	264->239	1.104602510460251
2024-03-10 20:41:06.785	5847626578154331062	23	compact_delete_files	666->654	1.018348623853211
2024-03-10 20:41:08.07	4267888321478088912	23	compact_delete_files	804->778	1.0334190231362468
2024-03-10 20:41:10.013	8135762954708433780	23	compact_delete_files	437->420	1.0404761904761906
2024-03-10 20:41:14.315	2664190367967793873	23	compact_delete_files	845->828	1.0205314009661837
2024-03-10 20:41:16.688	3946211637382110477	23	compact_delete_files	365->343	1.064139941690962
2024-03-10 20:41:18.796	4932807097664769635	23	compact_delete_files	546->511	1.0684931506849316
2024-03-10 20:41:19.909	6618458420791632675	23	compact_delete_files	448->439	1.020501138952164
2024-03-10 20:41:21.741	6703590499156726364	19	compact_delete_files	609->585	1.041025641025641
2024-03-10 20:45:05.725	9121859527157436009	2	compact_base_files	40->3	13.333333333333334
2024-03-10 20:45:08.159	4157493376091722749	2	compact_base_files	40->4	10.0
2024-03-10 20:45:10.364	7307518553901315194	2	compact_base_files	40->4	10.0
2024-03-10 20:45:12.53	7391382820317167039	2	compact_base_files	40->4	10.0
2024-03-10 20:45:15.294	1634346428437060541	2	compact_base_files	40->4	10.0
2024-03-10 20:45:18.664	6471891107683048438	1	compact_base_files	20->2	10.0
2024-03-10 20:45:23.872	2705895529971003001	1	compact_base_files	20->2	10.0
2024-03-10 20:46:09.038	3459810320855697041	23	compact_delete_files	182->170	1.0705882352941176
2024-03-10 20:46:10.458	2661568600570547866	23	compact_delete_files	454->430	1.0558139534883721
2024-03-10 20:46:11.621	1341779779088511826	23	compact_delete_files	577->564	1.0230496453900708
2024-03-10 20:46:12.949	1453266135319995705	23	compact_delete_files	859->835	1.02874251497006
2024-03-10 20:46:14.829	4995825204442784024	23	compact_delete_files	277->259	1.0694980694980696
2024-03-10 20:46:19.456	6932435756418096621	23	compact_delete_files	815->786	1.0368956743002544
2024-03-10 20:46:21.472	2430383133564740551	23	compact_delete_files	337->325	1.0369230769230768
2024-03-10 20:46:23.548	3455842334691225549	23	compact_delete_files	545->528	1.0321969696969697
2024-03-10 20:46:24.884	6197136702470865431	23	compact_delete_files	417->405	1.0296296296296297
2024-03-10 20:46:26.57	3018837951760552054	19	compact_delete_files	658->639	1.029733959311424
2024-03-10 20:50:05.474	5737241757182222621	2	compact_base_files	40->4	10.0

The compact_delete_files runs effectively do nothing, i.e. they rewrite the data onto itself, or even produce more files than before.

Are there any obvious places to check for inconsistencies in the setup or settings?

bk-mz avatar Mar 11 '24 20:03 bk-mz


Odd folder 475031 -> 1710111600 -> Sun Mar 10 2024 23:00:00 GMT+0000


select file_path, file_size_in_bytes from table.files where partition.data_load_ts_hour = 475031;
s3://table/data/data_load_ts_hour=2024-03-10-23/00013-3509493-318c11b3-8524-44a5-9666-d83eb56eb629-00002.parquet	45733
s3://table/data/data_load_ts_hour=2024-03-10-23/00015-3445167-69efcc48-af68-457c-b04c-8098bf1d5075-00001.parquet	49560
s3://table/data/data_load_ts_hour=2024-03-10-23/00016-3380992-f13b8421-04b8-4066-808d-65eaeb3f43a0-00011.parquet	43586
s3://table/data/data_load_ts_hour=2024-03-10-23/00016-3319607-da37d93f-19e2-4299-bfe8-c39379609412-00002.parquet	48747
s3://table/data/data_load_ts_hour=2024-03-10-23/00015-3260964-cdef4129-3da2-478e-a909-a12324fb05ff-00006.parquet	46419
s3://table/data/data_load_ts_hour=2024-03-10-23/00016-3197026-1ff4d8e1-cc34-4d90-bc49-9a9c2c3a2006-00003.parquet	48212
s3://table/data/data_load_ts_hour=2024-03-10-23/00016-3141354-5a0785fa-1242-4417-bd2d-52783f0a4962-00001.parquet	51474
s3://table/data/data_load_ts_hour=2024-03-10-23/00015-3083005-3442d952-b4c5-48fe-954e-e06b11f2366f-00002.parquet	46876
s3://table/data/data_load_ts_hour=2024-03-10-23/00000-615765-fc9949b0-e3e0-4fca-82b6-37584104f369-00001.parquet	72581629
s3://table/data/data_load_ts_hour=2024-03-10-23/00071-52955-2950793c-8099-475f-8d39-f12e51fbe384-00001.parquet	102451428
s3://table/data/data_load_ts_hour=2024-03-10-23/00072-52985-2950793c-8099-475f-8d39-f12e51fbe384-00001.parquet	102579065
s3://table/data/data_load_ts_hour=2024-03-10-23/00073-52976-2950793c-8099-475f-8d39-f12e51fbe384-00001.parquet	102562792
s3://table/data/data_load_ts_hour=2024-03-10-23/00074-52983-2950793c-8099-475f-8d39-f12e51fbe384-00001.parquet	102523561
s3://table/data/data_load_ts_hour=2024-03-10-23/00075-52990-2950793c-8099-475f-8d39-f12e51fbe384-00001.parquet	102416806
s3://table/data/data_load_ts_hour=2024-03-10-23/00076-52978-2950793c-8099-475f-8d39-f12e51fbe384-00001.parquet	102503807
s3://table/data/data_load_ts_hour=2024-03-10-23/00077-52980-2950793c-8099-475f-8d39-f12e51fbe384-00001.parquet	102752978
s3://table/data/data_load_ts_hour=2024-03-10-23/00078-52993-2950793c-8099-475f-8d39-f12e51fbe384-00001.parquet	102571840
s3://table/data/data_load_ts_hour=2024-03-10-23/00079-52987-2950793c-8099-475f-8d39-f12e51fbe384-00001.parquet	102606616
s3://table/data/data_load_ts_hour=2024-03-10-23/00080-52943-2950793c-8099-475f-8d39-f12e51fbe384-00001.parquet	102949360
s3://table/data/data_load_ts_hour=2024-03-10-23/00081-52982-2950793c-8099-475f-8d39-f12e51fbe384-00001.parquet	102413581
s3://table/data/data_load_ts_hour=2024-03-10-23/00082-52981-2950793c-8099-475f-8d39-f12e51fbe384-00001.parquet	102433660
s3://table/data/data_load_ts_hour=2024-03-10-23/00083-52964-2950793c-8099-475f-8d39-f12e51fbe384-00001.parquet	102382969
s3://table/data/data_load_ts_hour=2024-03-10-23/00084-52967-2950793c-8099-475f-8d39-f12e51fbe384-00001.parquet	102499015
s3://table/data/data_load_ts_hour=2024-03-10-23/00085-52986-2950793c-8099-475f-8d39-f12e51fbe384-00001.parquet	102645047
s3://table/data/data_load_ts_hour=2024-03-10-23/00086-52992-2950793c-8099-475f-8d39-f12e51fbe384-00001.parquet	102409434
s3://table/data/data_load_ts_hour=2024-03-10-23/00087-52975-2950793c-8099-475f-8d39-f12e51fbe384-00001.parquet	102292344
s3://table/data/data_load_ts_hour=2024-03-10-23/00088-52988-2950793c-8099-475f-8d39-f12e51fbe384-00001.parquet	102571193
s3://table/data/data_load_ts_hour=2024-03-10-23/00089-52984-2950793c-8099-475f-8d39-f12e51fbe384-00002.parquet	102692273
s3://table/data/data_load_ts_hour=2024-03-10-23/00090-52989-2950793c-8099-475f-8d39-f12e51fbe384-00001.parquet	102362866
s3://table/data/data_load_ts_hour=2024-03-10-23/00091-52991-2950793c-8099-475f-8d39-f12e51fbe384-00001.parquet	102582641
s3://table/data/data_load_ts_hour=2024-03-10-23/00092-52994-2950793c-8099-475f-8d39-f12e51fbe384-00001.parquet	102684825
s3://table/data/data_load_ts_hour=2024-03-10-23/00014-2780212-cbae87b5-972a-4502-a503-7af1aa2fa345-00001.parquet	52428
s3://table/data/data_load_ts_hour=2024-03-10-23/00000-2713285-5621343d-59e6-4fb2-94eb-3abb650807e8-00001.parquet	558576
s3://table/data/data_load_ts_hour=2024-03-10-23/00001-2713286-5621343d-59e6-4fb2-94eb-3abb650807e8-00001.parquet	736579
s3://table/data/data_load_ts_hour=2024-03-10-23/00015-2895320-64cf8ddc-50c9-4751-a682-42bfc866da26-00002.parquet	48850
s3://table/data/data_load_ts_hour=2024-03-10-23/00014-2718792-ef600d59-b559-41dd-a718-1cdccad2d2af-00003.parquet	45186
s3://table/data/data_load_ts_hour=2024-03-10-23/00015-2954371-35a30ce1-f52b-463a-a987-d0076a407294-00002.parquet	43511
s3://table/data/data_load_ts_hour=2024-03-10-23/00016-2836402-586be539-99b1-4471-a54b-456da1c4c757-00002.parquet	47107
s3://table/data/data_load_ts_hour=2024-03-10-23/00015-3013381-b6a11670-a567-4cf7-87bf-434d95fd50cc-00001.parquet	51385
s3://table/data/data_load_ts_hour=475031/00089-3512827-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	6843
s3://table/data/data_load_ts_hour=475031/00107-3513237-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	18330
s3://table/data/data_load_ts_hour=475031/00264-3516565-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	1813
s3://table/data/data_load_ts_hour=475031/00366-3518557-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	1771
s3://table/data/data_load_ts_hour=475031/00434-3519930-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	18304
s3://table/data/data_load_ts_hour=475031/00484-3520949-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	18567
s3://table/data/data_load_ts_hour=475031/00605-3523428-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	19055
s3://table/data/data_load_ts_hour=475031/00609-3523522-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	18045
s3://table/data/data_load_ts_hour=475031/00840-3528509-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	18473
s3://table/data/data_load_ts_hour=475031/00936-3530655-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	18516
s3://table/data/data_load_ts_hour=475031/00947-3530931-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	18314
s3://table/data/data_load_ts_hour=475031/00974-3531559-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	18203
s3://table/data/data_load_ts_hour=475031/01155-3535492-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	17941
s3://table/data/data_load_ts_hour=475031/01175-3535917-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	1771
s3://table/data/data_load_ts_hour=475031/01275-3538024-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	18253
s3://table/data/data_load_ts_hour=475031/01341-3539357-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	17981
s3://table/data/data_load_ts_hour=475031/01434-3541342-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	1813
s3://table/data/data_load_ts_hour=475031/01535-3543442-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	18564
s3://table/data/data_load_ts_hour=475031/01610-3544978-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	18476
s3://table/data/data_load_ts_hour=475031/01753-3547842-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	1770
s3://table/data/data_load_ts_hour=475031/01777-3548348-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	18273
s3://table/data/data_load_ts_hour=475031/01931-3551544-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	18869
s3://table/data/data_load_ts_hour=475031/02062-3554342-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	1771
s3://table/data/data_load_ts_hour=475031/02132-3555802-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	18528
s3://table/data/data_load_ts_hour=475031/02143-3556044-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	17974
s3://table/data/data_load_ts_hour=475031/02206-3557310-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	18243
s3://table/data/data_load_ts_hour=475031/02214-3557488-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	18202
s3://table/data/data_load_ts_hour=475031/02245-3558183-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	18995
s3://table/data/data_load_ts_hour=475031/02420-3561823-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	1771
s3://table/data/data_load_ts_hour=475031/02618-3565614-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	18271
s3://table/data/data_load_ts_hour=475031/02689-3566578-3d8f11a9-2801-434d-b7cb-e1c51d2be2e1-00001-deletes.parquet	1771
Time taken: 7.393 seconds, Fetched 70 row(s)

So it looks like those two partitions are connected somehow, though those 2k files are not participating.

bk-mz avatar Mar 11 '24 21:03 bk-mz

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

github-actions[bot] avatar Oct 21 '24 00:10 github-actions[bot]

This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

github-actions[bot] avatar Nov 05 '24 00:11 github-actions[bot]