Load empty table failed.
Is there an existing issue for this?
- [X] I have searched the existing issues
Description of the bug
Trying to load an empty Hudi table panics with the following error:
Failed to resolve the latest schema: no file path found
Steps To Reproduce
Loading any empty table reproduces the issue.
Expected behavior
Just return empty records.
Screenshots / Logs
No response
Software information
not related
Additional context
No response
I think this is mainly because of the get_latest_schema() function, which tries to load the schema from the latest parquet file, but there isn't any base file yet.
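To make the failure mode concrete, here is a minimal sketch of the situation, not the actual hudi-rs code. `Schema`, `SchemaError`, `latest_schema` and `read_records` are hypothetical names invented for illustration. The schema lookup fails when there is no base file, and the read path maps that specific case to an empty result, as the expected behavior above asks for.

```rust
// Hypothetical stand-ins for illustration; these are not hudi-rs types.
#[derive(Debug, Clone, PartialEq)]
pub struct Schema {
    pub fields: Vec<String>,
}

#[derive(Debug, PartialEq)]
pub enum SchemaError {
    /// The table has no base file to read a schema from.
    NoFilePath,
}

/// Hypothetical: resolve the schema from the latest base file, if there is one.
pub fn latest_schema(base_files: &[&str]) -> Result<Schema, SchemaError> {
    // An empty table has no base files, so there is nothing to inspect.
    let _latest = base_files.last().ok_or(SchemaError::NoFilePath)?;
    // A real implementation would read the parquet footer here.
    Ok(Schema { fields: vec!["id".to_string()] })
}

/// Hypothetical read path: treat "no base file" as an empty result, not a failure.
pub fn read_records(base_files: &[&str]) -> Result<Vec<String>, SchemaError> {
    match latest_schema(base_files) {
        // Placeholder for real record loading: just echo the field names.
        Ok(schema) => Ok(schema.fields),
        Err(SchemaError::NoFilePath) => Ok(Vec::new()),
    }
}
```

With this shape, `read_records(&[])` yields an empty vector, while `latest_schema(&[])` still reports the missing file path to callers that ask for the schema directly.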
@gohalo do you want to take this up? i had some similar fix in https://github.com/Eventual-Inc/Daft/pull/2268/files . see if you can follow similar logic and apply it here.
@gohalo we actually have a test case timeline_read_latest_schema_from_empty_table. can you look into this and see what is not covered, and fix accordingly?
@xushiyan i will try to fix that later 😀
hi @gohalo any update on this? trying to get this included soon in the next release
> @gohalo we actually have a test case timeline_read_latest_schema_from_empty_table. can you look into this and see what is not covered, and fix accordingly?
@xushiyan
Actually, the result is the same as in the test case timeline_read_latest_schema_from_empty_table: it just returns the error described before.
https://github.com/apache/hudi-rs/blob/5e1981f0380ef43f7fab4eb2229820c82c717e29/crates/core/src/table/timeline.rs#L243-L258
Changing the following https://github.com/apache/hudi-rs/blob/5e1981f0380ef43f7fab4eb2229820c82c717e29/crates/core/src/table/mod.rs#L144-L145 to
.await?;
gives the detailed error message, which is the same:
Failed to resolve the latest schema: no file path found
I'm trying to load the schema from the hoodie.table.create.schema field of hoodie.properties, but I found that we would need to support parsing Java properties files, and I haven't found a simple crate for that yet.
And if we do support loading properties files, it may be a little different from the global config file. Or we could parse both in the properties file format.
I still wonder whether this is the right solution.
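For reference, a rough sketch of what minimal properties parsing could look like using only the standard library. `parse_properties` is a hypothetical helper, not an existing hudi-rs function, and it covers only a small subset of the Java properties format: `key=value` lines plus `#` and `!` comments. It does not handle `:` or whitespace separators, escape sequences, or line continuations.

```rust
use std::collections::HashMap;

/// Hypothetical helper: parse `key=value` lines, skipping blanks and comments.
/// This is a deliberately small subset of the Java properties format.
pub fn parse_properties(text: &str) -> HashMap<String, String> {
    let mut props = HashMap::new();
    for line in text.lines() {
        let line = line.trim();
        // Skip blank lines and the two comment styles used by properties files.
        if line.is_empty() || line.starts_with('#') || line.starts_with('!') {
            continue;
        }
        // Split on the first '=' only, so values may themselves contain '='.
        if let Some((key, value)) = line.split_once('=') {
            props.insert(key.trim().to_string(), value.trim().to_string());
        }
    }
    props
}
```

Lines without an `=` are silently ignored here; a stricter version could return an error for them instead.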
@gohalo We can't load hoodie.table.create.schema for this api as it's not always available and it could get obsolete when table evolves.
We currently expect the api to return an Error when the user tries to get the schema from an empty table. I was curious which code path is giving you the panic, because by right we should always get a Result, which can be an Error. Can you clarify how you got the panic!()? And fix accordingly?
The panic comes from the unhandled Result, with .await? or unwrap().
I just think an empty result would be fine instead of an error, since the error behaves differently from Spark or Flink. As you said, maybe we could solve this with the schema evolution feature.
I was asking which code path caused the panic. I spotted that the datafusion api wasn't handling it; it's fixed now.
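As a small illustration of the difference discussed here (a sketch with hypothetical names, not the datafusion integration itself): calling `unwrap()` on the `Err` would panic, while propagating it with `?` hands the error back to the caller as a `Result`.

```rust
/// Hypothetical stand-in for a schema lookup that fails on an empty table.
pub fn resolve_schema(is_empty: bool) -> Result<&'static str, String> {
    if is_empty {
        Err("Failed to resolve the latest schema: no file path found".to_string())
    } else {
        Ok("schema")
    }
}

/// Propagate the error with `?` so the caller decides how to react.
/// Writing `resolve_schema(is_empty).unwrap()` here instead would panic
/// on an empty table.
pub fn describe_table(is_empty: bool) -> Result<String, String> {
    let schema = resolve_schema(is_empty)?;
    Ok(format!("table with {schema}"))
}
```

A caller can then match on the result of `describe_table(true)` and report or recover from the error instead of crashing.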