
[Discussion] Reduce the need for `iceberg-rust` forks / how can we grow the community?

Open alamb opened this issue 3 months ago • 26 comments

First of all, I hope it goes without saying thank you so much for all your work on this crate 🙏

Problem Statement

Recently, I have heard from multiple people and projects that they are maintaining (or considering maintaining) forks or reimplementations of iceberg-rust because iceberg-rust is not yet suitable for them for some reason.

I think this speaks to a great demand for Iceberg functionality in Rust. Having multiple different forks / implementations is of course an option, but I also believe it means the same amount of effort will be split across multiple projects, meaning the available capacity for each will inevitably be smaller than if our efforts were combined somehow.

Selfishly, from my DataFusion 🎩 perspective, many projects are (understandably) interested in building systems that use DataFusion to read data stored in Apache Iceberg. I feel that the lack of a super-easy-to-use Iceberg integration crate is holding many of these projects back.

Unfortunately I don't have a list of things that would allow these projects to use iceberg-rust, but I hope to generate that list using this ticket.

Ideas to fix:

As I am only relaying this need second hand I don't have many specific ideas to share, and I hope people with firsthand challenges will comment directly on this issue.

One strong suspicion, from looking at the project commits, is that it might help to get some more capacity for reviews / docs / whatever (it appears that @liurenjie1024 is doing the bulk of reviews recently, and @Xuanwo has been helping too)

As always in open source projects, I think the most productive conversation is NOT "people asking the maintainers for things" but rather "how can the people asking for things also contribute to the solution"

Call to Action

Thus I would like to ask:

To people considering a fork:

  1. What specific challenges are you facing that mean you can't use iceberg-rust?
  2. What, if anything, are you willing to do / contribute to help get iceberg-rust to the point where you could use it directly?

To the existing maintainers:

  1. What would help you the most maintaining / moving this crate forward? How can we help?

alamb avatar Oct 28 '25 17:10 alamb

Thanks for bringing this up! I’m looking forward to hearing more feedback from our users about what they want. I strongly agree that a split community can be a big loss to all of us, so I really want to avoid forking becoming the final decision. If there is anything I can do, please let me know.


What would help you the most maintaining / moving this crate forward? How can we help?

The most limited resource for us is review. If experienced developers can help review PRs, it would be really helpful!

Xuanwo avatar Oct 28 '25 17:10 Xuanwo

Thanks for bringing this up @alamb, as I have recently taken a great interest in iceberg-rust (#1749). My Comet PR just today passed all Iceberg Java tests via iceberg-rust, so I think that the POC will eventually get merged as an experimental feature.

In my short time focusing on this repo:

  1. I have found the reviews to be high quality and the community very welcoming
  2. It seems in need of reviewers, so I am trying to spend some time giving review feedback (as a non-committer)
  3. I have concerns about how iceberg-rust will track DataFusion and Arrow-rs releases that might make it tough for Comet to keep up with DF, Arrow-rs, and possibly iceberg-rust. However, there is discussion in the Slack on how to better keep up in the future. This is one area I am happy to help contribute/review PRs for since I'm used to bumping DF and Arrow-rs for Comet

I am really excited about this repo, and the opportunity from my perspective to rapidly mature the ArrowReader via Comet and the Iceberg Java tests.

What am I willing to do?

  • Review as I find cycles.
  • Submit patches, mostly related to ArrowReader at the moment.

mbutrovich avatar Oct 28 '25 18:10 mbutrovich

I will share my thoughts as someone who has used iceberg-rust

Current Challenges

One complaint I have is that the Rust implementation relies on the Java implementation as a reference, as opposed to just implementing the spec in Rust. As a result there are a lot of legacy constructs in the iceberg-rust crate that are (in my view) Java-flavored, so trying to wield the crate within a Rust project isn't ergonomic at all, and because of that it's easy to shoot yourself in the foot.

There are a lot of complexities to Iceberg. The metadata serialization, handling, and validation planes are very tightly coupled in iceberg-rust. As an example, trying to implement something related to serialization requires you to understand the handling and validation as well. Not bad in a vacuum, but because the Java implementation serves as the guiding reference, I need to look at Iceberg Java code as a Rust dev in order to understand the context for why things are done the way they are in iceberg-rust, whilst also trying to understand the actual underlying behavior of the implementation. This creates friction for new contributors that I frankly just don't think should be there.

What am I willing to do?

I'm not sure. Personally the biggest roadblock to my ability to contribute is the community agreeing that the above is even a problem. I am sure the core maintainers have valid reasons for using the Java implementation as a reference, but it's something that does not make sense to me AT ALL considering that the entire purpose of having an Iceberg specification is to allow implementations to be agnostic in how they uphold its requirements. In my view there are several fundamental problems with the crate as a result of using the Java impl as the reference.

I would also like to point out that I lack a lot of historical/contextual Iceberg knowledge, so while I may view the above as a problem, I accept that the community might not agree with me for their own valid reasons.

Sl1mb0 avatar Oct 28 '25 19:10 Sl1mb0

Thank you for starting this conversation! I’d love to explore ways we can grow the community together. I’m happy to help in any way I can.

As a contributor to Iceberg

I’ve noticed that PR review can sometimes be a bottleneck for the project. I try to help where possible, but I’d definitely encourage more folks to get involved. For a fast-moving project like iceberg-rust, a more robust test suite could help accelerate reviews. In pyiceberg, we have a solid integration suite with Spark that might be worth bringing over. This could improve review cadence and give everyone more confidence to make forward progress.

Personally, I’m still learning Rust, so there are some language-related challenges for me when contributing to iceberg-rust. However, I’m happy to provide context from the Iceberg side whenever needed.

From a release perspective, we’ve invested a lot of effort in recent releases to automate and streamline the process, as demonstrated in the last release.

I’ll also repost this on the Iceberg dev list to get more visibility. EDIT: https://lists.apache.org/thread/6ovdxgb3dmyq8rwk7kdl5lsdgd1xl9fz

Would love to hear other perspectives from the community!

kevinjqliu avatar Oct 28 '25 19:10 kevinjqliu

I agree with @Sl1mb0 in that many abstractions feel like they are taken from the Java implementation and don't feel like idiomatic rust. The crate uses many classes where I think lightweight functions would suffice. This was one of the reasons why I stopped contributing to the repo some time ago.

And then there is the interoperability with the DataFusion/Arrow/ObjectStore ecosystem. The crate necessarily duplicates DataFusion functionality like expressions, file pruning, and Parquet reading and writing. This makes total sense for a general library. But if you want to use DataFusion, DataFusion currently just provides more feature-complete implementations.

JanKaul avatar Oct 28 '25 20:10 JanKaul

Hi, thanks for raising this. We maintain a fork at https://github.com/bauplanlabs/iceberg-rust, but only to stay ahead of patches being merged. The other fork I've seen used is RisingWave's, which I think is in a similar position (although they are more diverged).

We use iceberg-rust in production in concert with DataFusion, but we don't use the IcebergTableProvider directly, even though we would really like to. Instead, we use iceberg-rust just for fetching/pruning the manifest lists and then use DataFusion directly. This is awkward and error-prone, and we'd really like to avoid a hack like that. I think there are three particularly low-hanging fruit that would really make using IcebergTableProvider feasible for us:

  • Fixing the deadlock(s) in the read path: in particular this PR seems excellent and it has gotten zero attention: https://github.com/apache/iceberg-rust/pull/1486
  • Output partitioning: as I raised in slack, reading from IcebergTableProvider limits you to a single thread, which is obviously not very useful. Here are some benchmarks I created to demonstrate the issue. This issue was closed as not planned. I don't understand the issue 100%, so I hope I'm not misconstruing anything.
  • Taking advantage of DataFusion's parquet optimizations: unless I missed something, iceberg-rust doesn't use DataSourceExec/ParquetSource, which means we would automatically lose out on a lot of parquet optimizations already landed or being landed in DataFusion, like metadata caching. I don't understand if that's intended or not. Again, it's possible I'm misunderstanding, apologies if so.

We're more than willing to contribute fixes and features, and we have already, but the ones above are pretty intimidating for me to tackle without any help.

colinmarc avatar Oct 28 '25 20:10 colinmarc

One complaint I have is that the Rust implementation relies on the Java implementation as a reference, as opposed to just implementing the spec in Rust. As a result there are a lot of legacy constructs in the iceberg-rust crate that are (in my view) Java-flavored, so trying to wield the crate within a Rust project isn't ergonomic at all, and because of that it's easy to shoot yourself in the foot.

I'll second this - as a minor point, but something that would definitely help drive adoption were it fixed.

colinmarc avatar Oct 28 '25 20:10 colinmarc

I agree with @Sl1mb0 in that many abstractions feel like they are taken from the Java implementation and don't feel like idiomatic rust. The crate uses many classes where I think lightweight functions would suffice. This was one of the reasons why I stopped contributing to the repo some time ago.

I think this is true of a lot of Rust projects, particularly in the early days of a “<blah> but in Rust” project. Some of that is due to existing reference implementations, and some of that is due to there just being fewer Rust devs, who might be learning on these projects. I’m not dismissing your concerns at all, and it’s great to raise them; just affirming that I have definitely heard DataFusion described as “Rust written like C++.” We should strive to listen to the Rust community (I am still learning here) about what sort of idiomatic-Rust gaps exist in the library, and encourage folks to fix them.

mbutrovich avatar Oct 28 '25 21:10 mbutrovich

I think the Rust ecosystem needs something like DuckLake, maybe FusionLake? Last time I looked into IcebergTableProvider it looked like building a mini query engine or even a mini DBMS within the IcebergTableProvider itself. Is this smart? Those manifest lists and manifests and puffins and JSON files amount to a lot of transactionally mutable data; is it really smart to work with them as just plain files? Especially by DBMS folks? Every time, populate those Rust collections, do something, then discard everything per query? Or develop some sort of ORM cache just for Iceberg metadata? Then there is all that advanced functionality like branching operations which only works with Spark. Does Rust need to implement it? This is the real complexity!

Given that DataFusion can already natively query Avro and JSON, perhaps DataFusion itself can be used to produce a scan plan with advanced functionality, with most logic encoded in SQL or in a hand-crafted logical plan and not directly in Rust? This would be in line with the "canonical Iceberg implementation", but going further along with DuckLake (and also the Snowflake and Databricks implementations), perhaps DataFusion can use SQLite or Turso or any other OLTP DBMS to maintain its metadata, Iceberg or non-Iceberg, and then have it efficiently dump/ingest to/from canonical Iceberg metadata files, and then perhaps even provide IRC out of the box (with the help of Lakekeeper, for example) and federation with other IRCs. This would be a full Iceberg implementation on par with the "Snowbricks duo".

The thing with Iceberg is to separate "Iceberg as a data interchange format", where each transaction needs to generate those metadata files on S3 as part of each commit, from "Iceberg as an internal data format", where for most or all transactions no other engine needs to access the data, and it is wasteful to generate and regenerate all those numerous metadata files and then run housekeeping to remove them when the only engine accessing the data uses its own metastore.

camuel avatar Oct 29 '25 00:10 camuel

@camuel I think that's a bit beyond the scope of what DataFusion as a project wants to pick up and would fragment the ecosystem even more. But I do get your point that Iceberg can be complex, etc. Something that is likely much lower lift and would be a fun project would be to build a DuckLake + DataFusion system where there's a DuckLakeSchemaCatalog type thing so that DuckLake / DuckDB can control the schema catalog and table scans but DataFusion does the rest.

adriangb avatar Oct 29 '25 02:10 adriangb

I agree with many of the issues voiced by @Sl1mb0 and @JanKaul RE: the Rust/Java abstractions and the lack of integration into the DataFusion/Arrow/ObjectStore ecosystem. Similar to @colinmarc, we also maintain a fork at https://github.com/spiceai/iceberg-rust, where we mostly apply some of our own patches that we've submitted as PRs (1297, 917, 1673) but haven't yet been merged.

Our ideal state is not using the IcebergTableProvider that is provided by this project out of the box. We went through a similar exercise with delta-rs and found the maintenance burden of coordinating DataFusion versions to delta-rs versions with our usage of DataFusion to be very difficult. We really like the approach that the delta-kernel-rs team took with providing a good set of primitives that can be used during planning, which we then use to hook into the advanced Parquet reading capabilities that DataFusion has (i.e. ParquetExec, ParquetAccessPlan, object_store, etc).

So our wishlist would be:

  • A "kernel" (similar to what delta-kernel does) that separates the planning from execution and makes it easy to integrate into a custom query engine.
  • Allow using object_store for the kernel IO (ref: https://github.com/apache/iceberg-rust/issues/172) instead of OpenDAL, since we are already heavily invested in it.
  • A "reference" implementation of using the kernel (i.e. it could be IcebergTableProvider, but maybe just an example) that shows how to separate the planning of which files to read (and which rows to mask) with a deep integration into the DataFusion ParquetExec machinery. I think it's fine to leave the IcebergTableProvider as a "batteries-included" provider that does everything using OpenDAL, as long as we had the primitives above.

If there was appetite to take the project in more of this direction, we would definitely be interested in contributing.
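To make the "kernel" idea concrete, here is a toy, std-only sketch of the split the wishlist describes: planning produces plain data (scan tasks) from metadata, and execution is left entirely to the consumer. All names here (FileScanTask, plan_scan, ScanExecutor) are hypothetical; this is not the actual iceberg-rust or delta-kernel API.

```rust
// Planning output: plain data any engine can consume.
#[derive(Debug, Clone, PartialEq)]
struct FileScanTask {
    data_file_path: String,
}

/// Planning: pure metadata work. Input is (path, max value of some
/// column) pairs standing in for manifest-level column statistics.
fn plan_scan(files: &[(&str, i64)], min_value: i64) -> Vec<FileScanTask> {
    files
        .iter()
        .filter(|(_, max)| *max >= min_value) // prune files via stats
        .map(|(path, _)| FileScanTask {
            data_file_path: path.to_string(),
        })
        .collect()
}

/// Execution: a trait the engine implements however it likes
/// (DataFusion's Parquet machinery, a custom reader, ...).
trait ScanExecutor {
    fn execute(&self, task: &FileScanTask) -> u64;
}

struct DummyExecutor;

impl ScanExecutor for DummyExecutor {
    fn execute(&self, task: &FileScanTask) -> u64 {
        // A real engine would read the file; we just pretend.
        task.data_file_path.len() as u64
    }
}

fn main() {
    let files = [("a.parquet", 10), ("b.parquet", 100)];
    let tasks = plan_scan(&files, 50);
    // Only b.parquet survives pruning against min_value = 50.
    assert_eq!(tasks.len(), 1);
    assert_eq!(tasks[0].data_file_path, "b.parquet");
    let total: u64 = tasks.iter().map(|t| DummyExecutor.execute(t)).sum();
    println!("pruned to {} task(s), {} bytes of paths", tasks.len(), total);
}
```

The point of the shape, as with delta-kernel-rs, is that the planning side never dictates IO or execution, so a query engine can feed the tasks straight into its own optimized readers.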

phillipleblanc avatar Oct 29 '25 04:10 phillipleblanc

@camuel I think that's a bit beyond the scope of what DataFusion as a project wants to pick up and would fragment the ecosystem even more. But I do get your point that Iceberg can be complex, etc. Something that is likely much lower lift and would be a fun project would be to build a DuckLake + DataFusion system where there's a DuckLakeSchemaCatalog type thing so that DuckLake / DuckDB can control the schema catalog and table scans but DataFusion does the rest.

Thanks @adriangb for the elaboration. That would be a fun project for sure, and I think the utility could be in making DuckLake extensible in Rust, something not possible with DuckLake itself. Maybe it isn't too much effort to just reimplement DuckLake in Rust, not necessarily as part of the DataFusion project but using DataFusion the way DuckLake uses DuckDB. With round-trip metadata-only interop, it is a full Iceberg implementation, not a canonical one but still a fully compatible one. The bulk of the scan planning in DuckLake is done by SQL statements, not by DuckDB C++ code anyway.

camuel avatar Oct 29 '25 04:10 camuel

One strong suspicion, from looking at the project commits, is that it might help to get some more capacity for reviews / docs / whatever (it appears that @liurenjie1024 is doing the bulk of reviews recently, and @Xuanwo has been helping too)

There is indeed a need for more review capacity. The lack of review resources has slowed down progress on some issues. In this situation, forking might be a faster way to see results.

Thanks for bringing this up! I’m looking forward to hearing more feedback from our users about what they want. I strongly agree that a split community can be a big loss to all of us, so I really want to avoid forking becoming the final decision. If there is anything I can do, please let me know.

What would help you the most maintaining / moving this crate forward? How can we help?

The most limited resource for us is review. If experienced developers can help review PRs, it would be really helpful!

+1. There is a need for more review capacity. The lack of review resources has slowed down progress on some feature requests. In this situation, forking might be a faster way to see results.

ZENOTME avatar Oct 29 '25 05:10 ZENOTME

I am currently contributing to iceberg-go and have contributed to iceberg-rust and its ecosystem a while ago. I was also contributing to an iceberg-rust fork. I've been mostly on the metadata side of things, having been involved in multiple catalog efforts.

I think the reason why we use iceberg-java as the source of truth alongside the spec is that the spec is underspecified in some places, or iceberg-java's validations went beyond the spec in others, for example, these iceberg-go PRs: https://github.com/apache/iceberg-go/pull/575#discussion_r2370317519, https://github.com/apache/iceberg-go/pull/605#issuecomment-3448701495.

Ultimately, iceberg-java has the widest adoption and is the core of the iceberg integration of spark, trino, etc. A lot of users of iceberg-rust / iceberg-go will come from these and have tables written by those engines.

twuebi avatar Oct 29 '25 11:10 twuebi

@alamb I am willing to help write and review PRs.

What specific challenges are you facing that mean you can't use iceberg-rust?

The SDK is incomplete at the moment, so while I use the Rust SDK, I have to supplement it with the Java SDK. Issues get started but take a long time to finish, so it's hard to depend on it -- for example, https://github.com/apache/iceberg-rust/issues/1236

What, if anything, are you willing to do / contribute to help get iceberg-rust to the point where you could use it directly?

I am willing to help write and review PRs. I also want to focus on the correctness of the code, so let me know how we want to proceed.

gsoundar avatar Oct 29 '25 17:10 gsoundar

There are a few competing Rust implementations; would it make sense to consolidate those efforts as well?

milenkovicm avatar Oct 29 '25 17:10 milenkovicm

Another way to drive adoption is to make a well-defined Python binding. It will drive usage and therefore contributors to the project. In the delta-rs community there are plenty of contributors adding features and functionality because they want to use it in the Python world.

Cc @roeap for the kernel discussion above!

ion-elgreco avatar Oct 30 '25 07:10 ion-elgreco

We just maintain a very lightweight fork for Lakekeeper to be able to move quicker with unmerged / unreleased PRs. We also require access to some methods considered internal, such as building unsafe TableMetadata from its parts or accessing ManifestEntry.inherit_data.

My biggest pain point with iceberg-rust is its FileIO not being trait-based and not returning specific errors (i.e. a matchable file-not-found error). FileIO doesn't support all auth mechanisms, and I don't think it is useful to re-implement all the different IRSA / IMDS mechanisms of all hyperscalers. We are also missing refresh mechanisms for very long-running tasks. Because of this, at Lakekeeper, we are not using the iceberg-rust FileIO at all but instead use the hyperscaler frameworks behind a trait. Exactly this trait-based approach is also being discussed for iceberg-rust in this proposal.
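A minimal, std-only sketch of what such a trait-based storage layer with matchable errors could look like. The names (Storage, StorageError, MemStorage) are purely illustrative, not the actual iceberg-rust FileIO API or the proposal's design:

```rust
use std::collections::HashMap;

// Hypothetical error type with a variant callers can match on,
// instead of parsing error strings.
#[derive(Debug, PartialEq)]
enum StorageError {
    NotFound(String),
    Other(String),
}

// A trait-based abstraction: each backend (S3 SDK, Azure SDK, an
// in-house wrapper, ...) implements this behind its own auth logic.
trait Storage {
    fn read(&self, path: &str) -> Result<Vec<u8>, StorageError>;
    fn write(&mut self, path: &str, data: Vec<u8>) -> Result<(), StorageError>;
}

// In-memory implementation standing in for a real backend.
struct MemStorage {
    files: HashMap<String, Vec<u8>>,
}

impl Storage for MemStorage {
    fn read(&self, path: &str) -> Result<Vec<u8>, StorageError> {
        self.files
            .get(path)
            .cloned()
            .ok_or_else(|| StorageError::NotFound(path.to_string()))
    }
    fn write(&mut self, path: &str, data: Vec<u8>) -> Result<(), StorageError> {
        self.files.insert(path.to_string(), data);
        Ok(())
    }
}

fn main() {
    let mut io = MemStorage { files: HashMap::new() };
    io.write("metadata.json", b"{}".to_vec()).unwrap();
    assert_eq!(io.read("metadata.json").unwrap(), b"{}".to_vec());
    // The caller can distinguish "not found" from other failures,
    // e.g. to treat a missing metadata file as a specific condition.
    assert!(matches!(
        io.read("missing.avro"),
        Err(StorageError::NotFound(_))
    ));
}
```

With a trait object (`Box<dyn Storage>`) in place of a concrete FileIO, a catalog like Lakekeeper could plug in its own hyperscaler-specific auth and credential-refresh behavior without iceberg-rust having to implement every mechanism itself.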

Finally I will try to do more reviews again in the future.

c-thiel avatar Oct 30 '25 08:10 c-thiel

We recently forked at https://github.com/RelationalAI/iceberg-rust/. We plan to merge the changes that we make upstream, but we consider that best done outside of the critical path for us, hence the fork. The gap in features that we need is the incremental changelog scan. However, I anticipate that the biggest obstacle to merging that upstream is that it might not look exactly like the Spark changelog, hence I second the comments about reliance on the Java implementation.

vustef avatar Oct 30 '25 09:10 vustef

Thanks @alamb for raising this. As one of the authors and maintainers, it's great to see so much interest in this project. I absolutely agree with you that we should reduce the need for forks and grow this community. This is why we (me, @Xuanwo, @JanKaul) started this project in the Apache repo rather than maintaining our own forks; we believe this will unite contributors around the world to bring Iceberg to the Rust ecosystem.

To the existing maintainers:

  1. What would help you the most maintaining / moving this crate forward? How can we help?

As a maintainer, I would say that what we currently need is reviewers. You can see that there are a lot of pending PRs waiting for review. Also, after doing a lot of reviews, I have some suggestions for contributors to help us move faster:

  1. Split your PRs into small ones. Smaller PRs are typically easier to review, which means they are easier to get merged. This helps us move fast. As a guideline, PRs should be less than 500 additions (excluding Cargo.lock).
  2. Add a description to your PR. This helps maintainers understand your use case and motivation.
  3. For large features which can't fit into one PR, it's better to have a draft PR as a prototype, and after the community has reached consensus on the direction, we can split the larger one into smaller ones. Ideally, for complex features, we should have a design doc describing the overall design, which helps reviewers understand the PRs.

As an example of a complex feature, the community is currently working on #1382 (thanks @CTTY), which can be seen as an example of best practice.

liurenjie1024 avatar Oct 30 '25 14:10 liurenjie1024

As one of the authors, I also want to reply to some of the discussion of problems in the current design.

One complaint I have is that the Rust implementation relies on the Java implementation as a reference, as opposed to just implementing the spec in Rust. As a result there are a lot of legacy constructs in the iceberg-rust crate that are (in my view) Java-flavored, so trying to wield the crate within a Rust project isn't ergonomic at all, and because of that it's easy to shoot yourself in the foot.

I agree that in the early days some of the data structures were heavily inspired by Java. This is because at that time Java was the only reference implementation, and the most widely used. I would argue that this means Java's API design would typically be more future-proof, e.g. it has evolved over a long time to handle most cases. I'm not saying what we are doing currently is the best design, and in fact the community is open to suggestions for refactoring.

liurenjie1024 avatar Oct 30 '25 14:10 liurenjie1024

Our ideal state is not using the IcebergTableProvider that is provided by this project out of the box. We went through a similar exercise with delta-rs and found the maintenance burden of coordinating DataFusion versions to delta-rs versions with our usage of DataFusion to be very difficult. We really like the approach that the delta-kernel-rs team took with providing a good set of primitives that can be used during planning, which we then use to hook into the advanced Parquet reading capabilities that DataFusion has (i.e. ParquetExec, ParquetAccessPlan, object_store, etc).

So our wishlist would be:

  • A "kernel" (similar to what delta-kernel does) that separates the planning from execution and makes it easy to integrate into a custom query engine.

In fact, iceberg-rust is organized in a similar way. This repo contains several crates, which can be categorized as follows:

  • iceberg: This is similar to iceberg-core in Java, which provides a lot of compute-engine-independent building blocks, such as the planning API, transaction API, and data file readers/writers.
  • iceberg-catalog-*: These crates are concrete catalog implementations, so that users don't need to include all dependencies.
  • integrations: These are crates which provide integrations with different engines. Currently the main focus is DataFusion due to its extensibility.

  • Allow using object_store for the kernel IO (ref: https://github.com/apache/iceberg-rust/issues/172) instead of OpenDAL, since we are already heavily invested in it.

There is already an ongoing effort for this part, see https://github.com/apache/iceberg-rust/pull/1755 (thanks @CTTY).

  • A "reference" implementation of using the kernel (i.e. it could be IcebergTableProvider, but maybe just an example) that shows how to separate the planning of which files to read (and which rows to mask) with a deep integration into the DataFusion ParquetExec machinery. I think it's fine to leave the IcebergTableProvider as a "batteries-included" provider that does everything using OpenDAL, as long as we had the primitives above.

The DataFusion integration, e.g. IcebergTableProvider, could be used for this purpose.

liurenjie1024 avatar Oct 30 '25 14:10 liurenjie1024

Many great answers. I'm already seeing a growing community here. I don't think forks are a big issue if they don't diverge. We can open more Discussions / Epics to align on directions and roadmaps for topics like the separation of planning and execution. I'm a big fan of what @alamb has been practicing for DataFusion.

manuzhang avatar Oct 31 '25 02:10 manuzhang

Thank you all for joining this discussion.

This thread hasn’t been active for a while, so let me summarize. I’ll also list the next steps as issues. Some of these are already comments from @liurenjie1024, but putting them all in one place makes it easier to read and track.

Please note that this summary includes my personal opinions and preferences; it’s not an official response from the iceberg-rust community.

Review is currently the bottleneck

Many people mentioned that review is the current bottleneck. That’s also my personal feeling. Some suggested we should engage more people to join reviews and encourage contributors to split PRs into smaller chunks to make them easier to review. Those suggestions are absolutely true and valid.

But I also feel we can adapt our review style somehow. Instead of requiring contributors to address all issues, we can encourage them to create follow-ups (note: these don’t need to be finished by the same person). Some of these follow-ups make great first issues that can help more people join the community.

I created an issue https://github.com/apache/iceberg-rust/issues/1815 to track this.

Iceberg Rust is too Java-like

This concern is also true: iceberg-rust itself feels too much like Java.

It’s partly down to the history of how iceberg-rust got started. The early core group of contributors consisted of two kinds of people: one group familiar with Iceberg but not yet experienced in Rust, who wanted to learn Rust by contributing to iceberg-rust; the other group familiar with Rust but knowing very little about Iceberg, who wanted to learn more about Iceberg through contributing.

So in either case, following the same pattern from Java wasn’t a bad option. Otherwise, we wouldn’t have been able to build iceberg-rust into its current shape.

The other reason, as @Twuebi and @liurenjie1024 mentioned, is that although Iceberg has a great spec and the spec intends to be language-agnostic and allow different languages to have their own implementations, that never truly happened. In reality, many things aren’t defined or clearly spelled out in the Iceberg spec, so we have to refer to Iceberg-java’s current behavior to make things work.

I’m not here to defend ourselves. It’s just the reality we’re facing. To improve in this area, we need to find a way to make our implementation correct and gradually improve it to be more friendly to Rust users.

I’ve opened this issue https://github.com/apache/iceberg-rust/issues/1816 to collect feedback on coding style.

Iceberg Kernel

I think it’s a great idea to split iceberg-kernel so we can separate the planning and execution stages.

We can follow delta-rs’s pattern to build it, providing a good abstraction for users to implement their own engine. In this kernel, we’ll let users plug in their own file IO and execution engine.

I’ve started this issue https://github.com/apache/iceberg-rust/issues/1817 to track progress on iceberg-kernel.

Feature requests

Some people mentioned specific issues like:

  • https://github.com/apache/iceberg-rust/pull/1486
  • https://github.com/apache/iceberg-rust/issues/1604

We can revisit them when needed.


All the issues I mentioned previously have been collected in tracking issue https://github.com/apache/iceberg-rust/issues/1818. Feel free to add more feedback. We’ll track it in the same issue. We’ll keep this issue open until it’s no longer an issue.

Thank you everyone for joining the discussion and sharing your honest feedback again.

Xuanwo avatar Nov 03 '25 12:11 Xuanwo

@Xuanwo it looks like the OpenDAL / object-store discussion (https://github.com/apache/iceberg-rust/issues/172) didn't continue to make progress. Can we add it to the tracking issue https://github.com/apache/iceberg-rust/issues/1818 too?

Java-like and OpenDAL are two of Spice AI's main concerns.

lukekim avatar Nov 05 '25 23:11 lukekim

Hi @lukekim, we are still making progress on adopting a custom storage layer like object-store; please see:

  • https://github.com/apache/iceberg-rust/issues/1314
  • https://github.com/apache/iceberg-rust/pull/1755
  • Design doc: https://docs.google.com/document/d/1-CEvRvb52vPTDLnzwJRBx5KLpej7oSlTu_rg0qKEGZ8/edit?tab=t.dgr4vjtmzh92#heading=h.xhzuq2u2mr64

CTTY avatar Nov 06 '25 07:11 CTTY