Matthias Roels comments

Results: 30 comments of Matthias Roels

Backlog: add ModuleStorage to support packaging flow code along with flow code dependencies

Absolutely agree with the comment from Anna! A couple of remarks from a release management perspective though: storing flow code in S3 or e.g. GitHub instead of a container image...

Backlog: add ModuleStorage to support packaging flow code along with flow code dependencies

Alternatively, it could also be useful to be able to do something similar to https://github.com/anna-geller/packaging-prefect-flows/blob/master/flows_no_build/docker_script_kubernetes_run_custom_ecr_image.py in Orion…

Snyk can't scan kaniko-produced images

Is there any further update on this issue?

Easier CustomDataset Creation

That’s exactly what I always do: I use credentials as env vars. However, it would still be nice to be able to inject them in a dataset (think db credentials...
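To make the env-var approach concrete, here is a minimal sketch using only the standard library. `DbCredentials`, `credentials_from_env`, `TableDataset` and the `DB_USER` / `DB_PASSWORD` variable names are all hypothetical and only for illustration; this is not how Kedro resolves or injects credentials.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DbCredentials:
    """Hypothetical container for database credentials (illustration only)."""

    user: str
    password: str


def credentials_from_env(prefix="DB", env=None):
    """Build credentials from <prefix>_USER and <prefix>_PASSWORD.

    `env` defaults to os.environ; passing a plain dict makes testing easy.
    """
    env = os.environ if env is None else env
    keys = (f"{prefix}_USER", f"{prefix}_PASSWORD")
    missing = [key for key in keys if key not in env]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return DbCredentials(user=env[keys[0]], password=env[keys[1]])


class TableDataset:
    """Hypothetical dataset that receives credentials through its constructor."""

    def __init__(self, table, credentials):
        self.table = table
        self._credentials = credentials

    def describe(self):
        # Deliberately leaves the password out of anything that might be logged.
        return {"table": self.table, "user": self._credentials.user}
```

The dataset never reads the environment itself here; the caller builds the credentials once and passes them in, which is roughly the "inject them in a dataset" idea from the comment.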

Rewrite the Polars datasets to not rely on fsspec unnecessarily

> Interesting to learn about why Rust chose object store instead of a general filesystem interface.

@noklam: It is explained in the [docs](https://docs.rs/object_store/latest/object_store/#why-not-a-filesystem-interface).

Rewrite the Polars datasets to not rely on fsspec unnecessarily

I did some experiments to see where we are in using plain vanilla `read_*`, `scan_*` and `write_*` operations on object stores (for a local filesystem, they work as expected). -...

polars.LazyPolarsdataset .collect() streaming

As per my comment [here](https://github.com/kedro-org/kedro-plugins/issues/702#issuecomment-2195299774), I wouldn't recommend using streaming or `sink_*` methods. Even when using `.collect(streaming=True)`, it is explicitly mentioned [in the docs](https://docs.pola.rs/api/python/stable/reference/api/polars.LazyFrame.sink_parquet.html) that streaming mode is considered unstable.

[kedro-datasets] Upgrade to PySpark >= 3.4, Pandas >= 2 in `test_requirements.txt`
