Implement Bring Your Own DataFrame (BYOD) Strategy

Open MariusWirtz opened this issue 3 months ago • 2 comments

Summary

Implement a Bring-Your-Own-DataFrame (BYOD) strategy in TM1py to support both pandas and polars as interchangeable DataFrame backends.

Description

Currently, TM1py functions that work with tabular data rely on pandas DataFrames. To increase flexibility and performance, these functions should be able to accept either pandas or polars DataFrames as input and return the same type as output.

Both pandas and polars should be optional dependencies to keep the core installation lightweight.
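
One possible way to keep both libraries optional is to import them only when a DataFrame function is actually called. The sketch below is an illustration, not existing TM1py code: `require_backend` is a hypothetical helper name, and the `tm1py[...]` install hint assumes the extras proposed in this issue end up being named after the packages.

```python
import importlib
from types import ModuleType


def require_backend(name: str) -> ModuleType:
    """Import an optional DataFrame backend on demand (hypothetical helper).

    Nothing is imported at module load time, so the core package can be
    installed and used without pandas or polars being present.
    """
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        # Assumption: the extras are named after the package they pull in.
        raise ImportError(
            f"'{name}' is required for this operation. "
            f"Install it with: pip install tm1py[{name}]"
        ) from exc
```

A DataFrame function would then call `require_backend("pandas")` or `require_backend("polars")` at the point of use, so users who never touch DataFrames never pay for the import.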

Proposed Changes

  • Update all functions that handle DataFrames (e.g., write_dataframe, execute_view_dataframe) to:
    • Accept either pandas or polars DataFrames.
    • Preserve the user’s chosen DataFrame type in outputs.
  • Add lightweight detection logic to determine which backend is being used.
  • Introduce optional dependencies in setup.py (e.g., tm1py[pandas], tm1py[polars]).
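
As a rough sketch of what the detection logic could look like: the function below inspects the class of the object it receives instead of importing pandas or polars, so it works even when only one of them is installed. `detect_backend` is a hypothetical name, and matching on the class name `DataFrame` plus the top-level package is an assumption about how the check might be done, not a decision made in this issue.

```python
from typing import Any


def detect_backend(df: Any) -> str:
    """Return "pandas" or "polars" for a DataFrame, without importing either.

    Walks the class hierarchy so that subclasses of a pandas or polars
    DataFrame are recognised as well.
    """
    for cls in type(df).__mro__:
        # Top-level package that defines this class, e.g. "pandas" or "polars".
        root = cls.__module__.split(".", 1)[0]
        if cls.__name__ == "DataFrame" and root in ("pandas", "polars"):
            return root
    raise TypeError(
        f"Expected a pandas or polars DataFrame, got {type(df).__name__}"
    )
```

Functions that return tabular data could use the detected backend of their input (or an explicit argument, when there is no input frame) to decide which DataFrame type to build.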

Motivation

Preliminary testing shows promising results with polars:

  • ~10% faster end-to-end write operations.
  • ~20% lower memory usage when handling large datasets.

This approach enables users to choose their preferred DataFrame engine without sacrificing TM1py’s ease of use.

Example

import pandas
import polars

# tm1 is assumed to be an already connected TM1Service instance

# Using pandas
df = pandas.DataFrame(...)
tm1.cubes.cells.write_dataframe(df, use_blob=True)

# Using polars
df = polars.DataFrame(...)
tm1.cubes.cells.write_dataframe(df, use_blob=True)

Benefits

  • Improved performance and memory efficiency for large workloads.
  • Greater flexibility for developers using different DataFrame ecosystems.
  • Backward compatibility with existing pandas-based code.

Next Steps

  • Identify all functions currently requiring pandas DataFrames.
  • Abstract common DataFrame operations (indexing, melting, etc.) to backend-neutral utilities.
  • Add test coverage for both backends.
  • Update documentation accordingly.
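
To illustrate what a backend-neutral utility might look like, here is a minimal sketch that turns either DataFrame type into a list of row dictionaries. The names `_backend_of` and `frame_to_records` are hypothetical; the only library calls used are pandas' `DataFrame.to_dict(orient="records")` and polars' `DataFrame.to_dicts()`.

```python
from typing import Any, Dict, List


def _backend_of(df: Any) -> str:
    """Return the top-level package ("pandas" or "polars") that defines df's DataFrame class."""
    for cls in type(df).__mro__:
        root = cls.__module__.split(".", 1)[0]
        if cls.__name__ == "DataFrame" and root in ("pandas", "polars"):
            return root
    raise TypeError(f"Unsupported DataFrame type: {type(df).__name__}")


def frame_to_records(df: Any) -> List[Dict[str, Any]]:
    """Convert a pandas or polars DataFrame into a list of {column: value} rows."""
    backend = _backend_of(df)
    if backend == "pandas":
        # pandas: one dict per row, keyed by column name
        return df.to_dict(orient="records")
    # polars: to_dicts() returns the same row-wise shape
    return df.to_dicts()
```

Other shared operations (indexing, melting, etc.) could follow the same pattern: one public function, one branch per backend. Tests for such utilities could be parametrized over both libraries so each behaviour is checked once per backend.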

MariusWirtz avatar Nov 07 '25 09:11 MariusWirtz

Hi @MariusWirtz, just curious.

LazyFrame is a powerful feature in Polars that makes it fundamentally different from pandas. Are there any plans or thoughts to support it in some use cases, such as importing huge datasets into TM1 or exporting from TM1?

Cubewise-JoeCHK avatar Nov 07 '25 10:11 Cubewise-JoeCHK

For the exchange in either direction, the DataFrame would have to be fully materialised, I guess. I'm not sure lazy computation is a great fit here. We will publish a draft branch soon. Let's get some stats on memory and performance!

MariusWirtz avatar Nov 07 '25 10:11 MariusWirtz