Evaluate and improve the WRITE performance of On Demand Feature View

Open shuchu opened this issue 1 year ago • 1 comments

Expected Behavior

The On Demand Feature View has implementations that use either Python native objects or Pandas objects as input/output. We want to evaluate their performance in terms of run time. We also want to understand where the bottlenecks are at the code-execution level.

Current Behavior

Steps to reproduce

Specifications

  • Version: 0.37.1
  • Platform: Linux/Ubuntu
  • Subsystem:

Possible Solution

Use cProfile with the existing unit test functions.
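A minimal sketch of what that could look like, using only the standard library. `transform_native` and `TransformTest` are hypothetical stand-ins invented for illustration, not Feast code; a real run would point `profile_test_case` at the project's existing test cases.

```python
import cProfile
import io
import pstats
import unittest


def transform_native(rows):
    """Hypothetical stand-in for a transformation on native Python objects."""
    return [{"total": r["a"] + r["b"]} for r in rows]


class TransformTest(unittest.TestCase):
    """Hypothetical unit test, used here only to have something to profile."""

    def test_transform(self):
        rows = [{"a": i, "b": 2 * i} for i in range(10_000)]
        out = transform_native(rows)
        self.assertEqual(out[3]["total"], 9)


def profile_test_case(test_case, top_n=10):
    """Run a TestCase under cProfile; return (test result, stats report text)."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case)
    profiler = cProfile.Profile()
    profiler.enable()
    # Send the runner's own output to a buffer so it does not clutter the report.
    result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top_n)
    return result, buffer.getvalue()


if __name__ == "__main__":
    result, report = profile_test_case(TransformTest)
    print(report)
```

Sorting by cumulative time tends to surface the call path that dominates the test, which is a reasonable starting point for finding the bottleneck.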

shuchu avatar May 17 '24 03:05 shuchu

A quick run shows that Pandas performs much worse than Python native objects.

[Screenshots: 2024-05-18 231742, 2024-05-18 231751]

shuchu avatar May 19 '24 03:05 shuchu

Example code for cProfile, whose output can be viewed with snakeviz:

import cProfile

from scipy.optimize import minimize


def main():
    # c.f. https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html
    func = lambda x: (x[0] - 1) ** 2 + (x[1] - 2.5) ** 2
    x0 = (2, 0)

    constraints = (
        {"type": "ineq", "fun": lambda x: x[0] - 2 * x[1] + 2},
        {"type": "ineq", "fun": lambda x: -x[0] - 2 * x[1] + 6},
        {"type": "ineq", "fun": lambda x: -x[0] + 2 * x[1] + 2},
    )

    bounds = ((0, None), (0, None))

    result = minimize(func, x0, method="SLSQP", bounds=bounds, constraints=constraints)
    print(f"result:\n{result}")
    print(f"best fit parameters: {result.x}")


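if __name__ == "__main__":
    # Profile main() and dump the collected stats to a file.
    profiler = cProfile.Profile()
    profiler.enable()
    main()
    profiler.disable()
    # View the result in the browser with: snakeviz example.prof
    profiler.dump_stats("example.prof")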
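If snakeviz is not available, a dumped profile can also be read as text with the standard library's `pstats`. The sketch below assumes a stats file written by `dump_stats`; `summarize_profile` and `busy_work` are hypothetical names, and the demo writes its own throwaway profile rather than touching `example.prof`.

```python
import cProfile
import io
import os
import pstats
import tempfile


def summarize_profile(path, top_n=10, sort_key="cumulative"):
    """Return a text report of the top_n entries in a dumped profile file."""
    buffer = io.StringIO()
    stats = pstats.Stats(path, stream=buffer)
    # strip_dirs() shortens file paths so the report is easier to scan.
    stats.strip_dirs().sort_stats(sort_key).print_stats(top_n)
    return buffer.getvalue()


def busy_work(n):
    """Hypothetical workload, used only to generate a profile for the demo."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        prof_path = os.path.join(tmp_dir, "demo.prof")
        profiler = cProfile.Profile()
        profiler.runcall(busy_work, 100_000)
        profiler.dump_stats(prof_path)
        print(summarize_profile(prof_path))
```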

franciscojavierarceo avatar Jun 06 '24 03:06 franciscojavierarceo

Done here: https://github.com/franciscojavierarceo/Python/pull/23

Pandas

image

Python

image

Comparison

Local performance:

  • Pandas runtime = 8.65 ms
  • Python runtime = 0.907 ms

Nearly 10x reduction in processing time.
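For reference, the ratio behind that figure, plus a rough sketch of how per-call times in milliseconds could be measured with `timeit`. The two constants are the runtimes reported above; `per_call_ms`, `speedup`, and the demo workload are hypothetical helpers, not the code from the linked PR.

```python
import timeit

# Runtimes reported in this thread, in milliseconds.
PANDAS_MS = 8.65
PYTHON_MS = 0.907


def per_call_ms(func, number=1_000, repeat=5):
    """Best-of-`repeat` average time per call, converted to milliseconds."""
    # timeit.repeat returns total seconds for `number` calls, once per repeat.
    best_total_s = min(timeit.repeat(func, number=number, repeat=repeat))
    return best_total_s / number * 1_000


def speedup(slow_ms, fast_ms):
    """How many times faster the fast variant is."""
    return slow_ms / fast_ms


if __name__ == "__main__":
    print(f"speedup: {speedup(PANDAS_MS, PYTHON_MS):.2f}x")    # 9.54x
    print(f"time reduction: {1 - PYTHON_MS / PANDAS_MS:.1%}")  # 89.5%
    demo_ms = per_call_ms(lambda: sum(range(1_000)), number=200)
    print(f"demo workload: {demo_ms:.4f} ms per call")
```

Since `timeit` reports seconds, the explicit conversion makes the unit unambiguous when comparing runs.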

franciscojavierarceo avatar Jun 08 '24 01:06 franciscojavierarceo

Really cool. Is the unit right? ~1 ms or ~10 ms is super fast :)

HaoXuAI avatar Nov 21 '24 22:11 HaoXuAI