Evaluate and improve the WRITE performance of On Demand Feature View
Expected Behavior
The On Demand Feature View has two implementations: one that takes Python native objects as input/output and one that takes Pandas objects. We want to evaluate the runtime of each. We also want to find where the time is spent, down to the level of individual function calls.
Current Behavior
Steps to reproduce
Specifications
- Version: 0.37.1
- Platform: Linux/Ubuntu
- Subsystem:
Possible Solution
Use cProfile with the existing unit test functions.
A quick run shows that the Pandas implementation is much slower than the Python native one.
Example code for cProfile with snakeviz:
import cProfile

from scipy.optimize import minimize


def main():
    # c.f. https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html
    func = lambda x: (x[0] - 1) ** 2 + (x[1] - 2.5) ** 2
    x0 = (2, 0)
    constraints = (
        {"type": "ineq", "fun": lambda x: x[0] - 2 * x[1] + 2},
        {"type": "ineq", "fun": lambda x: -x[0] - 2 * x[1] + 6},
        {"type": "ineq", "fun": lambda x: -x[0] + 2 * x[1] + 2},
    )
    bounds = ((0, None), (0, None))
    result = minimize(func, x0, method="SLSQP", bounds=bounds, constraints=constraints)
    print(f"result:\n{result}")
    print(f"best fit parameters: {result.x}")


if __name__ == "__main__":
    profiler = cProfile.Profile()
    profiler.enable()
    main()
    profiler.disable()
    profiler.dump_stats("example.prof")
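A dumped profile can also be read without a viewer. The following is a minimal sketch using only the standard library (pstats); `workload`, `dump_profile`, and `summarize` are hypothetical names for illustration and are not part of Feast. It writes a profile to a temporary file, then prints the entries with the highest cumulative time.

```python
import cProfile
import io
import os
import pstats
import tempfile


def workload():
    # Hypothetical stand-in for the code under test (not Feast code).
    return sum(i * i for i in range(10_000))


def dump_profile(fn, path):
    """Profile a single call to fn and write the raw stats to path."""
    profiler = cProfile.Profile()
    profiler.enable()
    fn()
    profiler.disable()
    profiler.dump_stats(path)


def summarize(path, top_n=10):
    """Return a text report of the top_n entries sorted by cumulative time."""
    buffer = io.StringIO()
    stats = pstats.Stats(path, stream=buffer)
    stats.strip_dirs().sort_stats("cumulative").print_stats(top_n)
    return buffer.getvalue()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        prof_path = os.path.join(tmp, "example.prof")
        dump_profile(workload, prof_path)
        print(summarize(prof_path, top_n=5))
```

For the interactive view, the same file can be opened with `snakeviz example.prof`.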
Done here: https://github.com/franciscojavierarceo/Python/pull/23
Pandas
Python
Comparison
Local performance: Pandas runtime = 8.65 ms, Python runtime = 0.907 ms.
Nearly 10x reduction in processing time.
really cool. Is the unit right? ~1ms or ~10ms is super fast :)
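One way to double-check the unit is to time many calls with timeit and convert to milliseconds per call explicitly. This is only a sketch: `native_transform` is a hypothetical stand-in, not the actual On Demand Feature View code, and `per_call_ms` and `speedup` are helper names made up for illustration. The speedup line uses only the two runtimes reported above.

```python
import timeit


def per_call_ms(fn, number=1000, repeat=5):
    """Best-of-repeat average wall time per call, in milliseconds."""
    # timeit.repeat returns one total (in seconds) per repeat,
    # each covering `number` calls of fn.
    totals = timeit.repeat(fn, number=number, repeat=repeat)
    return min(totals) / number * 1000.0


def speedup(slow_ms, fast_ms):
    """Ratio of the slower runtime to the faster one."""
    return slow_ms / fast_ms


def native_transform():
    # Hypothetical stand-in for a Python native transformation.
    values = list(range(100))
    return [v + 1 for v in values]


if __name__ == "__main__":
    print(f"native_transform: {per_call_ms(native_transform):.4f} ms per call")
    # Runtimes reported in this thread: Pandas 8.65 ms, Python 0.907 ms.
    print(f"reported speedup: {speedup(8.65, 0.907):.2f}x")
```

Taking the minimum over several repeats reduces noise from other processes, so the per-call figure is a lower bound rather than a typical latency.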

