simpleeval icon indicating copy to clipboard operation
simpleeval copied to clipboard

Low performance

Open balintbarna opened this issue 3 years ago • 9 comments

I love what this library does but I have a bit of an issue. I have a use-case where I need to "evaluate" the same user-defined expression in a fast loop with a few dynamic input parameters that might change on each iteration.

I have tried both the native built-in eval and the safe eval from this library with a test scenario, evaluating the same expression with a few input parameters 100000 times, and I get a 50-60% worse performance with the safe eval. I think that's a good trade-off, however, with the unsafe eval I can take a shortcut:

If I put the expression into a lambda and evaluate it that way only once, I can use the lambda to call it on each subsequent occasion with the current parameters. This is over an order of magnitude faster than re-evaluating the expression every time.

Do you think it would be possible to implement a lambda evaluator in this library that would yield similar performance gains? Do you have an approach in mind for implementing lambdas?

balintbarna avatar Nov 25 '22 17:11 balintbarna

That is an interesting question!

And probably would be a good thing to mention performance vs native eval somewhere in the docs too.

Would https://github.com/danthedeckie/simpleeval/pull/115 be the kind of thing you're thinking of?

danthedeckie avatar Feb 08 '23 06:02 danthedeckie

I will have to test the performance with the new feature but it looks like it would be a great option.

balintbarna avatar Feb 08 '23 11:02 balintbarna

@danthedeckie I used a short test code to compare the 2 methods.

Results on 0.9.12:

2.75s call     <...>/test_evaluators.py::test_safe_evaluator
0.08s call     <...>/test_evaluators.py::test_lambda_evaluator

On 0.9.13 with the preparsing:

1.03s call     <...>/test_evaluators.py::test_safe_evaluator
0.08s call     <...>/test_evaluators.py::test_lambda_evaluator

Looks like the performance is improved a lot with preparsing but it's still over an order of magnitude slower than the lambda hack with the standard eval unfortunately.

ghost avatar Feb 10 '23 13:02 ghost

Looks like the performance is improved a lot with preparsing but it's still over an order of magnitude slower than the lambda hack with the standard eval unfortunately.

That's really encouraging that it's that much faster! Thank you!

To make it even faster still, I have a few ideas, but nothing easy, alas.

One really complex, but potentially massive improvement would be to compile to python bytecode, rather than step through the AST every time. But that's a lot of work.

There may well be a lot of smaller wins possible too - it's not really been designed with performance in mind. It would be really good to start a performance / benchmarking suite alongside the test suite.

danthedeckie avatar Feb 14 '23 18:02 danthedeckie

Yeah, and to be clear, I think this is already faster than just using eval every time, the eval solution is only this fast because of wrapping the expression in a lambda, parsing that, then only making subsequent calls to the pre-parsed function. So maybe it would be possible to have the same performance benefits through implementing lambda evaluation in this library, if that can be done without repeated calls to eval. I could post my test code later if that's helpful for setting up some performance benchmarks.

ghost avatar Feb 15 '23 09:02 ghost

That does sound really interesting - if you could post it that would be great!

danthedeckie avatar Feb 17 '23 09:02 danthedeckie

@danthedeckie

"""
run with: `pytest --durations=0 test_evaluators.py`
"""
# std
from typing import Any, Dict, Protocol, Sequence, Type
# 3rd party
from simpleeval import DEFAULT_OPERATORS, DEFAULT_NAMES, EvalWithCompoundTypes


# interface


class Evaluator(Protocol):
    def __init__(self, builtins: Dict[str, Any], keys: Sequence[str], expr: str): ...
    def evaluate(self, **kw) -> Any: ...


# implementations


class RegularEvaluator:
    def __init__(self, builtins: Dict[str, Any], keys, expr):
        self.builtins = builtins
        self.expr = expr

    def evaluate(self, **kw):
        # pylint: disable=eval-used
        return eval(self.expr, {'__builtins__': self.builtins, **kw})


class LambdaEvaluator:
    def __init__(self, builtins: Dict[str, Any], keys, expr):
        self.keys = keys
        params = ', '.join(keys)
        constructed_expr = f'lambda {params}: {expr}'
        self.callable = eval(constructed_expr, {'__builtins__': builtins})  # pylint: disable=eval-used

    def evaluate(self, **kw):
        names = {k: v for k, v in kw.items() if k in self.keys}
        return self.callable(**names)


class SafeEvaluator:
    def __init__(self, builtins: Dict[str, Any], keys, expr):
        self.evaluator = EvalWithCompoundTypes(
            operators=DEFAULT_OPERATORS,
            functions={**DEFAULT_NAMES, **builtins},
        )
        self.parsed = self.evaluator.parse(expr)
        self.expr = expr

    def evaluate(self, **kw):
        self.evaluator.names = kw
        return self.evaluator.eval(self.expr, previously_parsed=self.parsed)


# helper functions


def try_evaluator(cls: Type[Evaluator]):
    evaluator = cls(
        builtins=dict(
            str=str,
            result=2,
            joe=lambda: 'Joe'
        ),
        keys=[
            'x',
            'y',
        ],
        expr='" ".join([str(x + y - 1 == result and joe() == "Joe" or True is None), str(False)])',
    )
    for _ in range(100000):
        assert evaluator.evaluate(x=1, y=2) == "True False"


# tests


def test_regular_evaluator():
    try_evaluator(RegularEvaluator)


def test_lambda_evaluator():
    try_evaluator(LambdaEvaluator)


def test_safe_evaluator():
    try_evaluator(SafeEvaluator)

ghost avatar Feb 17 '23 12:02 ghost

Thanks @bbk-riact

danthedeckie avatar Feb 17 '23 12:02 danthedeckie

Thanks @danthedeckie for implementing the separate .parse() feature. I love it! :) Pre-parsing the AST and reusing it really speeds up the processing. I've decided to use an automatic cache which remembers the last-used expression and last-used AST tree. If the if expression != cached_expression, I call .parse() with the new expression. Works wonderfully. <3

Arcitec avatar Jan 31 '24 19:01 Arcitec

closing this because of inactivity

balintbarna avatar Jun 25 '24 19:06 balintbarna