[Feature]: Add complexity calculation for user defined expression

Open OsAmaro opened this issue 2 years ago • 5 comments

Feature Request

Hi. I've recently started using PySR and I would like to suggest a new feature that I think would make the code even more user-friendly.

Would it be possible to have more direct access to the function that computes the complexity such that one can compare expressions found by PySR and those found in the literature?

For example: model.complexity('1 + x_0 + x_1**2')

This would allow the user to easily map the expressions found in the literature on the complexity vs accuracy plots.

Thank you in advance.

OsAmaro avatar May 28 '23 10:05 OsAmaro

That's a good idea. This would probably have to be done by calling out to the Julia backend's compute_complexity function, and using jl.eval() to evaluate the expression.

It is a bit tricky because we would have to overload the user-defined operators to work for the expression type so that 1 + x_0 becomes an expression object. And there are some difficulties because you wouldn't want, e.g., 1 + 2 to automatically simplify to 3 before evaluation of the complexity.

Another option is to do the complexity calculation in pure Python, but that would add maintenance burden and also cause some issues, because some of the SymPy operators are mapped to multiple primitive operators (e.g., cos2(x)=cos(x^2) would count as cos and ^2, even though it is a single operator).
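
To make the pure-Python option concrete, here is a minimal sketch that counts nodes in an unsimplified expression using Python's standard `ast` module. It is not PySR's implementation: the helper name `count_nodes` is hypothetical, and it assumes a uniform cost of 1 for every operator, variable, and constant. Because the string is only parsed and never evaluated, 1 + 2 stays as three nodes. It also shows the mismatch described above: cos(x_0**2) is counted as cos, **, x_0 and 2, with no notion of a single fused operator.

```python
import ast


def count_nodes(expression: str) -> int:
    """Count operator, variable, and constant nodes without simplifying."""
    tree = ast.parse(expression, mode="eval")

    def visit(node: ast.AST) -> int:
        if isinstance(node, ast.BinOp):
            # Binary operator: one node plus both operands.
            return 1 + visit(node.left) + visit(node.right)
        if isinstance(node, ast.UnaryOp):
            # Unary operator such as negation: one node plus its operand.
            return 1 + visit(node.operand)
        if isinstance(node, ast.Call):
            # A call such as cos(x) counts as one operator node plus its arguments.
            return 1 + sum(visit(arg) for arg in node.args)
        if isinstance(node, (ast.Name, ast.Constant)):
            # Variables and numeric constants are leaves.
            return 1
        raise ValueError(f"unsupported syntax: {ast.dump(node)}")

    return visit(tree.body)


print(count_nodes("1 + x_0 + x_1**2"))  # 7
print(count_nodes("1 + 2"))             # 3 (not folded to a single constant)
print(count_nodes("cos(x_0**2)"))       # 4
```

The numbers from a sketch like this only match the backend if the operator set and per-operator complexities are the same, which is one reason calling the Julia function directly is the safer route.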

MilesCranmer avatar May 28 '23 17:05 MilesCranmer

Should be much easier after the PR #429 passes. Perhaps we could make a function to convert a string into a SymbolicRegression.jl equation (via the use of @extend_operators on sr_options_).

MilesCranmer avatar Sep 17 '23 15:09 MilesCranmer

@OsAmaro – #430 fixes this. Could you please try it out?

MilesCranmer avatar Sep 17 '23 16:09 MilesCranmer

Hey @MilesCranmer,

I think this method works! I tried this PR on Docker for some examples and it seemed consistent. Appreciate your work. Ideally one would bypass the .fit entirely and just define the PySRRegressor model, but this is already very helpful! Many thanks!

Cheers, Óscar

OsAmaro avatar Sep 19 '23 12:09 OsAmaro

I could potentially define a method that runs all the setup steps involved in .fit without running the actual equation search. Right now .fit starts Julia, creates the Julia options, and imports the backend, but those steps could perhaps be refactored into a separate method...

(Any help appreciated, as professor life is quite busy 🙂 )

MilesCranmer avatar Sep 19 '23 13:09 MilesCranmer

Okay, the required functionality has now been implemented as #564.

First, see https://github.com/MilesCranmer/PySR/discussions/550#discussioncomment-8842600 for how to create user-defined expressions in PySR.

Next, you can get complexity as follows:

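from pysr import jl  # Julia handle exposed by PySR (assumes a juliacall-based PySR version)

model  # PySRRegressor that has already been fit, thus having `.julia_options_`
tree   # Expression you have defined by hand (see the discussion linked above)

jl.compute_complexity(tree, model.julia_options_)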

MilesCranmer avatar Mar 24 '24 23:03 MilesCranmer