Add `fit_transform`?
See discussion at #16
I think for transformers it would make sense to require exactly two methods:
- `fit_transform`
- `transform`
No need for a separate fit implementation for transformers. When an ML pipeline is first fit, all of the transformers in the pipeline have to do a fit and a transformation, so I don't think fit needs to be separate. And of course using fit_transform allows for optimizations, as @davidbp mentioned.
The signature of fit_transform would probably look like this:
```julia
fitted_transformer, Xout = fit_transform(transformer, Xin)
```
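To make the optimization argument concrete, here is a hypothetical sketch of what an implementation might look like. The names (`Standardizer`, `FittedStandardizer`) are made up for illustration and are not part of any proposed API; the point is that `fit_transform` can reuse work that `transform(fit(...), ...)` would repeat:

```julia
using Statistics

# Illustrative standardizer: `fit_transform` computes the learned statistics
# and the transformed data in a single pass over `X`.
struct Standardizer end

struct FittedStandardizer
    mean::Float64
    std::Float64
end

transform(f::FittedStandardizer, X) = (X .- f.mean) ./ f.std

function fit_transform(::Standardizer, X)
    m = mean(X)
    s = std(X; corrected=false)
    transformed = (X .- m) ./ s   # computed once here; a separate
                                  # `transform` call would traverse `X` again
    return FittedStandardizer(m, s), transformed
end
```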
Thanks for that suggestion.
> No need for a separate fit implementation for transformers.
I dunno. It seems like a non-trivial complication to the API. No separate fit means some special-casing in model composition. In MLJ we expect every model to have a fit and this is pretty central to the learning network stuff. (If you can stomach it, see our paper here). Conceptually, predict and transform are treated very similarly - they're just functions depending on a learned parameter that you generate with fit.
If fit_transform is just sugar for transform(fit(...)) then I don't think it's justified in a basic interface. Every name we add to the namespace should work hard to justify its existence. Can you think of a use case where the optimisations gained are significant? I can imagine one avoids some data conversions (like DataFrame -> matrix -> DataFrame), but with the right "data front end" (which I'm still thinking about) this issue would have a workaround.
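For the record, "just sugar" would mean a single generic fallback suffices. A hypothetical sketch (the toy `Shift` transformer exists only to exercise the fallback):

```julia
# Generic fallback: any transformer implementing `fit` and `transform`
# gets `fit_transform` for free, with no per-transformer code.
function fit_transform(strategy, X)
    model = fit(strategy, X)
    return model, transform(model, X)
end

# Toy transformer (illustrative names, not part of any proposal):
struct Shift
    by::Float64
end
fit(s::Shift, X) = s                  # nothing to learn
transform(s::Shift, Xnew) = Xnew .+ s.by
```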
[I know some have argued for just one "operation" (predict, say) for added simplification, but this goes too far, in my view. In the current LearnAPI proposal, predict is distinguished by the fact that the output is a (proxy for a) "target", a general notion we make reasonably precise in the docs, and we enable dispatch on the type of proxy. transform need not have this interpretation, but can have an "inverse". As we see from sk-learn and MLJ, allowing algorithms to implement more than one operation (predict / transform / inverse_transform) is both natural and useful.]
Okay, here's a variation on your idea that doesn't require adding to the namespace. Each transformer implements one transform and one fit:
Case 1: static (non-generalizing) transformers
```julia
fit(strategy, X) -> model  # storing `transformed_X` and any inspectable byproducts of algorithm
transform(model) -> model.transformed_X
```
with a convenience fallback
```julia
transform(strategy, X) = transform(fit(strategy, X))
```
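A minimal sketch of Case 1 under this scheme, using a made-up static transformer (`LogTransform` and its field names are illustrative only):

```julia
# Static (non-generalizing) transformer: all the work happens in `fit`,
# which stores the transformed data; `transform` just retrieves it.
struct LogTransform end

struct LogModel
    transformed_X::Vector{Float64}
end

fit(::LogTransform, X) = LogModel(log.(X))
transform(model::LogModel) = model.transformed_X

# the convenience fallback from the text:
transform(strategy::LogTransform, X) = transform(fit(strategy, X))
```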
Case 2: generalizing transformers:
```julia
fit(strategy, X) -> model  # storing `transformed_X` and `learned_parameters` and any inspectable byproducts of algorithm
transform(model, Xnew) -> transformed_Xnew  # uses `model.learned_parameters`
```
with a convenience fallback
```julia
transform(strategy, X) = fit(strategy, X).transformed_X
```
I'm not sure we'd want to keep a reference to an intermediate transformed data set in a trained transformer. That would prevent the garbage collector from freeing that memory as long as the pipeline is still around.
It also feels conceptually a little muddy, but that's just a feeling that I haven't been able to put into more concrete terms yet. :)
Case 2: generalizing transformers:
```julia
fit(strategy, X) -> model  # storing `learned_parameters` and any inspectable byproducts of algorithm
transform(model, Xnew) -> transformed_Xnew  # uses `model.learned_parameters`
```
with a convenience fallback
```julia
transform(strategy, X) = transform(fit(strategy, X), X)
```
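A sketch of this revised Case 2, where the fitted model holds only learned parameters and no intermediate data set (so nothing pins the training output in memory). `Centerer` and its field are illustrative names, not part of any proposal:

```julia
using Statistics

# Generalizing transformer: `fit` learns parameters only;
# `transform` applies them to new data.
struct Centerer end

struct CentererModel
    mean::Float64   # the `learned_parameters`
end

fit(::Centerer, X) = CentererModel(mean(X))
transform(model::CentererModel, Xnew) = Xnew .- model.mean

# the convenience fallback from the text:
transform(strategy::Centerer, X) = transform(fit(strategy, X), X)
```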
Hmm, that form doesn't allow for optimizations, but, as you said, maybe there aren't really that many fit-then-transform cases that get a large benefit from optimizations.
In #30 an implementation can explicitly overload `transform(strategy, data)` to provide a one-shot method with no issues. I think providing a universal fallback is a bad idea, as it could lead to type instabilities, and confusion in debugging ("hidden knowledge").
On dev, a learner can implement `transform(learner, X)` as shorthand for `transform(fit(learner, X), X)`, or `transform(fit(learner), X)` for "static models".