linfa Adding Multi-Task ElasticNet support

The goal of this PR is to add multi-task ElasticNet to the elasticnet crate.

A quick roadmap:

[x] Write block coordinate descent
[x] Write dual gap for multi-task
[x] Write tests for BCD
[x] Make CD variable names consistent with variable names in BCD
[x] Use a for loop when updating the residuals in CD
[ ] Z-score + confidence level + variance
[ ] Adapt Linfa ElasticNet API to the multi-task case
[x] Write tests for ElasticNet multi-task

Jan 19 '22 19:01 PABannier

@YuhanLiin I'm implementing the Fit trait for the multi-task ENET. However, I don't know how to handle the multi-task case with the Linfa API. I created a MultiTaskElasticNet struct, but since both ElasticNet and MultiTaskElasticNet have the same set of parameters, I didn't create a MultiTaskElasticNetValidParams.

My question is: how should I restrict the trait bounds so that the Fit trait is implemented for multi-task datasets?

Jan 19 '22 21:01 PABannier

Is it possible to make functions like coordinate_descent, duality_gap, variance_params, and compute_intercept generic across 1D and 2D arrays? That way we won't need 2 sets of similar helper functions to handle the single-task and multi-task cases.

Jan 23 '22 01:01 YuhanLiin

@YuhanLiin Thanks for all the remarks! I addressed your comments in the latest commit.

As for making coordinate_descent generic across 1D and 2D arrays, I don't think it is a good idea. coordinate_descent and block_coordinate_descent rely on two different proximal operators, and block_coordinate_descent has additional for loops designed specifically for the multi-task case. IMHO, a wiser choice would be to keep coordinate_descent and block_coordinate_descent separate and use them as backbones for different penalties in the single-task and multi-task cases. My intuition is that a future PR could make these two functions generic over the regularization used in a model. For now they only support L1 + L2, but more complex penalties are very useful as well (non-convex ones, for instance; see SCAD or MCP).
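To make the "two different proximal operators" point concrete, here is a minimal sketch in plain Rust (std only, no ndarray). It assumes the usual textbook operators: scalar soft-thresholding for the L1 penalty in the single-task case, and block soft-thresholding for the row-wise L2 norm penalty in the multi-task case. The names soft_threshold and block_soft_threshold are hypothetical and not taken from the linfa code, and the L2 (ridge) part of the update is left out.

```rust
/// Scalar soft-thresholding: proximal operator of `t * |x|`.
/// Applied to one coefficient at a time (single-task case).
fn soft_threshold(x: f64, t: f64) -> f64 {
    x.signum() * (x.abs() - t).max(0.0)
}

/// Block soft-thresholding: proximal operator of `t * ||w||_2`.
/// Applied to a whole row of coefficients, one entry per task
/// (multi-task case). The row is either zeroed out entirely or
/// shrunk towards zero while keeping its direction.
fn block_soft_threshold(w: &[f64], t: f64) -> Vec<f64> {
    let norm = w.iter().map(|v| v * v).sum::<f64>().sqrt();
    if norm <= t {
        // Also covers norm == 0, so the division below is always safe.
        return vec![0.0; w.len()];
    }
    let scale = 1.0 - t / norm;
    w.iter().map(|v| v * scale).collect()
}
```

The scalar operator thresholds each coefficient independently, whereas the block operator acts on the norm of the full row, which is why a feature is selected or discarded for all tasks at once.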

duality_gap is also very specific to the single-task or the multi-task case, and introducing a match would make things more confusing IMHO.

For variance_params and compute_intercept I can certainly make it generic over 1D or 2D arrays.
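As a rough illustration of what "generic over 1D or 2D" could look like for compute_intercept, here is a sketch in plain Rust. It rests on several assumptions: compute_intercept is taken to return the mean of the targets together with the centered targets, Vec<f64> and Vec<Vec<f64>> (rows are samples, columns are tasks) stand in for ndarray's 1D and 2D arrays, inputs are non-empty, and the CenterTargets trait is hypothetical. The actual crate would more likely be generic over an ndarray dimension type.

```rust
/// Hypothetical trait: targets that can be centered. The intercept type
/// depends on the target shape (a scalar for 1D, one value per task for 2D).
trait CenterTargets: Sized {
    type Intercept;
    /// Returns the mean over samples and the centered targets.
    fn compute_intercept(&self) -> (Self::Intercept, Self);
}

// Single-task case: one target value per sample.
impl CenterTargets for Vec<f64> {
    type Intercept = f64;
    fn compute_intercept(&self) -> (f64, Vec<f64>) {
        let mean = self.iter().sum::<f64>() / self.len() as f64;
        (mean, self.iter().map(|y| y - mean).collect())
    }
}

// Multi-task case: one row per sample, one column per task.
impl CenterTargets for Vec<Vec<f64>> {
    type Intercept = Vec<f64>;
    fn compute_intercept(&self) -> (Vec<f64>, Vec<Vec<f64>>) {
        let n_tasks = self.first().map_or(0, |row| row.len());
        let mut means = vec![0.0; n_tasks];
        for row in self {
            for (m, y) in means.iter_mut().zip(row) {
                *m += y;
            }
        }
        let n_samples = self.len() as f64;
        for m in means.iter_mut() {
            *m /= n_samples;
        }
        let centered: Vec<Vec<f64>> = self
            .iter()
            .map(|row| row.iter().zip(&means).map(|(y, m)| y - m).collect())
            .collect();
        (means, centered)
    }
}

/// A caller can then be written once for both cases.
fn center<T: CenterTargets>(y: &T) -> (T::Intercept, T) {
    y.compute_intercept()
}
```

The associated type lets a single generic caller receive a scalar intercept in the single-task case and a per-task vector in the multi-task case.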

Jan 23 '22 18:01 PABannier

Making variance_params and compute_intercept generic would be great.

Jan 31 '22 02:01 YuhanLiin

Since #206 has been merged, ElasticNet is now easier to adapt to the multi-task case. I'm still working on it. Roadmap before merging:

[ ] Make variance_params generic for single task and multi-task
[ ] Make compute_intercept generic for single task and multi-task
[ ] Write tests for MultiTaskElasticNet (Lasso + Ridge...)

Mar 20 '22 13:03 PABannier

Work continued in #238

Aug 14 '22 21:08 YuhanLiin