
Interfacing CeSS

Open paulflang opened this issue 5 years ago • 22 comments

I am working with a model for which multistart optimization does not find the ground truth optimum after 100 runs and 2.3 days of computation (with most found local minima being different from each other). However, CeSS global + adjoint sensitivity based local search finds the ground truth optimum after 1.3 days. I was therefore wondering if there are any plans to implement CeSS in pyPESTO (or parPE; see corresponding issue there).

paulflang · Apr 18 '20 00:04

Might indeed be interesting, but it's not directly next on the list... Which CeSS implementation do you have in mind? There are several, most of them from CSIC and the group of Julio Banga and Eva Balsa-Canto... It should be possible to interface some of them with pyPESTO rather easily, I think...

paulstapor · Apr 18 '20 15:04

It is planned to interface the one by Banga et al. at some point (currently only gradient-based local and gradient-free global optimizers are supported in pyPESTO). Or which eSS implementation do you use?

yannikschaelte · Apr 21 '20 14:04

Thanks @yannikschaelte for getting back. I was using the Banga implementation. More specifically I used this code to run CeSS with some minor modifications.

paulflang · Apr 23 '20 09:04

Problem here is that this is all Matlab code, and afaik there is no Python version of it available... How to deal with that?

  • Recode ourselves?
  • Talk to Julio and ask for a Python implementation?
  • Use the R version of MEIGO via an R-to-Python interface, which is, however, likely less well maintained?
  • Look for another implementation of scatter search, in Python?
  • Something else?

paulstapor · May 13 '20 20:05

Something else?

Calling Matlab from Python?

paulflang · May 13 '20 20:05

Something else?

Calling Matlab from Python?

Not so sure we want to do this: Matlab needs a license... Might work for the moment, but not sure how good this is in the long run... On the other hand: interfacing Matlab from Python would allow us to directly compare Matlab optimizers with the ones from Python... :thinking:

paulstapor · May 13 '20 20:05

https://bitbucket.org/DavidPenas/sacess-library/src/master/ looks like good old C to me, not matlab.

FFroehlich · May 13 '20 20:05

https://bitbucket.org/DavidPenas/sacess-library/src/master/ looks like good old C to me, not matlab.

yeah, saCeSS goes way beyond (C)eSS... and relies on MPI afaik, and is made for large computing clusters... Obviously, we can also interface this... I would guess, however, that this will come with some implementation problems... If we can interface saCeSS, though, this would be really cool...

paulstapor · May 13 '20 20:05

https://bitbucket.org/DavidPenas/sacess-library/src/master/ looks like good old C to me, not matlab.

Includes good old Fortran as well :D

dweindl · May 13 '20 20:05

On the other hand: interfacing Matlab from Python would allow us to directly compare Matlab optimizers with the ones from Python...

Having access to the Matlab fmincon optimizers alone would be awesome for comparability/benchmarking, but I rather doubt we will be able to offer continuous support/maintenance for Matlab based tools.

yannikschaelte · May 13 '20 21:05

Problem here is that this is all Matlab code, and afaik there is no Python version of it available... How to deal with that?

  • Recode ourselves?
  • Talk to Julio and ask for a Python implementation?
  • Use the R version of MEIGO via an R-to-Python interface, which is, however, likely less well maintained?
  • Look for another implementation of scatter search, in Python?
  • Something else?

There is an R implementation, and a Python port from R, offered by MEIGO. But before we look into that, I agree it should be clarified whether the R version is as well maintained as the Matlab one.

yannikschaelte · May 13 '20 21:05

Just asked: Julio said that one way might be using the interface to meigoR, pymeigo, which was written by collaborators of his at the EBI. meigoR and Matlab-Meigo were at least at the same level when MEIGO was released (which was 2009), but the last update to the Bioconductor package was 2019... Alternatively, Julio also suggested going for saCeSS and interfacing C and Fortran90... And he pointed out that they have augmented metaheuristics to hyperheuristics, which works like the metaheuristics approach but adds an additional level where different global optimizers (not only a GA approach) are employed... However, this is written in Spark...

As pymeigo can, however, be installed via PyPI, I think it might indeed be easiest to test this first...

paulstapor · May 14 '20 07:05

Both would be good imho. The latest release of pymeigo is from 2013 https://pypi.org/project/pymeigo/#history ... In particular, this version probably does not yet allow gradient-based optimization using e.g. AMICI, but only self-computed finite differences. So, it would be great as a derivative-free optimization tool (one of the best performing), but not yet applicable as a global gradient-based scheme, if I see it correctly.

yannikschaelte · May 14 '20 08:05

https://bitbucket.org/DavidPenas/sacess-library/src/master/ looks like good old C to me, not matlab.

Oh, and if somebody decides to look into sacess code, I am happy to discuss. Would be interesting to have in parPE as well (https://github.com/ICB-DCM/parPE/issues/102).

dweindl · May 14 '20 08:05

In particular, this version probably does not yet allow gradient-based optimization using e.g. AMICI

From my perspective, I have only tested eSS in combination with adjoint sensitivity analysis for fitting our cell cycle model. Not sure if self-computed finite differences would substantially reduce fitting performance.

Would it be hard to change the MEIGO R code to allow gradient-based optimization?

paulflang · May 14 '20 08:05

In particular, this version probably does not yet allow gradient-based optimization using e.g. AMICI

From my perspective, I have only tested eSS in combination with adjoint sensitivity analysis for fitting our cell cycle model. Not sure if self-computed finite differences would substantially reduce fitting performance.

They definitely perform substantially worse: finite differences are numerically less stable, and adjoints in particular are much faster.
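For intuition, the numerical fragility of finite differences is easy to demonstrate on a toy objective (a generic sketch, not pyPESTO or MEIGO code): as the step size shrinks, the forward-difference error first decreases (truncation error) and then grows again (floating-point cancellation), so there is no universally safe step size.

```python
import numpy as np

# Toy illustration of why finite differences are numerically delicate:
# forward-difference error vs. step size for f(x) = exp(x) at x = 1,
# where the analytic derivative is exp(x) itself.
f = np.exp
x0 = 1.0
analytic = np.exp(x0)

errors = {}
for h in (1e-2, 1e-8, 1e-12):
    fd = (f(x0 + h) - f(x0)) / h  # forward finite difference
    errors[h] = abs(fd - analytic)

# A moderate step (~1e-8) is near-optimal here; a much smaller step is
# dominated by cancellation, a much larger one by truncation error.
```

An adjoint-based gradient sidesteps this trade-off entirely, at a cost that is essentially independent of the number of parameters.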

Would it be hard to change the MEIGO R code to allow gradient-based optimization?

Shouldn't be too difficult. Depends on how it's implemented -- whether it allows to plug in own local optimizers, or whether the implemented/interfaced optimizers (like fmincon in Matlab-Meigo, where we did exactly that) allow to easily pass gradients.
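To illustrate what "plugging in an own local optimizer" could look like, here is a minimal sketch (all names are hypothetical, not the MEIGO API): the scatter-search driver only needs a callable mapping a start point to a refined point and objective value, so a gradient-based solver such as scipy's L-BFGS-B can be wrapped to accept an analytic (e.g. AMICI-computed) gradient.

```python
import numpy as np
from scipy.optimize import minimize

def make_local_solver(fun, grad, bounds):
    """Wrap a gradient-based solver behind the plain signature
    x0 -> (x_opt, f_opt) that a scatter-search driver could call."""
    def solve(x0):
        res = minimize(fun, x0, jac=grad, bounds=bounds, method="L-BFGS-B")
        return res.x, res.fun
    return solve

# Example: quadratic objective with its analytic gradient.
fun = lambda x: float(np.sum((x - 3.0) ** 2))
grad = lambda x: 2.0 * (x - 3.0)

local = make_local_solver(fun, grad, bounds=[(-10, 10)] * 2)
x, f = local(np.zeros(2))  # converges to (3, 3)
```

The design point is that the driver never needs to know how the gradient is computed; only the wrapper does.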

yannikschaelte · May 14 '20 08:05

Has anybody ever looked at the actual code? I would imagine that just reimplementing some of the features in Python wouldn't be too much hassle, might be much easier to maintain, and given that nobody seems to maintain the other code, Julio might also be interested.

FFroehlich · May 14 '20 13:05

This is why I brought up the question whether we might want to just recode this in the first place... :smile: Should be only a few hundred lines of code, highly useful for others if well implemented as a separate toolbox, we could include ideas from the latest improvements (such as the hyperheuristics concept), etc. At least the concept isn't overly complex, in principle. Many things may depend on the precise implementation, however...

paulstapor · May 14 '20 14:05

I had a look at the eSS_kernel.m and essR.R scripts. They are about 1000 lines of code. In addition, they call a bunch of other functions that would need to be reimplemented, too. So in total it would be several thousand lines, I guess.

Nevertheless, if you come to the conclusion that reimplementing these in Python is the best way forward, I could try contributing to this. Some difficulties I envision are that I am not yet familiar with the optimizers implemented in the different languages and the quirks of speeding up Python code.

I have not yet had a look at the hyperheuristics in saCeSS(2).

paulflang · May 14 '20 15:05

I had a look at the eSS_kernel.m and essR.R scripts. They are about 1000 lines of code. In addition, they call a bunch of other functions that would need to be reimplemented, too. So in total it would be several thousand lines, I guess.

eSS_kernel is the main routine and the longest. Some of the other routines just call the local solvers and consist of rather longish if-elseif switches that handle the options for those solvers... And eSS_kernel has many blank lines and 200 lines of comments at the beginning... ;)

paulstapor · May 14 '20 17:05

The main idea of eSS is

  1. Create a random initial population (called the refset).
  2. Perform some steps of a genetic algorithm (GA).
  3. Choose a "balanced" set of points from the refset from which to run local optimizations with loose tolerances ("balanced" means: take some of the best points and some far away).
  4. Create an improved refset after this "refinement" step and perform some more GA steps.
  5. Choose some new start points for a final round of local optimization with stricter tolerances.

The main questions are, I think, how to update the refset in the GA, how to choose the points for the local optimizations, and how to create the improved refset... It would be some work, indeed, but it's not impossible, I think...
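The outline above can be sketched in a few dozen lines. This is a deliberately simplified toy (assuming numpy and scipy, and not the MEIGO implementation): it collapses the GA step into a greedy recombination loop, omits the balanced multi-point selection, and runs the local refinement only from the best refset member.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

def ess_sketch(fun, lb, ub, refset_size=10, n_iter=20):
    """Toy scatter-search loop: refset init, recombination, local refinement."""
    dim = len(lb)
    # 1. random initial refset
    refset = rng.uniform(lb, ub, size=(refset_size, dim))
    fvals = np.array([fun(x) for x in refset])
    for _ in range(n_iter):
        # 2. recombination: combine each member with a random partner
        order = np.argsort(fvals)
        refset, fvals = refset[order], fvals[order]
        for i in range(refset_size):
            j = rng.integers(refset_size)
            d = (refset[j] - refset[i]) / 2.0
            child = np.clip(refset[i] + rng.uniform(-1, 1) * d, lb, ub)
            fc = fun(child)
            if fc < fvals[i]:  # greedy replacement keeps the refset improving
                refset[i], fvals[i] = child, fc
    # 3./5. local refinement from the best point found
    res = minimize(fun, refset[np.argmin(fvals)],
                   bounds=list(zip(lb, ub)), tol=1e-8)
    return res.x, res.fun

# usage: 2-D Rosenbrock, global minimum at (1, 1)
def rosenbrock(x):
    return float((1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2)

xbest, fbest = ess_sketch(rosenbrock, np.array([-5.0, -5.0]),
                          np.array([5.0, 5.0]))
```

The real algorithm's subtleties live precisely in the pieces this sketch glosses over: the refset update rules, the balanced selection, and the go-beyond/stalling heuristics.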

paulstapor · May 14 '20 18:05

I guess these questions would become clear when reimplementing the code.

I am probably not in the position to judge what is the best way forward, but I still tried to coarsely summarize the pros and cons of the main options in the following table.

|                      | effort     | maintenance  | useful in pyPESTO | useful in parPE | useful for other developers |
|----------------------|------------|--------------|-------------------|-----------------|-----------------------------|
| using pyMEIGO        | low*       | poor         | yes               | no              | no                          |
| reimplementing MEIGO | moderate** | self-managed | yes               | no              | maybe                       |
| using saCeSS2#       | high       | good         | yes               | yes             | no                          |

*depends on how to interface with AMICI
**depends also on whether hyperheuristics are implemented
#would be beyond my capabilities to contribute

paulflang · May 15 '20 19:05