NetRAX icon indicating copy to clipboard operation
NetRAX copied to clipboard

Switch to Search in Waves

Open lutteropp opened this issue 5 years ago • 9 comments

In the whiteboard meeting today, we discovered some challenges with the current network topology search algorithm.

  • The arc insertion move neighborhood is way too large. Also, ideally one would allow each proposed arc insertion move to do some extra search with only horizontal moves, before ruling it out via BIC.
  • Instead, if we search in waves and fix the number of reticulations in each (horizontal) wave, we don't even need to directly find the best position for an arc insertion anymore. It will automatically get sorted out by the horizontal rearrangement moves later, avoiding that costly O(n^3) arc insertion neighborhood.

lutteropp avatar Dec 08 '20 11:12 lutteropp

Done in https://github.com/lutteropp/NetRAX/commit/dd6dd2a04507a5bd5f3629fc0d0f39775fdf3ff3.

lutteropp avatar Dec 09 '20 12:12 lutteropp

I entirely rewrote it in https://github.com/lutteropp/NetRAX/commit/fa5057591b8baa1aaf7a47c75d592ce03cba1df4. NetRAX is still fighting with failing assertions here and there (some branch lengths issue, likely caused by me trying to switch to only use the branch_lengths array in network inference/ likely broke something there)...

But it already looking like this search strategy finds the network much faster. Also, it only has to start from raxml-ng best tree.

lutteropp avatar Dec 09 '20 14:12 lutteropp

Search in waves rocks! :) Here a few more arguments for it, taken from the Slack channel:

  • It is faster than the naive algo

  • It works pretty well, even when used with just a single start tree

  • It is more likely to accept a reticulation (it gives it a fair chance by re-optimizing topology for the current number of reticulations before deciding. This is, it only checks via BIC after the current wave search finished)

  • We can use whatever likelihood function we want in there. Heck, we could even reintroduce that NEPAL likelihood function, to be used within each wave! And then we can use the (slower to compute) "correct" likelihood definition just whenever we have to decide between networks of different complexity...

lutteropp avatar Dec 09 '20 14:12 lutteropp

I highly advocate for exclusively using the search in waves strategy, at least in the first paper. It has so much optimization potential! I especially like that it allows us to (as a heuristic) reuse the old blob optimization code etc (NEPAL likelihood) within the waves...

lutteropp avatar Dec 09 '20 14:12 lutteropp

So far, search in waves just randomly adds a reticulation to the new network for the next wave (the within-wave horizontal moves are then used to reposition the reticulation). Later (for the second paper), we can try to speed this up by making use of the ideas from the whiteboard meeting (pdf attached) to find positions where a reticulation is likely more beneficial... Whiteboard Meeting.pdf

lutteropp avatar Dec 09 '20 14:12 lutteropp

Another reason why search in waves might be working so much more nicely right now: Only adding/removing reticulations messes up the pmatrix- and clv-indices (because they change number of nodes/edges). The search in waves could just be better at avoiding some bugs that can happen when not correctly taking care of these index changes...

lutteropp avatar Dec 09 '20 15:12 lutteropp

that all sounds great Sarah :-)

On 09.12.20 17:01, Sarah Lutteropp wrote:

Another reason why search in waves might be working so much more nicely right now: Only adding/removing reticulations messes up the pmatrix- and clv-indices (because they change number of nodes/edges). The search in waves could just be better at avoiding some bugs that can happen when not correctly taking care of these index changes...

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/lutteropp/NetRAX/issues/30#issuecomment-741827245, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAGXB6TMY6CV4T6FAVTNY73ST6GLVANCNFSM4UR2EP2A.

-- Alexandros (Alexis) Stamatakis

Research Group Leader, Heidelberg Institute for Theoretical Studies Full Professor, Dept. of Informatics, Karlsruhe Institute of Technology

www.exelixis-lab.org

stamatak avatar Dec 09 '20 18:12 stamatak

So waves is the strategy in this drawing ?

Screenshot 2020-12-18 at 09 41 57

If yes, I totally agree that we go for this, and I thought that I had already advocated for it at the beginning of the project (in slack so I am not sure and we will never fund the info, github is super!)

It is also discussed in the "moves" paper with Fabio.

celinescornavacca avatar Dec 18 '20 08:12 celinescornavacca

@celinescornavacca Yes! :) NetRAX fully switched to the search in waves now. I have also added an arrow back, to check whether an arc removal in the best found n+1-reticulations network leads to a better n-reticulation-network than what we encountered so far. If this is the case, redo the n+1-reticulation-search from that updated/improved n-reticulation network.

lutteropp avatar Jan 12 '21 09:01 lutteropp