Speeding up primes2bnet

Open PauBadiaM opened this issue 1 year ago • 1 comments

Hi @hklarner,

First, let me tell you how happy I am that this package exists. The implementation is clean and the docs are excellent. I really appreciate it!

I am analyzing several Boolean networks and I've observed that the computational bottleneck always seems to be the conversion of bnet rules to primes. Sometimes the conversion takes more than an hour. For example, this "simple" network takes ~4 seconds to process.

ARID5B,    ARID5B
BACH2,     BACH2
BCL11A,    BCL11A
HHEX,      HHEX
IKZF3,     IKZF3
MXD1,      MXD1
SPIB,      SPIB
TCF4,      TCF4
XBP1,      XBP1
EBF1,      TCF4 | SPIB | SOX5 | PAX5 | IKZF3 | EBF1 | BCL11A | BACH2 | ARID5B
PAX5,      !MXD1 & TCF4 & !XBP1 | !MXD1 & SPIB & !XBP1 | !MXD1 & PAX5 & !XBP1 | IKZF3 & !MXD1 & !XBP1 | HHEX & !MXD1 & !XBP1 | EBF1 & !MXD1 & !XBP1 | BCL11A & !MXD1 & !XBP1 | BACH2 & !MXD1 & !XBP1 | ARID5B & !MXD1 & !XBP1
SOX5,      !MXD1 & !XBP1
POU2AF1,   SPIB | SOX5 | PAX5 | EBF1 | BCL11A

Is there a rule of thumb that I can follow to minimize this overhead when using bnet2primes? 🤔 My rules always follow this format: a collection of activators joined with |s and a collection of inhibitors joined with &s. What would be your advice? Many thanks!

Feb 06 '25 10:02 PauBadiaM

Hi Paul!

The conversion of a bnet file to a primes object is done by a custom C++ program called BnetToPrimes that follows a standard approach, namely the enumeration and merging of minterms, see

The algorithm is restarted locally for each node of your network, so the total running time is the sum of the local running times. For each node, the running time grows exponentially with the number of variables the node depends on. In your example network the node PAX5 has an in-degree of 11, namely the node set

{'BCL11A', 'EBF1', 'HHEX', 'BACH2', 'XBP1', 'ARID5B', 'SPIB', 'IKZF3', 'TCF4', 'PAX5', 'MXD1'}

and this is responsible for the ~4 seconds.
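To spot such nodes before starting a conversion, a rough sketch like the following can list the in-degree of every node in a bnet text. It is not part of the package: the function name `in_degrees` is hypothetical, and the parser assumes simple `target, rule` lines whose variable names start with a letter or underscore.

```python
import re

def in_degrees(bnet_text):
    """Map each node to the set of variables its update rule mentions."""
    deps = {}
    for line in bnet_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not "target, rule".
        if not line or line.startswith("#") or "," not in line:
            continue
        target, rule = (part.strip() for part in line.split(",", 1))
        if target == "targets":  # optional "targets, factors" header line
            continue
        # Assumption: variables look like identifiers; constants 0/1 do not match.
        deps[target] = set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", rule))
    return deps

example = """
PAX5,      !MXD1 & TCF4 & !XBP1 | !MXD1 & SPIB & !XBP1 | !MXD1 & PAX5 & !XBP1 | IKZF3 & !MXD1 & !XBP1 | HHEX & !MXD1 & !XBP1 | EBF1 & !MXD1 & !XBP1 | BCL11A & !MXD1 & !XBP1 | BACH2 & !MXD1 & !XBP1 | ARID5B & !MXD1 & !XBP1
SOX5,      !MXD1 & !XBP1
POU2AF1,   SPIB | SOX5 | PAX5 | EBF1 | BCL11A
"""

# Report nodes from highest to lowest in-degree; k inputs mean 2**k input combinations.
for node, inputs in sorted(in_degrees(example).items(), key=lambda kv: -len(kv[1])):
    k = len(inputs)
    print(f"{node}: in-degree {k}, {2 ** k} input combinations")
```

For the three rules above this reports PAX5 first with in-degree 11 and 2048 input combinations, which is why it dominates the running time.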

So you can easily have a network with hundreds of nodes and the conversion will still be fast if the maximum in-degree is reasonable, say around 5. On the other hand, the conversion will take very long even in a network of 15 nodes if one node depends on all nodes, itself included (in-degree = 15).

My advice for speeding things up would be to consider modelling nodes with high in-degree by helper nodes that buffer the high in-degrees. For example, maybe PAX5 depends on a compound of MXD1 and EBF1, so create a new node that models the presence of the compound MXD1-EBF1 and reduce the in-degree of PAX5 by one.
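As an illustration of the helper-node idea, here is a minimal sketch that splits the nine activators of PAX5 into two groups and checks by brute force that the split rule computes the same value as the original rule. The grouping and the helper names `PAX5_ACT_A` and `PAX5_ACT_B` are hypothetical, chosen only for this example, and the check covers the logic only: each helper node adds an update step, so the dynamics of the extended network are not identical to the original.

```python
from itertools import product

ACTIVATORS = ["TCF4", "SPIB", "PAX5", "IKZF3", "HHEX",
              "EBF1", "BCL11A", "BACH2", "ARID5B"]
INHIBITORS = ["MXD1", "XBP1"]

# Hypothetical rewrite in bnet syntax: the largest in-degree drops from 11 to 5.
HELPER_BNET = """
PAX5_ACT_A, TCF4 | SPIB | PAX5 | IKZF3 | HHEX
PAX5_ACT_B, EBF1 | BCL11A | BACH2 | ARID5B
PAX5,       (PAX5_ACT_A | PAX5_ACT_B) & !MXD1 & !XBP1
"""

def pax5_original(s):
    # DNF from the issue: each activator conjoined with both negated inhibitors.
    return any(s[a] and not s["MXD1"] and not s["XBP1"] for a in ACTIVATORS)

def pax5_with_helpers(s):
    # Helpers evaluated in place, i.e. assuming they have settled to their rule value.
    act_a = any(s[a] for a in ACTIVATORS[:5])
    act_b = any(s[a] for a in ACTIVATORS[5:])
    return (act_a or act_b) and not s["MXD1"] and not s["XBP1"]

# Compare both rules on all 2**11 input combinations.
names = ACTIVATORS + INHIBITORS
mismatches = 0
for values in product([False, True], repeat=len(names)):
    state = dict(zip(names, values))
    mismatches += pax5_original(state) != pax5_with_helpers(state)
print("mismatching input combinations:", mismatches)
```

Whether such a split is acceptable depends on the analysis you want to run, since the extra nodes change trajectories even though the rule logic is preserved.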

Other than that, you should of course save primes that took a long time to compute and load them from disk.
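A minimal caching sketch for this, assuming the primes object is a plain structure of dicts, lists, strings and integers that survives a JSON round trip. The function `cached_primes` and the cache directory name are hypothetical; `compute_primes` stands for whatever callable performs the conversion in your setup.

```python
import hashlib
import json
from pathlib import Path

def cached_primes(bnet_text, compute_primes, cache_dir="primes_cache"):
    """Return primes for bnet_text, computing them only on a cache miss."""
    # Key the cache on the rule text, so edited rules trigger a fresh conversion.
    key = hashlib.sha256(bnet_text.encode("utf-8")).hexdigest()[:16]
    path = Path(cache_dir) / f"{key}.json"
    if path.exists():
        return json.loads(path.read_text())
    primes = compute_primes(bnet_text)  # the slow step
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(primes))  # assumption: primes are JSON-serializable
    return primes
```

The first call with a given rule text pays the full conversion cost; later calls with the same text read the JSON file instead.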

Feb 06 '25 11:02 hklarner