pyvsc Random distribution of 2-variable constraints not uniform

[ Using pyvsc-0.7.6 ]

I've been curious about how uniform the random distributions for SV and pyvsc constraints are and cooked up some 2D examples to help visualize the distribution with heatmaps. The heatmaps show pyvsc making some interesting patterns rather than being flat and I was curious whether that was expected. The same constraints in SV appear to be fairly uniform. The heatmap goes from 0/black to the max value / white.

The constraints for a triangle, used in the examples below:

@vsc.randobj
class my_item_c(object):
    def __init__(self):
        self.a = vsc.rand_bit_t(32)
        self.b = vsc.rand_bit_t(32)

    # Triangle
    @vsc.constraint
    def ab_c(self):
        self.a >= 0
        self.b >= 0
        self.a <= 60 # Prevent overflow solutions
        self.b <= 60 # Prevent overflow solutions
        self.a + self.b <= 50

Triangle using pyvsc (100k randomizations, ranging from 14 to 831 hits per bucket): triangle_pyvsc_100k_180persec

Triangle using SV (100k randomizations, ranging from 49 to 105 hits per bucket): triangle_other_100k_116kpersec

Source for the above SV and pyvsc examples of a simple triangle constraint (x + y <= N). The dist.py script is pretty simple and has two functions to choose from for either a live animation of the randomizations or a performance measurement. The SV file and plot.py are just the same triangle and circle constraints I was testing out. https://github.com/alwilson/pyvsc_testing

Note: I've also found that SV constraint solving is very fast (100k simple 2-var randomizations in ~2s) and that pyvsc is much slower (~2000s), which I should say I'm not surprised at, although I'd love to know better why and the differences between the solvers/generators.

I also tried putting a "hole" in the triangle constraints, which you can see in my source, but chaining constraints with or/and doesn't seem to do the same thing as SV does. Not sure if it's a pyvsc bug or if there's a different why to write out those constraints.

Thanks for developing this project! I've been having fun with it. >:)

Aug 18 '22 01:08 alwilson

Hi @alwilson, thanks for putting the work into creating this way to look at 2d distributions. Thus far, I've primarily been looking at 1d distributions -- and that's difficult enough in text format given a reasonably-sized domain. I'm a bit surprised that the distribution is as skewed as it is, and will need to dig in a bit more to understand why. I find it very interesting that there appear to be hot spots (min a, max b) and (max a, min b) -- hopefully a clue that leads in a productive direction!

Again, thanks for putting together this way to visualize PyVSC distributions. Hopefully I'll shortly have some insights on how to improve the results!

Aug 18 '22 01:08 mballance

The great part about python is all the built in graphing and data crunching tools!

I've seen similar imbalances in my results, but they've been good enough for my uses up to this point.

Aug 18 '22 01:08 qzcx

Had some fun with histograms and thought I'd post them here, too. I'll upload my changes once I clean things up. A good excuse for playing around with matplotlib and numpy!

pyvsc (100k samples) pyvsc_100k

python random.randrange calls (100k samples) I rewrote the constraints as a conditional and used the random library to get something fast that's similar to what SV outputs to compare with. python_rand_100k

Aug 18 '22 05:08 alwilson

After reviewing the 'swizzling' implementation, I have a better understanding of the hot spots at (min,max) and (max,min). While the variables to swizzle in each solve set are randomly selected, the bits to force in each were being applied sequentially. This increased the probability that few of the bit-force constraints on the second variable would hold, allowing Boolector to repeatedly-select the same solution for the second variable. Release 0.7.7 randomizes the order in which bit-force constraints are applied. From a visual inspection of text output, this appears to provide a good improvement in multi-variable distribution. Please give the new release a try as you're able.

Aug 18 '22 17:08 mballance

That looks better! On the histogram it looks like it's a bit more spread out and the hotspots spread out and lowered. Still doesn't have that normal distribution that I think we would want. Although it's been a while since I've thought about probability...

pyvsc triangle (100k samples) pyvsc_triangle_100k_7_7

I did want to point out that in a 1M quarter circle constraint in SV the distribution does not appear to be normal. Or at least it tends to have hotspots as well. So I'm not sure what the gold standard for pyvsc should be. Adding multiplication constraints into the mix perhaps makes it harder to guarantee?

Aug 18 '22 19:08 alwilson

Hello :)

Any news on this issue ?

Jul 09 '24 04:07 minwooim