
New Pointer Analysis

xeren opened this issue 2 years ago · 29 comments

This PR started as a rework of FieldSensitiveAlias. It now features a third implementation, InclusionBasedPointerAnalysis.

  • It manages pointer modifiers of the type (int staticOffset, List<Integer> dynamicOffsets). This should make its precision robust against optimizations of the following form. (FieldSensitiveAndersen uses (int,int) and thus loses precision there)
      w = x + y
      address = w + z
      ↓
      address = x + y + z
  • It stores inclusions, loads, stores and pointer sets in a compact format, such that its memory usage scales better with large arrays.
  • Instead of a node for each register or allocated byte, this version manages a node for each complex expression, register writer and phi node. This means this version is not only field-sensitive, but also flow-sensitive (with the exception of assumptions). It now requires a prior Dependency Analysis.
  • It implements a unification rule: If a variable has only one include edge, it gets merged into that other variable. This enhances the mustAlias queries.
  • It implements an acceleration rule: If a variable directly or indirectly includes itself with a static offset of c != 0, then c is promoted to a dynamic offset. Such cycles propagate at most once in total, instead of at most once per allocated byte. This should also allow the algorithm to terminate even if the program has arrays of unknown/nondeterministic size (if such a feature is ever added in the future).
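To make the first bullet concrete, here is a minimal sketch of such a modifier and its composition. All names are illustrative and do not match the actual Dat3M classes:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of a pointer modifier: a static offset plus a list of
// dynamic offsets (strides). Names do not match the actual Dat3M classes.
final class Modifier {
    final int staticOffset;       // the constant part
    final List<Integer> dynamic;  // strides of the dynamic part

    Modifier(int staticOffset, List<Integer> dynamic) {
        this.staticOffset = staticOffset;
        this.dynamic = List.copyOf(dynamic);
    }

    // Sequential composition: static parts add up, dynamic strides accumulate.
    // Since addition is associative, "w = x + y; address = w + z" and
    // "address = x + y + z" compose to the same modifier, which is why the
    // precision survives the reassociation shown above.
    Modifier compose(Modifier other) {
        List<Integer> d = new ArrayList<>(dynamic);
        d.addAll(other.dynamic);
        return new Modifier(staticOffset + other.staticOffset, d);
    }
}
```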

xeren avatar Feb 15 '24 17:02 xeren

Does the first bullet point mean the new analysis is always more precise than the old ones? Do we have any concrete example where we get noticeably better performance?

hernanponcedeleon avatar Feb 15 '24 19:02 hernanponcedeleon

@xeren showed me improved results on EBR with a substantial speed-up.

Btw. I think this PR does not set the new analysis as default, so the CI still runs with the old one. You should make the new analysis the default to catch any problems.

ThomasHaas avatar Feb 15 '24 22:02 ThomasHaas

@ThomasHaas would it make sense to add EBR to our CI? IIRC it is supposed to be buggy wrt IMM but correct under hardware models

hernanponcedeleon avatar Feb 17 '24 18:02 hernanponcedeleon

@ThomasHaas would it make sense to add EBR to our CI? IIRC it is supposed to be buggy wrt IMM but correct under hardware models

I don't know how much time it takes with Yices2. With Z3, running EBR under multiple models is expensive.

ThomasHaas avatar Feb 17 '24 23:02 ThomasHaas

I have just tested this analysis on RCU tree and with B=1 it terminates in 2 seconds and with B=2 in 20 seconds. The old analysis took multiple hours IIRC. However, it computes quite a few empty pointer sets. I also cannot tell how precise it is, because the fast termination may be due to imprecision from RCU keeping the whole memory in arrays.

ThomasHaas avatar Feb 18 '24 10:02 ThomasHaas

It seems that when I rebased the branch, I overwrote the default alias type. It should be fixed by now.

BTW, as Thomas said, the new alias analysis performs better than the old one on the benchmarks added in #621; here are the statistics

[20.02.2024] 11:56:47 [INFO] AliasAnalysis.fromConfig - Selected alias analysis: FULL
======== RelationAnalysis summary ======== 
        #Relations: 104
        #Axioms: 4
        #may-edges removed (extended): 86907
        #must-edges added (extended): 156
        total #must|may|exclusive edges: 1689106|2239829|0
        #must|may rf edges: 4|18859
        #must|may co edges: 474|486
===========================================

[20.02.2024] 11:58:00 [INFO] AliasAnalysis.fromConfig - Selected alias analysis: FIELD_SENSITIVE
======== RelationAnalysis summary ======== 
        #Relations: 104
        #Axioms: 4
        #may-edges removed (extended): 94306
        #must-edges added (extended): 156
        total #must|may|exclusive edges: 1686959|2287321|0
        #must|may rf edges: 4|21522
        #must|may co edges: 474|486
===========================================

hernanponcedeleon avatar Feb 20 '24 11:02 hernanponcedeleon

It seems that when I rebased the branch, I overwrote the default alias type. It should be fixed by now.

I think the default you think about was not changed by @xeren. We have two defaults and the one you changed was for the UI only I think.

ThomasHaas avatar Feb 20 '24 11:02 ThomasHaas

It seems you are right (even though I'm pretty sure I saw field_sensitive in the log when I called dartagnan from the console with no alias options).

We should be able to use Alias.getDefault() similarly to what we do with method, target, ... right?

hernanponcedeleon avatar Feb 20 '24 15:02 hernanponcedeleon

We should be able to use Alias.getDefault() similarly to what we do with method, target, ... right?

I think so. Just change the initializer in AliasAnalysis.Config to call Alias.getDefault().

ThomasHaas avatar Feb 21 '24 10:02 ThomasHaas

I am testing this branch on some new benchmarks and I get a bunch of warnings like this

[04.03.2024] 00:14:56 [WARN] InclusionBasedPointerAnalysis.postProcess - empty pointer set for bv64 117:r24 = load(*bv64 117:r23)

How should I interpret this, @xeren? It sounds like this load cannot read from anywhere.

hernanponcedeleon avatar Mar 03 '24 23:03 hernanponcedeleon

Yes, this means there are no possible targets to read from. It could be a bug in the analysis, but it could also be a correct result. For example, this warning also appeared in tree.c, but it turned out that the code does indeed access a NULL pointer.

ThomasHaas avatar Mar 03 '24 23:03 ThomasHaas

Yes, this means there are no possible targets to read from. It could be a bug in the analysis, but it could also be a correct result. For example, this warning also appeared in tree.c, but it turned out that the code does indeed access a NULL pointer.

But then this is a bug in the code being analyzed ...

hernanponcedeleon avatar Mar 04 '24 00:03 hernanponcedeleon

Yes, this means there are no possible targets to read from. It could be a bug in the analysis, but it could also be a correct result. For example, this warning also appeared in tree.c, but it turned out that the code does indeed access a NULL pointer.

But then this is a bug in the code being analyzed ...

Yes, it might be if that memory access is reachable. @xeren Can you add source location information into the warning so we can more easily check what is wrong?

ThomasHaas avatar Mar 04 '24 08:03 ThomasHaas

@xeren any progress about this PR? I would like to get this merged soon.

hernanponcedeleon avatar Mar 14 '24 13:03 hernanponcedeleon

I implemented the flattening pass (*) that I mentioned a while ago on top of this branch (see alias-flat) and something seems fishy to me.

The alias analysis produces more precise results when flattening is used (and thus it needs to handle more complex expressions; I would have expected it to lose precision in some cases) than when using the "default" IR

java -jar dartagnan/target/dartagnan.jar cat/aarch64.cat --target=arm8 dartagnan/src/test/resources/libvsync/bounded_mpmc_check_empty-opt.ll --bound=1 --program.processing.flattening=true
======== RelationAnalysis summary ======== 
        #Relations: 16
        #Axioms: 7
        #may-edges removed (extended): 0
        #must-edges added (extended): 0
        total #must|may|exclusive edges: 81839|82970|0
        #must|may rf edges: 4|308
        #must|may co edges: 40|178
===========================================
java -jar dartagnan/target/dartagnan.jar cat/aarch64.cat --target=arm8 dartagnan/src/test/resources/libvsync/bounded_mpmc_check_empty-opt.ll --bound=1 --program.processing.flattening=false
======== RelationAnalysis summary ======== 
        #Relations: 16
        #Axioms: 7
        #may-edges removed (extended): 0
        #must-edges added (extended): 0
        total #must|may|exclusive edges: 93322|94770|0
        #must|may rf edges: 4|318
        #must|may co edges: 40|196
===========================================

(*) the pass replaces registers in reg-readers by the expression that register carries within the block (the pass could be more precise by preserving info between blocks, but this is irrelevant ATM).

hernanponcedeleon avatar Mar 25 '24 00:03 hernanponcedeleon

The alias analysis produces more precise results when flattening is used (and thus it needs to handle more complex expressions; I would have expected it to lose precision in some cases) than when using the "default" IR

Your flattening pass is wrong though. Consider

r1 = r2;
r2 = ...;
store(&x, r1 + r2);
// ---- After flattening ----
r1 = r2;
r2 = ...;
store(&x, r2 + ...); // Wrong

You cannot propagate expressions over register assignments that affect any of the registers contained in the expression.

I have not checked if this case appears in your benchmark example, but for now I would assume your pass is the culprit. Btw. you should also reset the propagation map for every function (or just make it a local variable), because theoretically it tries to propagate expressions across different functions (this should never happen due to disjoint registers, but still).
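For illustration, here is a minimal sketch of such an invalidation rule. Expressions are modeled only as "which registers they mention"; all names are hypothetical and this is not the actual pass:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Sketch of per-block expression propagation with the invalidation rule
// described above: when a register is (re)assigned, every cached expression
// that mentions that register must be dropped, otherwise later reads would
// see the *new* value of that register.
final class FlatteningSketch {
    record Expr(String text, Set<String> regs) {}

    final Map<String, Expr> carried = new HashMap<>();

    void assign(String target, Expr rhs) {
        // Drop every cached expression that mentions the overwritten register.
        carried.values().removeIf(e -> e.regs().contains(target));
        carried.put(target, rhs);
    }

    boolean canPropagate(String reg) {
        return carried.containsKey(reg);
    }
}
```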

EDIT: Even if I fix your flattening pass, the discrepancy is still there. So there is indeed something fishy.

ThomasHaas avatar Mar 25 '24 09:03 ThomasHaas

Can you push your fix so @xeren can take a look?

hernanponcedeleon avatar Mar 25 '24 13:03 hernanponcedeleon

Brief: This analysis is not robust against general flattening. However, it should be robust against flattening of addition.

I looked into your example. It features flattening of an expression r4 = r0 * 2 into r1 = r4 * 8. This analysis is not capable of representing that the variable for r4 only receives multiples of 2. Thus, without flattening, any event accessing address r1 will be assumed to access multiples of 8. When flattening is enabled, the analysis can deduce from r1 = (r0 * 2) * 8, that its result will always be a multiple of 16. Hence, flattening can still improve the alias results.

xeren avatar Mar 25 '24 15:03 xeren

@xeren I still want to have some documentation explaining how the algorithm works, so I can understand it and properly review it.

hernanponcedeleon avatar Mar 25 '24 17:03 hernanponcedeleon

Based on your initial description and my understanding of the algo, let me suggest another (hopefully better) description:

The analysis constructs a directed inclusion graph over expressions of the program.
The nodes abstractly represent the values an expression (at a program location) could take over all executions.
There is a node (also called variable in the code) for each memory object (addr), two for each Local (result and rhs), Load (result and addr), and Store (addr and value).
These nodes are not just per expression but also per program location to achieve an SSA-like form (*).

Edges between two nodes A and B describe one of three relationships:
	(1) An edge "A -f_includes-> B" means that "f(A) \subseteq B" holds (where A/B are understood as sets of values). 
	    Rather than arbitrary functions "f", we use multi-linear expressions "f(A) = A + k1*x1 + k2*x2 + ... + kn*xn + o" where ki and o are constants.
	    For simplicity, we write "k*x + o" where k=(k1,...,kn) and x=(x1,...,xn) are understood as vectors, and drop the _includes suffix (i.e. write A -f-> B).
	    The (sub)expression "k*x + o" describes the set of values {k*x + o | x \in Z^n} that can be obtained by varying x freely (**).
	(2) An edge "A -stores-> B" means that there exists a store that stores at address-expression A the value-expression B.
	(3) An edge "A -loads-> R" means that there exists a load that loads from address-expression A and puts its result into R (R is the node of a register).

Initially, the graph has the following edges
	- For every "store(A, V)" event, it has a "A -stores-> V" edge
	- For every "R = load(A)" event, it has a "A -loads-> R" edge.
	- For every assignment "R := B + k*x + o", we have a "B -(k*x+o)-> R" edge
	  For assignments that are too complex to match this structure, say "R := h(B1, B2, .., Bn)" where h is complex, we introduce edges "Bi -1x+0-> R".

On this initial graph, two rules are applied until saturation
(1) Transitivity: A -f_includes-> B AND B -g_includes-> C IMPLIES A -fg_includes-> C (notice that we need to compose the labels as well).
(2) Communication: if A -loads-> R and B -stores-> V and A and B can overlap (i.e. the corresponding Load/Store events can alias), then V -0x+0-> R (all values in V are included in R)

It can happen that these rules generate an edge A -f-> B although another edge A -g-> B already exists. 
In this case, the edges get merged by a join operation.
To guarantee termination, the saturation of cyclic inclusion relationships is accelerated:
E.g. if A includes B (B -0x+0-> A) and B includes A+2 (A -0x+2-> B)
then A includes A+2x, B includes A+2x+2 and both include B+2x.

(*) We also generate special phi nodes that can improve precision on code that is not in SSA form.
(**) In the implementation we make an unsound assumption that x is non-negative. 
     This disallows negative (dynamic) indexing into arrays/pointers but gives additional precision in other cases.
     Note that "o" can be negative which is sufficient to support "containerof" functionality.
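For illustration, the acceleration step from the last paragraph could be sketched as follows (assuming a positive static offset c; the names are hypothetical, not the actual code):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the acceleration rule: a self-inclusion "A -(0x + c)-> A" with
// c != 0 is replaced by "A -(cx + 0)-> A", i.e. the static offset c is
// promoted to a dynamic stride, so the cycle saturates in one step instead
// of propagating once per allocated byte. Assumes c > 0.
final class Acceleration {
    record Label(int staticOffset, List<Integer> strides) {}

    static Label accelerate(Label selfLoop) {
        if (selfLoop.staticOffset() == 0) {
            return selfLoop; // nothing to promote
        }
        List<Integer> strides = new ArrayList<>(selfLoop.strides());
        strides.add(selfLoop.staticOffset());
        return new Label(0, strides);
    }
}
```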

@xeren Is the text I wrote correct? @hernanponcedeleon Is this text understandable to you?

ThomasHaas avatar Mar 28 '24 09:03 ThomasHaas

The text is great @ThomasHaas.

I still need to check that the implementation follows this, but the idea of the algorithm is certainly clear to me now.

hernanponcedeleon avatar Mar 28 '24 10:03 hernanponcedeleon

Your description looks fine to me. Assignments can also have multi-linear expressions.

The underlying algorithm has much in common with that of Sui and Lei: Fast and Precise Handling of Positive Weight Cycles for Field-sensitive Pointer Analysis (SAS'19, DOI). Some exceptions:

  • There are no nodes for memory cells. Instead, if a store may alias with a load, their values are directly connected with an include-edge.
  • The pointer sets are embedded inside the graph. There are nodes for base addresses and the graph is transitively-closed. To get the pointer set of a variable, look at its in-set variables representing base addresses.
  • Offset uses $\langle o,0,\lbrace 1\rbrace\rangle$ instead of $\langle o,0,\emptyset\rangle$ to denote an entire object $o$, and $\langle o,i,\emptyset\rangle$ instead of $\langle o,i,\lbrace 0\rbrace\rangle$ for a single field $o.f_i$. Also, we allow negative offsets, since we sometimes get such programs (container_of).

xeren avatar Mar 28 '24 12:03 xeren

Your description looks fine to me. Assignments can also have multi-linear expressions.

Yes, that's why I have the part about k*x being understood as a vector.

The underlying algorithm has much in common with that of Sui and Lei: Fast and Precise Handling of Positive Weight Cycles for Field-sensitive Pointer Analysis (SAS'19, DOI).

I looked for the Datalog-based analysis I told you about yesterday. Its name was not ccanalyzer but cclyzer, from Structure-Sensitive Points-To Analysis for C and C++. There is a newer version for C++ on GitHub (link). They also have an online documentation. Interestingly, this also computes the call-graph on the fly.

ThomasHaas avatar Mar 28 '24 12:03 ThomasHaas

Your description looks fine to me. Assignments can also have multi-linear expressions.

Can you then use this description for the code documentation?

hernanponcedeleon avatar Mar 28 '24 13:03 hernanponcedeleon

After looking into the code, I found the different use cases of Offset<...> very confusing, especially why Offset<Variable> is used instead of Variable (i.e., the offset itself acts as a variable). I believe this is done due to the following observation: if A -f_includes-> B is the only incoming includes-edge into B, then you not only know f(A) \subseteq B but even f(A) = B, because you consider least solutions to the constraint graph. In those cases you can avoid the variable B and use f(A) instead (i.e. use Offset<Variable> instead of Variable). I guess this also improves precision, because if you kept B separately and computed the transitive closure of the graph, you would likely end up with many incoming include-edges and thus forget that f(A)=B holds (because you forget that all those edges are derived from a common "bottleneck" through A). At the same time, Offset<Variable> is also used to represent an edge labeling plus either the source or the target of that edge (and Offset<Offset<...>> was even used to represent a tuple!). This overloading of use cases makes the algorithm hard to follow.

Is there a way to make the different use cases more apparent? Maybe by using different classes, or using interfaces or something? Maybe one could have Offset (non-generic) just describe the edge labeling and Variable have two fields (Variable, Offset), where you either have (this, zero) for "pure variables" or (otherVar, offset) for "derived variables". Alternatively, you could use a separate class DerivedVariable which is a tuple (Variable, Offset). If you want to keep Offset<T> generic, you could also try to make the class abstract (rather than a record) and have DerivedVariable extends Offset<Variable> so that derived variables keep the same functionality but look differently (mimicking a typedef).
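A minimal sketch of the DerivedVariable alternative suggested above (purely illustrative; none of these names are the actual code):

```java
import java.util.List;

// Illustrative shape for the suggested refactoring: Offset is the
// (non-generic) edge labeling, and a DerivedVariable pairs a base variable
// with an offset. A "pure" variable is then simply (this, ZERO).
final class PointerDomain {
    record Offset(int staticOffset, List<Integer> dynamic) {
        static final Offset ZERO = new Offset(0, List.of());
    }

    static final class Variable {
        // A pure variable derives from itself with the zero offset.
        DerivedVariable asDerived() {
            return new DerivedVariable(this, Offset.ZERO);
        }
    }

    record DerivedVariable(Variable base, Offset offset) {}
}
```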

ThomasHaas avatar Mar 29 '24 11:03 ThomasHaas

I feel like your newest change has just shifted the variable vs. edge problem, no? Previously, you used edges that sometimes represented variables. Now you use variables that sometimes represent edges.

ThomasHaas avatar Apr 04 '24 10:04 ThomasHaas

I fixed a bug in the analysis, where the communication rule was not correctly implemented.

The rule states that from $(X+o_1\xleftarrow{\text{stores}}W+o_2)$, $(W\supseteq A+o_3)$, $(A+o_4\subseteq R)$, $(R+o_5\xrightarrow{\text{loads}}Y)$ and $((o_2+o_3)\cap(o_4+o_5)\ne\emptyset)$, you can derive $X+o_1\subseteq Y$.

Either $o_2$ or $o_5$ was skipped in the intersection test (depending on where the recently-learned inclusion edge is), resulting in both false positives and false negatives.
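To illustrate the fix, here is a hedged sketch of the corrected intersection test, with both o2 and o5 feeding into the overlap check. The offset representation and helper names are assumptions, not the actual implementation:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the fixed communication check: the store side contributes o2 + o3
// and the load side o4 + o5. The bug was that one of o2/o5 was dropped from
// the intersection test. Offsets are modeled as a static offset plus strides.
final class CommunicationCheck {
    record Offset(int staticOffset, List<Integer> strides) {
        Offset plus(Offset other) {
            List<Integer> s = new ArrayList<>(strides);
            s.addAll(other.strides());
            return new Offset(staticOffset + other.staticOffset(), s);
        }
    }

    // Conservative overlap test: without strides, the value sets are the
    // singletons {staticOffset}, which overlap iff the offsets are equal;
    // with any stride present we over-approximate and report a possible overlap.
    static boolean mayOverlap(Offset storeSide, Offset loadSide) {
        if (storeSide.strides().isEmpty() && loadSide.strides().isEmpty()) {
            return storeSide.staticOffset() == loadSide.staticOffset();
        }
        return true;
    }

    static boolean communicate(Offset o2, Offset o3, Offset o4, Offset o5) {
        return mayOverlap(o2.plus(o3), o4.plus(o5));
    }
}
```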

I feel like your newest change has just shifted the variable vs. edge problem, no? Previously, you used edges that sometimes represented variables. Now you use variables that sometimes represent edges.

Maybe you were referring to inclusion edges being typed as DerivedVariable. I have now separated both use cases into separate classes.

xeren avatar Apr 04 '24 15:04 xeren

I fixed a bug in the analysis, where the communication rule was not correctly implemented.

The rule states that from (X+o1 ←stores- W+o2), (W ⊇ A+o3), (A+o4 ⊆ R), (R+o5 -loads→ Y) and ((o2+o3) ∩ (o4+o5) ≠ ∅), you can derive X+o1 ⊆ Y.

Either o2 or o5 was skipped in the intersection test (depending on where the recently-learned inclusion edge is), resulting in both false positives and false negatives.

Do you know of an instance of wrong behaviour? For example, does this affect qspinlock or ck_epoch or RCU?

I feel like your newest change has just shifted the variable vs. edge problem, no? Previously, you used edges that sometimes represented variables. Now you use variables that sometimes represent edges.

Maybe you were referring to inclusion edges being typed as DerivedVariable. I have now separated both use cases into separate classes.

Yes, that is what I was referring to.

ThomasHaas avatar Apr 04 '24 15:04 ThomasHaas

EDIT: I was wrong. The code uses join for sequential composition and addInto for parallel composition. The latter does not merge edges, which contradicts the class description/comment, so the description should be updated.

~I think you are conflating parallel joining and sequential joining (composition) in your algorithm. These are two different things with different results and it is just accidental that your algo remains correct because you always join parallel edges where one of them is composed.~

~For example, parallel joining 2x + 2 and 4x + 2 should result in 2x + 2, i.e., the gcd is used for the dynamic offset and the offsets are not added. Sequential joining on the other hand would give 2x + 4y + 4 (or 2x + 4 if you combine multiple offsets) because the modifiers just get added. In the case where you would know that both dynamic offsets are the same variable, you would even get 6x + 4~

ThomasHaas avatar Apr 08 '24 09:04 ThomasHaas

Did you intentionally revert your recent changes? If so, you also reverted the changes to the handling of unary expressions, so they are still wrong.

ThomasHaas avatar Apr 27 '24 09:04 ThomasHaas