Underspecified behaviour when evaluating Composite Literals

Open jcp19 opened this issue 4 years ago • 1 comments

We are not taking into account that the order of evaluation of the components of composite literals is pretty much undefined, apart from function calls, as mentioned here. Taking an example from the go documentation,

a := 1
f := func() int { a++; return a }
x := []int{a, f()}            // x may be [1, 2] or [2, 2]: evaluation order between a and f() is not specified
m := map[int]int{a: 1, a: 2}  // m may be {2: 1} or {2: 2}: evaluation order between the two map assignments is not specified
n := map[int]int{a: f()}      // n may be {2: 3} or {3: 3}: evaluation order between the key and the value is not specified

Right now, Gobra evaluates expressions in the order they occur in a composite literal. To address this undefined behaviour, we would either need an exponential encoding to account for all execution orders, or more reasonably, reject expressions that contain side effects when they occur inside a composite literal

May 06 '21 11:05 jcp19

This part of the language specifications seems relatively new to me. Back when I added literals, I did not find that the semantics for composite literals are specified so I treated them the same way as calls. I got the same results when checking with the Go compiler. I guess the order of evaluation can change depending on the content of the registers.

There are several solutions:

The easiest one, and the one that we should do for now, is to forbid expressions with side-effects as parameters for composite literals. We already have the check whether something is a side-effect free expression, so this could be implented within minutes.
The hardest one is to actually generate all possible execution orders. With a naive approach, there is a crazy amount of possible orders. Consider a literal with n arguments. All execution orders should be the same number as the number of permutations, i.e. n!. We can reduce this number by distinguishing between expressions with and without side-effect. If our n arguments consist of a expressions with and b expressions without side-effect, then the number of execution order can be expressed with the functions f(a,b) = sum_{i = 0}^{b}{ (b choose i) * a * f(a-1,b-i) } and f(0,b) = 1. Do not ask me what the closed form of that is, but for a == n it is n!, for a == 0 it is 1. and for a == 1 it is 2^b. An exponential number of execution orders is still too big, meaning that a more fine grained distinction is necessary. We could further split the side-effect free expressions into heap-dependent and heap-independent expressions. The heap-independent expressions do not influence the number of execution orders that we have to consider, meaning that for a expressions with side-effects and h heap-dependent expressions we have f(a,h) number of exection orders to consider. To reduce the number further, we have to analyse which effects can impact which arguments.

EDIT: I was a bit blind with the math. The function is of course just f(a,b) = a!(a+1)^b. This is quite intuitive and also follows rather easily by applying the binomial law.

May 06 '21 12:05 Felalolf