Type of a function with flexible array size outputs
For typing purposes, what is the type of a function with flexible array size outputs? In particular, would such a function be type compatible with non-flexible arrays? I.e. should something like this be allowed:
function f
  input Integer x[:, :];
  input Integer n;
  output Integer y[:, :];
algorithm
  if n == 1 then
    y := ones(size(x, 1), size(x, 2));
  else
    y := ones(size(x, 2), size(x, 1));
  end if;
end f;

model M
  Integer x[:, :] = {{1, 2, 3}, {4, 5, 6}};
  Integer y[:, :] = f(x, 2); // size of y is determined by function call
end M;
This particular example does work in OpenModelica, but it doesn't seem sane to require evaluating the function call just to determine the size of a variable. The specification doesn't seem to explicitly disallow this though.
10.1 only says that the size of a variable with : dimensions can be determined from its binding equation, but doesn't go into any detail on how. It also says "The size cannot be determined from other equations or algorithms", which is kind of what happens here but only indirectly.
To me it seems like a flexible array should only be type compatible with another flexible array, such that such a function may only be called inside another function or used as a function argument.
This was also briefly brought up in #2164, but I thought it might be better with a dedicated issue for it.
I support the idea that the size of the output necessary for dimensions should be determinable without the need to evaluate the function.
No, no, no, no, no! Such a change will break existing applications; there are a few cases in my mind where unknown output sizes are used:
1. Modelica functions used for scripting.
2. We have the Model Management library, which provides means to query/analyse models, for example, to find all experiments within a package, find annotations etc. This is used to write scripts, for example, to simulate all experiments within a package etc.
3. The requirements modeling library of EDF very likely uses such algorithms to, for example, collect all condition results for statistics.
4. Optimization libraries likely use this; for example, find all entries (their indexes) within a matrix that are above a certain threshold, construct a sparse representation etc.
5. Just check for applications of the builtin operator `cat`. You will likely hit cases where the `cat` is guarded by an `if` or sits inside a `for` in algorithms. These are all cases where some multidimensional array is constructed; if it contributes to the outputs, voilà.
These are just a few use cases that come to my mind immediately. I guess there are some more subtle issues when we restrict/change.
Regarding the question of statically known sizes -- which we want -- consider that when the function is used in a model, the call site also determines the size through the expected size of the output. For example, when you have y = foo(), we know which dimensionality y has according to its declaration; hence, we can derive which size the output of foo must have.
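A minimal sketch of the call-site argument, assuming a hypothetical function `positives` and model `CallSiteFixesSize` (neither is taken from an existing library). The declaration of `y` fixes its size, so the compiler does not need to evaluate the call to know it; the call only has to produce a matching size at run time.

```modelica
// Hypothetical function: the output size depends on the values in x.
function positives
  input Real x[:];
  output Real xpos[:];
algorithm
  for i in 1:size(x, 1) loop
    if x[i] > 0 then
      xpos := cat(1, xpos, x[i:i]);
    end if;
  end for;
end positives;

model CallSiteFixesSize
  parameter Real x[4] = {-1, 2, -3, 4};
  // The declaration fixes size(y, 1) = 2; the function result must match it.
  Real y[2] = positives(x);
end CallSiteFixesSize;
```

This differs from the opening example, where `y` is declared with `:` and the size has to come from the call itself.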
Are any of these cases used to determine the dimension size(s) of a variable in a model though? Having unknown output sizes is not in itself an issue, it's only an issue if the compiler is required to evaluate such a function during compilation in order to determine the size of a variable in a model.
"For example when you have y = foo(), we know which dimensionality y has according to its declaration; hence, we can derive which size the output of foo must have."
That's exactly the point. We don't know the size of y because it is declared in terms of foo(). Do note the :.
While we state that array dimensions shall be evaluable (https://github.com/modelica/ModelicaSpecification/blob/810534df5d61d9234141091100d168dca68be3a3/chapters/classes.tex#L181), I think we're missing something in the definition of evaluable expression: https://specification.modelica.org/master/operators-and-expressions.html#evaluable-expressions
How about adding the following sort of item for each of the expression variabilities?
- size(functionCall(…), j) is an evaluable/parameter/… expression if the size of the j:th dimension of the (first) output variable has no other dependency on input variables than their sizes, and the argument expressions of the function call have evaluable/parameter/… sizes for the relevant dimensions.
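A sketch of the kind of call this item is meant to cover, assuming a hypothetical function `scaleRows` and model `SizeFromInputSizes`. The output dimensions are declared purely in terms of input sizes, so the size of the call would be known even though the argument values vary with time.

```modelica
// Hypothetical function: output size depends only on the sizes of the inputs.
function scaleRows
  input Real x[:, :];
  input Real k;
  output Real y[size(x, 1), size(x, 2)];
algorithm
  y := k * x;
end scaleRows;

model SizeFromInputSizes
  Real a[2, 3] = {{time, 1.0, 2.0}, {3.0, 4.0, 5.0}};
  // a is time-varying, but size(a, 1) and size(a, 2) are evaluable, so under
  // the proposed item size(scaleRows(a, time), j) would be evaluable as well.
  Real b[:, :] = scaleRows(a, time);
end SizeFromInputSizes;
```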
Example 3, the requirements modeling library of EDF, is likely used in models. If I remember right, they check the requirements during simulation (it is not a preprocessing step); and as mentioned before, to generate nice logs while simulating they likely collect requirements in some kind of vector of size [:] that is defined by the collection function. But I am really just guessing here; I have not investigated it in a long time.
I mentioned it before, but I think a good starting point for finding current applications is to search for cat operator applications in models, because these are good hints that some multidimensional array is built up according to some runtime logic (i.e., runtime conditions). Hopefully all such cat applications are in "interactive" functions and not used in models.
Take the example for output specified by [:] from the specification:
function collectPositive
  input Real x[:];
  output Real xpos[:];
algorithm
  for i in 1 : size(x, 1) loop
    if x[i] > 0 then
      xpos := cat(1, xpos, x[i:i]);
    end if;
  end for;
end collectPositive;
The size of the output depends on how many positive values are in the input. Depending on what is in the input the size may change during the simulation. We do have in Chapter 10:
The number of dimensions of an array is fixed and cannot be changed at run-time.
But if collectPositive is allowed to be used outside of another function call, one can write:
model M
  Real x[5] = {time-0.1*i for i in 1:5};
  Real y[:] = collectPositive(x);
end M;
@christoff-buerger can you give a specific example where an output size is truly given as [:] (not as size(<of input>)) and is used outside of a function?
We will have to investigate this - the use of such an array in other functions used in the model (and also interactively) is clearly important. The evaluated call of a function with literal inputs (where evaluating the call yields the size) seems less important, and I haven't seen any dynamic size being directly used for model variables (yet).
That the dimension of y must be evaluable according to the specification makes this model invalid. However, the problem isn't related to being inside of a function or not, as illustrated by this small variation:
model M
  Real x[5] = {time-0.1*i for i in 1:5};
  Real y = sum(collectPositive(x));
end M;
"That the dimension of y must be evaluable according to the specification makes this model invalid."
This is exactly what we are trying to establish in this issue: are functions with flexible array outputs evaluable or not? What we have in the specification right now about evaluable expressions (3.8.3) is:
- Except for the special built-in operators initial, terminal, der, edge, change, sample, and pre, a function or operator with evaluable subexpressions is an evaluable expression.
- The sub-expression end used in A[... end ...] if A is a variable declared in a non-function class.
- Some function calls are evaluable expressions even if the arguments are not:
  - cardinality(c), see restrictions for use in section 3.7.4.3.
  - size(A) (including size(A, j) where j is an evaluable expression) if A is variable declared in a non-function class.
  - Connections.isRoot(A.R)
  - Connections.rooted(A.R)
In the example in the opening comment of this issue:
function f
  input Integer x[:, :];
  input Integer n;
  output Integer y[:, :];
algorithm
  if n == 1 then
    y := ones(size(x, 1), size(x, 2));
  else
    y := ones(size(x, 2), size(x, 1));
  end if;
end f;

model M
  Integer x[:, :] = {{1, 2, 3}, {4, 5, 6}};
  Integer y[:, :] = f(x, 2); // size of y is determined by function call
end M;
On the one hand, the argument x of f is not evaluable. On the other hand, the sizes of y are determined by calls to size, which are evaluable, and those calls sit inside an if statement whose condition is evaluable for the given argument n. So the size of f(x, 2) looks evaluable, but this can only be established by walking through the body of the algorithm.
We have many examples of functions where the arguments are not evaluable but the output dimensions are fixed in relation to the sizes of the non-evaluable arguments or to some evaluable arguments. So, is the size of f evaluable? Can the size of any function with a flexible output array be considered evaluable if the arguments are not evaluable? Should we require that we walk not only through the declarations but also through the algorithm to establish evaluability?
"However, the problem isn't related to being inside of a function or not, as illustrated by this small variation:"
model M
  Real x[5] = {time-0.1*i for i in 1:5};
  Real y = sum(collectPositive(x));
end M;
@henrikt-ma what is the problem with this example? It looks good to me.
What I was referring to as a problem was the fact that collectPositive can be used outside of a function. I agree that this shouldn't actually be a problem.
Should we require that we walk not only through the declarations but also through the algorithm to establish evaluability?
Can't we simply say that the size of a function call is not evaluable when the output array is declared flexible?
One special case is if the function call itself is evaluable.
True. To avoid too much added complexity in the specification, I think we need to try to avoid introducing a formalization of expression with evaluable size (that could be used instead of a variable declared in a non-function class in the expression variability definition). Instead, I think that we should try to get away with only this simple definition:
- The size of an expression expr (along dimension j) is said to be evaluable if size(expr) (or size(expr, j)) is an evaluable expression.
This would mean that we don't support a more intelligent analysis of function definitions, where it could have been seen that the size is evaluable even though the value of the expression is not. This could be illustrated with an example or a non-normative comment.
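One possible illustration, assuming a hypothetical function `reversed` and model `SimpleRule`: an analysis of the function body would show that the output always has the size of the input, but the simple definition above does not look into the body.

```modelica
// Hypothetical function: the output is flexible by declaration,
// although its size always equals size(x, 1).
function reversed
  input Real x[:];
  output Real y[:];
algorithm
  y := x[end:-1:1];
end reversed;

model SimpleRule
  Real x[3] = {time, 2 * time, 3 * time};
  // No declared dimension depends on the call, so this is unaffected.
  Real s = sum(reversed(x));
  // Under the simple definition, size(reversed(x), 1) is not evaluable,
  // because reversed(x) itself is not an evaluable expression:
  // Real z[:] = reversed(x);
end SimpleRule;
```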
Was adding this (will revisit with new comment): The basic idea would be that we add:
- size(foo(x), j) is evaluable if:
- j is evaluable
- the first output is not declared as a flexible array
- the size expression for the array dimension of the first output of foo(x) is evaluable given the inputs (and )
Note that if foo(x) is evaluable, it falls under another item (even if the array dimension is flexible).
Consider:
function foo
  input Integer x[:];
  input Integer n;
  input Real z[:];
  output Real y[n+size(x,1)];
end foo;
Then size(foo(a, m, b), 1) is only evaluable if size(a,1)+m is evaluable. This indicates that the significant parts are:
- Some inputs may not be relevant.
- For some inputs we only need the size to be evaluable, not the input itself.
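A sketch of a call that exercises both points, assuming a hypothetical `fooSketch` (foo with an invented body so that the example is self-contained) and a hypothetical model `UseFooSketch`. Whether a tool accepts the declaration of `y` depends on how the rule is finally worded.

```modelica
// Hypothetical variant of foo with a body added for illustration.
function fooSketch
  input Integer x[:];
  input Integer n;
  input Real z[:];
  output Real y[n + size(x, 1)];
algorithm
  y := fill(sum(z), n + size(x, 1));
end fooSketch;

model UseFooSketch
  constant Integer m = 2;                // the value of m must be evaluable
  Integer a[3] = {integer(time), 0, 0};  // values vary; only size(a, 1) = 3 matters
  Real b[2] = {time, 1.0};               // not relevant for the output size
  Real y[:] = fooSketch(a, m, b);        // expected size: m + size(a, 1) = 5
end UseFooSketch;
```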
I also started elaborating something like this, and then came to the conclusion that it would add too much complexity. Maybe we're forced to do it in the end, but before we start I think we need to see some compelling use cases.