Unboxed float arguments

Open melsman opened this issue 4 years ago • 0 comments

This PR allows reals to be passed to functions unboxed in xmm registers. Any number of unboxed floats can be passed to a function: the first 8 go in xmm registers (xmm0-xmm7) and the remaining ones go on the stack.
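The placement rule above can be sketched as a small model. This is only an illustration: the name `assign_float_args` and the `stack[i]` labels are hypothetical, and the PR text does not say how stack slots are ordered or how float arguments interleave with other arguments.

```python
# xmm0..xmm7 are the float argument registers named in the description.
XMM_ARG_REGS = [f"xmm{i}" for i in range(8)]

def assign_float_args(n_floats):
    """Return one location label per unboxed float argument, in order."""
    locations = []
    for i in range(n_floats):
        if i < len(XMM_ARG_REGS):
            locations.append(XMM_ARG_REGS[i])
        else:
            # Assumption: overflow floats are numbered from 0 on the stack.
            locations.append(f"stack[{i - len(XMM_ARG_REGS)}]")
    return locations

ten = assign_float_args(10)
```

With ten float arguments, the first eight labels are registers and the last two are stack slots.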

The internal language LambdaExp supports both boxed reals (of type real) and unboxed floats (of type f64). Values of type f64 are not allowed to be stored inside data structures. The optimiser (found in OptLambda) performs a number of unboxing transformations:

  1. Inside a function, expressions and variables of type real are converted to expressions of type f64 at consumption sites and operations on values of type real are translated into operations on values of type f64. Notice that a value that is not only consumed (for example, one that is also stored in a data structure) is not represented unboxed.
  2. When a function takes a tuple (records are translated into tuples) as argument and the tuple is consumed (see below), it is passed to the function unboxed.
  3. An element of type real in the tuple is passed unboxed if it is consumed by the function; see below.
  4. An optimisation seeks to uncurry curried functions when possible.

Notice that a tuple (or a real) is consumed by a function if it is never used as a value in its own right, that is, it is not stored in a reference cell, passed boxed to another function, or used in a constructed value.

To determine if a tuple element of type real is consumed by a function, mutually recursive functions (bound in a FIX construct) are analysed simultaneously.

FIX b1...bn    b ::= f xs = e

The i'th argument x:real of a function f is consumed if all occurrences of x in the body of f are consumed. More formally,

  • x is consumed in let y = x in e if both y is consumed in e and x is consumed in e
  • x is consumed in e if x not in fv(e)
  • x is consumed in __real_to_f64 x
  • x is consumed in a recursive call f(..x..) if f consumes every argument position in which x is passed (for a position i that is still under analysis, the property is assumed to hold)
  • x is consumed in g(..x..) if g consumes its i'th argument
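The rules above can be read as a fixpoint computation over the functions of a FIX group. The following is a minimal sketch under stated assumptions, not the code in OptLambda: the tuple-encoded expressions, the names `consumed` and `analyse_fix`, and the strategy of starting from the optimistic assumption and removing positions until nothing changes are all illustrative choices. It also assumes that every parameter has type real and that bound names are unique.

```python
# Hypothetical tuple-encoded expressions (not LambdaExp's real constructors):
#   ("const", c)  ("var", x)  ("r2f", e)  ("let", y, rhs, body)
#   ("call", f, [args])  ("op", name, [args])   # "op" = any other construct

def consumed(x, e, assume):
    """True if every occurrence of variable x in e is a consuming one."""
    tag = e[0]
    if tag == "const":
        return True                      # x is not free in e
    if tag == "var":
        return e[1] != x                 # a bare x is a value in its own right
    if tag == "r2f":                     # models __real_to_f64
        return e[1][0] == "var" or consumed(x, e[1], assume)
    if tag == "let":
        _, y, rhs, body = e
        if rhs == ("var", x):            # let y = x in body
            return consumed(y, body, assume) and consumed(x, body, assume)
        return consumed(x, rhs, assume) and consumed(x, body, assume)
    if tag == "call":
        _, f, args = e
        for i, arg in enumerate(args):
            if arg == ("var", x):
                if (f, i) not in assume:  # callee must consume position i
                    return False
            elif not consumed(x, arg, assume):
                return False
        return True
    if tag == "op":
        return all(consumed(x, arg, assume) for arg in e[2])
    raise ValueError(f"unknown expression tag: {tag}")

def analyse_fix(funs, known=frozenset()):
    """funs maps name -> (params, body). Returns consumed (name, index) pairs.

    known holds (name, index) facts about functions outside the group.
    """
    assume = {(f, i) for f, (ps, _) in funs.items() for i in range(len(ps))}
    changed = True
    while changed:
        changed = False
        for f, (params, body) in funs.items():
            for i, p in enumerate(params):
                if (f, i) in assume and not consumed(p, body, assume | known):
                    assume.discard((f, i))
                    changed = True
    return assume

# f(x, acc) = g(x, acc)
# g(y, r)   = seq(add(__real_to_f64 y, 1.0), ref r, f(y, r))
example = {
    "f": (["x", "acc"], ("call", "g", [("var", "x"), ("var", "acc")])),
    "g": (["y", "r"], ("op", "seq", [
        ("op", "add", [("r2f", ("var", "y")), ("const", 1.0)]),
        ("op", "ref", [("var", "r")]),
        ("call", "f", [("var", "y"), ("var", "r")]),
    ])),
}
result = analyse_fix(example)
```

In the example, `r` is stored in a reference cell, so position 1 of `g` is not consumed, and because `f` passes `acc` to that position, position 1 of `f` is not consumed either. Only the first argument of each function remains in `result`.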

Once it is determined that an argument x is consumed by a function f, appearances of x inside the body of f are replaced with the expression __f64_to_real x, which makes the contexts in which x appears type correct. Moreover, calls to f inside the body of f are adjusted with __real_to_f64 wrappers around the appropriate argument.
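A sketch of this rewriting step on a toy expression encoding follows. The tuple constructors and the function name `rewrite` are hypothetical stand-ins for LambdaExp terms, with `"f2r"` and `"r2f"` modelling `__f64_to_real` and `__real_to_f64`. The sketch only covers recursive calls inside the body and assumes bound names are unique.

```python
# Hypothetical tuple-encoded expressions (not LambdaExp's real constructors):
#   ("const", c)  ("var", x)  ("r2f", e)  ("f2r", e)
#   ("let", y, rhs, body)  ("call", f, [args])  ("op", name, [args])

def rewrite(e, f, x, idx):
    """Rewrite the body e of f after its parameter x (position idx) became f64."""
    tag = e[0]
    if tag == "const":
        return e
    if tag == "var":
        # x now has type f64; re-box it so the surrounding context still types.
        return ("f2r", e) if e[1] == x else e
    if tag in ("r2f", "f2r"):
        return (tag, rewrite(e[1], f, x, idx))
    if tag == "let":
        _, y, rhs, body = e
        return ("let", y, rewrite(rhs, f, x, idx), rewrite(body, f, x, idx))
    if tag == "call":
        _, g, args = e
        new_args = [rewrite(a, f, x, idx) for a in args]
        if g == f:
            # Recursive call: f now expects an f64 in position idx.
            new_args[idx] = ("r2f", new_args[idx])
        return ("call", g, new_args)
    if tag == "op":
        return ("op", e[1], [rewrite(a, f, x, idx) for a in e[2]])
    raise ValueError(f"unknown expression tag: {tag}")

# f(x) = f(__f64_to_real(add(__real_to_f64 x, 1.0)))
body = ("call", "f", [
    ("f2r", ("op", "add", [("r2f", ("var", "x")), ("const", 1.0)])),
])
new_body = rewrite(body, "f", "x", 0)
```

The result contains adjacent `r2f`/`f2r` pairs, such as `("r2f", ("f2r", ("var", "x")))`. A later simplification could presumably cancel these, but the description does not say whether one does.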

This PR also takes care of extending the register allocation algorithm to support both float registers and general purpose registers [1].

Support for (multiple) unboxed function results will be treated in another PR.

[1] Lal George and Andrew W. Appel. 1996. Iterated register coalescing. ACM Trans. Program. Lang. Syst. 18, 3 (May 1996), 300–324. DOI:https://doi.org/10.1145/229542.229546

melsman, Mar 12 '21 16:03