plutus icon indicating copy to clipboard operation
plutus copied to clipboard

Could untyped plutus core have primitives which don't need initial force

Open phadej opened this issue 4 years ago • 19 comments

by looking at some contracts' plutus core, there are some of (force mkCons), (force trace), (force ifThenElse) for example. I guess in these forcings are cheap to store and execute, but they aren't entirely free.

EDIT: I think I haven't seen these combinators used unforced.

phadej avatar Nov 02 '21 12:11 phadej

These are somewhat artificial. Type abstractions in typed PLC block evaluation, so we erase abstractions to delay and instantiations to force. Builtins can have type arguments, so the corresponding type instantiations get turned into forces. We thought the simplest and most consistent approach was just to maintain these, although strictly speaking they're redundant.

michaelpj avatar Nov 02 '21 12:11 michaelpj

Checking some more, E.g. sndPair has always two forces (for both of pair types), chooseList has also two, If each force is a byte, there's plenty of bytes we could save.

phadej avatar Nov 02 '21 14:11 phadej

The primitives don't appear that much in the grand scheme of things. If we were concerned about this then the number of force/delays introduced from erasing type abstractions/instantiations throughout the whole program is going to dwarf that. But I don't know of a good way to get rid of them en masse (we do the obvious force/delay cancellation already).

(Also term tags are actually only 4 bits, not a whole byte.)

michaelpj avatar Nov 02 '21 14:11 michaelpj

I run a quick test (rewrite Force b@Builtin{} -> b) on use-cases Uniswap contract:

        expected: (11462,11001)
         but got: (11011,10777)

The counts are (node size, serialized size of term using DeBruijn names), that's 4% reduction in node count and 2% serialized size. I think that is significant in the grand scheme of things.

phadej avatar Nov 02 '21 16:11 phadej

Thanks, that's a useful datapoint!

Let me elaborate a bit more on why this is tricky.

Consider the following program (in pseudo-PIR):

let usesTrace :: (forall a. BuiltinString -> a -> a) -> Integer -> Integer
     usesTrace tracer i = tracer "I saw it" i
     evilTrace :: forall a . BuiltinString -> a -> a
     evilTrace s v = error "I'm evil"

     i1 = usesTrace trace 0
     i2 = usesTrace evilTrace 1
in i1 + i2

The usesTrace function receives as its argument a polymorphic function which it instantiates. Builtin functions are just values, so we can pass a polymorphic builtin function as its argument. Therefore whatever usesTrace does has to work for both "normal" functions and builtins. The first thing that it will do is to instantiate the function, which will correspond to a force when erased.

So at the very least it must be tolerable to force builtin functions. At the moment we require you to force exactly where you would otherwise pass a type. Perhaps we could not require you to do any forcing. The only problematic case would be a nullary builtin which had only type arguments and immediately failed (e.g. error :: forall a . a as a builtin) - it's not clear what that would mean. We could just ban them.

I have a feeling we've been through this discussion before, @kwxm do you remember anything else?

michaelpj avatar Nov 02 '21 17:11 michaelpj

I don't agree, it is not tricky:

You can do rewrite in two steps:

  1. Replace each builtin with a version which is delayed as many times as there are erased type arguments
  2. Apply forceDelay simplification

Let me write (builtin b) for polymorphic builtin, and (builtin! b) for one which is absorbed force arguments (i.e. cannot be forced).

Your example would be rewritten as:

usesTrace (builtin trace) 0 ->          -- replace with delayed versions
usesTrace (delay (builtin! trace)) 0    -- note: builtin!

And indeed, in the compiled PLC there is

[ [ usesTrace (builtin trace) ] (con integer 0) ]

That grows in size, but that doesn't actually happen. GHC rarely passes polymorphic arguments (there should be an explicit rank-N signature, no eta-expansion).

If we rewrite the example as (which is closer to what inferred types would be):

errorExample2 :: Integer
errorExample2 = 
    let usesTrace :: (BuiltinString -> Integer -> Integer) -> Integer -> Integer
        usesTrace tracer i = tracer "I saw it" i

        evilTrace :: forall a . BuiltinString -> a -> a
        evilTrace s v = traceError "I'm evil"

        i1 = usesTrace trace 0
        i2 = usesTrace evilTrace 1
    in i1 + i2

The line above becomes

[ [ usesTrace (force (builtin trace)) ] (con integer 0) ]

which would simplify:

usesTrace (force (builtin trace)) 0 ->           -- replace with delayed versions
usesTrace (force (delay (builtin! trace))) 0 ->  -- forceDelay
usesTrace (builtin! trace) 0                     -- profit!

The polymorphic program has size 54, simple one 53, and latter could become smaller.


EDIT: What does (force (builtin trace)) evaluate to? Or is it stuck until there is enough arguments to execute the builtin?

phadej avatar Nov 02 '21 17:11 phadej

That grows in size, but that doesn't actually happen. GHC rarely passes polymorphic arguments (have to an explicit rank-N signature).

Yes, you're certainly right about that, but even if it's a corner case we thought it was important for builtins to behave exactly the same as polymorphic functions. That way you don't have to worry about the corner cases.

You can do rewrite in two steps...

Nice idea. This assumes that all the type arguments come first, but at the moment our builtins can have interleaved term and type arguments. This is pretty pointless however, and we might as well require the type arguments to come first.

If we did that and banned nullary builtins, I think this could work.

What does (force (builtin trace)) evaluate to? Or is it stuck until there is enough arguments to execute the builtin?

It's stuck.

michaelpj avatar Nov 02 '21 18:11 michaelpj

This assumes that all the type arguments come first, but at the moment our builtins can have interleaved term and type arguments. This is pretty pointless however, and we might as well require the type arguments to come first.

I think that removing the head type-arguments is already an improvement. If it's possible to shuffle the type arguments to the front, that would be even better, but do you have such builtins? (The specification doesn't list their types, nor haddocks, list their types, that would be helpful).

phadej avatar Nov 02 '21 18:11 phadej

It looks like that types of builtins are generated by toBuiltinMeaning?

I think GHC pushes all foralls to the front in inferred types. That trick is cool but feels dangerous, I'd prefer having types of builtins checked (against explicitly written ones) rather than inferred from their Haskell-meaning.

EDIT: if (U)PLC is a compilation target the types or "call convention" (in case of UPLC) is an important information to have in the spec.

phadej avatar Nov 02 '21 18:11 phadej

What does (force (builtin trace)) evaluate to? Or is it stuck until there is enough arguments to execute the builtin?

It's stuck.

This is unfortunate. I liked the "builtins have to fully saturated" design more (when there isn't such problem). It's easier to show that semantics are preserved then. (My proposal would been completely trivial then).

phadej avatar Nov 02 '21 19:11 phadej

We had a long argument about saturated vs unsaturated builtins. In the end we opted for unsaturated - perhaps a mistake, but here we are.

michaelpj avatar Nov 02 '21 20:11 michaelpj

You can do rewrite in two steps...

Nice idea. This assumes that all the type arguments come first, but at the moment our builtins can have interleaved term and type arguments. This is pretty pointless however, and we might as well require the type arguments to come first.

If we did that and banned nullary builtins, I think this could work.

We've been able to gather some interest on this specific topic and Oleg's proposal. @michaelpj @phadej Should we add it to #4174 and attempt it in the upcoming weeks?

kk-hainq avatar Nov 03 '21 18:11 kk-hainq

This is an intrusive proposal as it changes the UPLC. Everything listed in #4174 doesn't seem to require changes, even in typed PLC, as they can happen on PIR-level, which makes them a lot more realistic to get done :)

I opened this issue "early" because I expect it take long to fix, if it will ever.

phadej avatar Nov 03 '21 19:11 phadej

I know, but it is well-scoped and elegant. We also suspect the improvements would be non-trivial, and our transpilers would take fewer forces every day of the week!

kk-hainq avatar Nov 03 '21 19:11 kk-hainq

Yes, this would be a change to the semantics of the language, and might well require a new language version and changes to the spec (which we're in the process of redoing already). It's a much more complicated change, and I don't think it will be that significant in the grand scheme of things. It's something we can hopefully roll in easily enough when we do do another version.

michaelpj avatar Nov 04 '21 12:11 michaelpj

Okay, I'll just add this to #4174 for tracking for now.

kk-hainq avatar Nov 04 '21 19:11 kk-hainq

@phadej

If it's possible to shuffle the type arguments to the front, that would be even better, but do you have such builtins?

We don't.

(The specification doesn't list their types, nor haddocks, list their types, that would be helpful).

See the golden files. Generating a haddock comment using doctest or whatever would be quite helpful, I agree.

It looks like that types of builtins are generated by toBuiltinMeaning?

Within toBuiltinMeaning, yes.

I think GHC pushes all foralls to the front in inferred types. That trick is cool but feels dangerous, I'd prefer having types of builtins checked (against explicitly written ones) rather than inferred from their Haskell-meaning.

We don't rely on GHC pushing the foralls to the front (nor would it be possible at all, I think). Using an inference machinery and not checking what it infers would be very dangerous, I agree, but we have golden tests for that purpose. A golden file is created automatically whenever a new builtin is added, so we do control what PLC types get inferred for builtins.

EDIT: if (U)PLC is a compilation target the types or "call convention" (in case of UPLC) is an important information to have in the spec.

effectfully avatar Jan 01 '22 11:01 effectfully

(The specification doesn't list their types, nor haddocks, list their types, that would be helpful).

Sorry, I just noticed this after it was quoted in a previous comment.

The plc and uplc commands now have a print-builtin-signatures option that will print out the types of all the builtins it knows about in a reasonably readable form:

$ cabal exec uplc print-builtin-signatures
addInteger               : [ integer, integer ] -> integer
subtractInteger          : [ integer, integer ] -> integer
...
encodeUtf8               : [ string ] -> bytestring
decodeUtf8               : [ bytestring ] -> string
ifThenElse               : [ forall a, bool, a, a ] -> a
chooseUnit               : [ forall a, unit, a ] -> a
trace                    : [ forall a, string, a ] -> a
fstPair                  : [ forall a, forall b, pair(a, b) ] -> a
sndPair                  : [ forall a, forall b, pair(a, b) ] -> b
chooseList               : [ forall a, forall b, list(a), b, b ] -> b
mkCons                   : [ forall a, a, list(a) ] -> list(a)
...

That still doesn't quite tell the full story because sometimes forall means "for all built-in types" (as in mkCons) and sometimes it means "for any term at all" (as in ifThenElse). We're working on a new version of the specification which will explain all this in detail: that should be available in a couple of weeks.

kwxm avatar Jan 06 '22 13:01 kwxm

@kwxm that's great.

phadej avatar Jan 06 '22 13:01 phadej

There's now a CIP for this discussion, hence I'm closing the ticket. Feel free to reopen if you disagree.

effectfully avatar Feb 22 '23 18:02 effectfully