lean4 doc: `lean.h`

This PR adds documentation comments for a few functions in lean.h. This was previously discussed at https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/Structure.20in.20FFI/near/431985775

I think it would be good if someone knowledgable (@digama0?) could check that the content of the comments is actually correct.

Apr 08 '24 18:04 marcusrossel

Mathlib CI status (docs):

❗ Mathlib CI can not be attempted yet, as the nightly-testing-2024-04-08 tag does not exist there yet. We will retry when you push more commits. If you rebase your branch onto nightly-with-mathlib, Mathlib CI should run now. (2024-04-08 19:12:29)
🟡 Mathlib branch lean-pr-testing-3846 build against this PR was cancelled. (2024-04-16 07:56:37) View Log
✅ Mathlib branch lean-pr-testing-3846 has successfully built against this PR. (2024-04-16 09:13:25) View Log
✅ Mathlib branch lean-pr-testing-3846 has successfully built against this PR. (2024-04-16 10:03:12) View Log

Apr 08 '24 19:04 leanprover-community-mathlib4-bot

By the way, the easiest way you can verify layout behavior is to make a test program Foo.lean containing e.g.

structure S where
  a : String
  b : Bool
  c : UInt64
  d : Array Nat
  e : UInt8

def foo (x : S) : UInt64 := x.c

and then compile it and look in Foo.c to find out what it did.

Spoiler:

S.a: lean_ctor_get(x_1, 0)
S.b: lean_ctor_get_uint8(x_1, sizeof(void*)*2 + 8)
S.c: lean_ctor_get_uint64(x_1, sizeof(void*)*2)
S.d: lean_ctor_get(x_1, 1)
S.e: lean_ctor_get_uint8(x_1, sizeof(void*)*2 + 9)

so we conclude that they are stored in the order [a, d; c, b, e].

Based on some additional testing, I conclude that the fields are bucketed by alignment and placed in decreasing order by alignment but otherwise in textual order. This ordering has the nice property that padding is never needed between fields. I'm not sure what happens on 32-bit architectures though, since in that case pointers are not the most-aligned things (UInt64 has alignment 8 but void* has alignment 4).

Apr 15 '24 12:04 digama0

From a documentation / explanation point of view, I'm not sure about the idea of spreading the layout algorithm over these individual functions. I think the best approach would be to have the layout algorithm in one place (e.g. ffi.md), and then have these functions just talk about how they relate to the lean data model, assuming the details of inductive type <-> layout are worked out.

Apr 15 '24 13:04 digama0

I added docs for the layout algorithm in #3915. This should take some of the pressure off to try to explain the details in individual accessors.

Apr 15 '24 14:04 digama0

I've removed my (incorrect) description of the layout algorithm now, instead linking to @digama0's section in the Lean Manual. I've kept the examples for lean_ctor_num_objs and lean_ctor_get as I think the layout for non-scalars is simple enough to be understood from them.

Apr 16 '24 07:04 marcusrossel

ping

Aug 22 '24 09:08 marcusrossel

@marcusrossel Thanks for the PR and sorry for the long wait. We took a look at the PR in the triage team meeting and decided that we would like to keep documentation on this level concentrated in ffi.md so it doesn't go out of sync. Also, lean_is_ctor and lean_ctor_num_objs are not relevant for FFI. Thus I'm going to close this PR as all relevant information should already exist in ffi.md.

Aug 30 '24 14:08 Kha