Criteria for putting new unsafe-related text in chapters other than "Unsafe code"
Note: Since this issue was created, chapter (and maybe section) numbers have changed!
[Although this is an issue for V9, Jon suggested we discuss the general principle soon. That is, how hard should we try to keep all unsafe-related text in the unsafe chapter? Are there reasonable exceptions for not doing so?]
Support for unsafe mode is optional, and 1-2 years ago, we made the decision to push almost all the unsafe-related stuff into the unsafe chapter, §22, except for bits in the grammar, which we’ve flagged as “unsafe-mode only.”
I’ve nearly completed getting the MS v9 proposal for function pointers into shape for use by TG2. However, I have a situation for which I’m looking for guidance.
The addition of support for function pointers requires unsafe support, so most of the spec for that proposal will necessarily go in §22. However, this feature also impacts type inference, as described in six subsections of §11.6.3, “Type inference.”
I see two alternative approaches:
- Following the current approach of putting as much unsafe material as possible into §22, I can add text to the corresponding sections in §11.6.3.* saying “This subclause is extended in unsafe code (§forward-pointer to §22.x.y),” and then describe those extensions in the new sections §22.x.y.
- I can put the unsafe-related stuff in-line in §11.6.3.*, and somehow mark it as being unsafe-related.
Approach 1 is pure, but has the problem that some text in §22.x.y needs to be “merged” into specific places in lists in §11.6.3.*. For example, this would result in the following:
§22.6.(x) Output type inferences [new section]
In §11.6.3.7, the following bullet is added between the second and third bullets:
- If `E` is an address-of method group and `T` is a function pointer type with parameter types `T1..Tk` and return type `Tb`, and overload resolution of `E` with the types `T1..Tk` yields a single method with return type `U`, then a lower-bound inference is made from `U` to `Tb`.
The reader of this new section, §22.6.x, will have to flip between this text and that in §11.6.3.7 to make sense of it. And then, we are at the mercy of this positional dependence of that list, which could easily get out of sync as §11.6.3.7 evolves.
Here’s a similar case:
§22.6.(y) Lower-bound inferences [new section]
In §11.6.3.10, the following case is added to the third bullet:
- `V` is a function pointer type `delegate*<V2..Vk, V1>` and there is a function pointer type `delegate*<U2..Uk, U1>` such that `U` is identical to `delegate*<U2..Uk, U1>`, and the calling convention of `V` is identical to `U`, and the refness of `Vi` is identical to `Ui`.

The first bullet of inference from `Ui` to `Vi` is modified to:
- If `U` is not a function pointer type and `Ui` is not known to be a reference type, or if `U` is a function pointer type and `Ui` is not known to be a function pointer type or a reference type, then an exact inference is made

Then, added after the third bullet of inference from `Ui` to `Vi`:
- Otherwise, if `V` is `delegate*<V2..Vk, V1>` then inference depends on the i-th parameter of `delegate*<V2..Vk, V1>`:
  - If V1:
    - If the return is by value, then a lower-bound inference is made.
    - If the return is by reference, then an exact inference is made.
  - If V2..Vk:
    - If the parameter is by value, then an upper-bound inference is made.
    - If the parameter is by reference, then an exact inference is made.
This enhancement changes existing words rather than just adding new ones, which complicates things further. That said, if we push this stuff back into §11.6.3.x, we can likely find a way to have two branches for this: with and without unsafe support.
The longer I study the problem, the more I lean towards putting this stuff in §11.6.3.* with suitable unsafe-conditional text. Putting it in §22 makes it stand out, but not positively so, and looks somewhat like the situation we had previously with the grammar in earlier chapters being augmented by unsafe extensions in §22. And we dropped that approach and pushed the unsafe grammar back into the main spec.
Jon's input in private mail:
Can we discuss this at next week's meeting? Aside from anything else, I want to document the pros and cons clearly, and how we weigh them up, so that in the next similar situation we have a record of how we handled this. I do like having all the unsafe aspects in 22, but I can see it being an issue here.
It would be worth asking folks with more knowledge of later (and even unreleased) features how much similar stuff we're looking at over the next few years.
If we do include it in 11.6.3, we should also back-reference it from 22 so that anyone wanting to know "What extra features are available in unsafe code" can just look through 22 and not miss this.
We're leaning towards putting this in 11.6.3 as Rex suggests, mostly because type inference is so complex and easy to get wrong (when both reading and writing...). But this shouldn't be seen as precedent beyond "we can discuss it if we think it's worth violating our normal approach".
We need @MadsTorgersen to sign off on that approach though.
We have agreed to:
- Put the relevant text "inline" (e.g. in 11.6.3) but with a consistent (human and machine readable) label to indicate "unsafe only"
- Reference back from section 22 as appropriate (and validate this before each release, to account for changes in locations etc)
- Rex will create a "demo" of what this might look like before this issue is closed
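The release-time validation of back-references mentioned in the list above could be automated. The following is only a minimal sketch under stated assumptions: the function name `missing_back_references`, the idea of collecting core section numbers into a set, and the sample section numbers are all hypothetical and are not part of any existing tooling for the standard.

```python
import re

# Matches section references such as "§12.6.3.4" and captures the number.
SECTION_REF = re.compile(r"§(\d+(?:\.\d+)*)")


def missing_back_references(core_sections_with_unsafe_text, unsafe_chapter_text):
    """Return core section numbers that the unsafe chapter never points to.

    core_sections_with_unsafe_text: iterable of section numbers (strings)
        that contain unsafe-only text (hypothetical input).
    unsafe_chapter_text: the full text of the unsafe chapter.
    """
    referenced = set(SECTION_REF.findall(unsafe_chapter_text))
    return sorted(
        s for s in core_sections_with_unsafe_text if s not in referenced
    )


# Illustrative data only: one back-reference is present, one is missing.
unsafe_chapter = (
    "23.6.x.1 Input types\n"
    "See §12.6.3.4 for the unsafe-context impact on this topic.\n"
)
missing = missing_back_references({"12.6.3.4", "12.6.3.10"}, unsafe_chapter)
print(missing)
```

A non-empty result would flag core sections whose unsafe-only text has no pointer from the unsafe chapter, which is the situation the second agreed item is meant to prevent.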
Here's my demo; actual spec text changes are shown underlined.
I've used the notation **UnsafeMode**: ... **end UnsafeMode** to delimit unsafe-specific text, so it can be found programmatically, but I'm open to changing its spelling. This form almost mirrors that for examples and notes (which are informative), but as this new delimiter is for normative text, I've made it slightly different.
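To illustrate what "found programmatically" could mean in practice, here is a minimal sketch that extracts delimited spans from spec text. It assumes the delimiter spelling proposed above; treating the bold markers and a trailing colon on the closing delimiter as optional is my assumption, and `find_unsafe_spans` is a hypothetical helper name.

```python
import re

# Assumed delimiter spelling: "UnsafeMode: ... end UnsafeMode", with the
# bold markers and a trailing colon on the closer treated as optional.
# The lookbehind stops "end UnsafeMode:" from being read as an opener.
UNSAFE_SPAN = re.compile(
    r"(?<!end )(?:\*\*)?UnsafeMode(?:\*\*)?:\s*"
    r"(?P<body>.*?)"
    r"\s*(?:\*\*)?end UnsafeMode(?:\*\*)?:?",
    re.DOTALL,
)


def find_unsafe_spans(text):
    """Return the body of every delimited unsafe-only span in text."""
    return [m.group("body") for m in UNSAFE_SPAN.finditer(text)]


sample = (
    "If E is a method group then all the parameter types of T are "
    "input types of E with type T.\n"
    "**UnsafeMode**: If E is an address-of method group and T is a "
    "function pointer type, then all the parameter types of T are "
    "input types of E with type T. **end UnsafeMode**\n"
)
spans = find_unsafe_spans(sample)
print(len(spans))
```

If the spelling of the delimiter changes, only the pattern needs updating.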
Example 1 (simple)
Here is the unsafe-specific text added to the core chapter:
12.6.3.4 Expressions|Function members|Type inference|Input types
If E is a method group or implicitly typed anonymous function and T is a delegate type or expression tree type then all the parameter types of T are input types of E with type T.
UnsafeMode: If E is an address-of method group and T is a function pointer type, then all the parameter types of T are input types of E with type T. end UnsafeMode:
12.6.3.5 Output types
…
Here is a pointer from the unsafe chapter back to the unsafe-specific text added to the core chapter for this topic:
23.6 Pointers in expressions
23.6.x Type inference
23.6.x.1 Input types
See §12.6.3.4 for the unsafe-context impact on this topic.
Example 2 (non-trivial)
Here is the unsafe-specific text added to the core chapter:
12.6.3.10 Lower-bound inferences
A lower-bound inference from a type U to a type V is made as follows:
- If `V` is one of the unfixed `Xᵢ` then `U` is added to the set of lower bounds for `Xᵢ`.
- Otherwise, if `V` is the type `V₁?` and `U` is the type `U₁?` then a lower bound inference is made from `U₁` to `V₁`.
- Otherwise, sets `U₁...Uₑ` and `V₁...Vₑ` are determined by checking if any of the following cases apply:
  - `V` is an array type `V₁[...]` and `U` is an array type `U₁[...]` of the same rank
  - `V` is one of `IEnumerable<V₁>`, `ICollection<V₁>`, `IReadOnlyList<V₁>`, `IReadOnlyCollection<V₁>` or `IList<V₁>` and `U` is a single-dimensional array type `U₁[]`
  - `V` is a constructed `class`, `struct`, `interface` or `delegate` type `C<V₁...Vₑ>` and there is a unique type `C<U₁...Uₑ>` such that `U` (or, if `U` is a type parameter, its effective base class or any member of its effective interface set) is identical to, inherits from (directly or indirectly), or implements (directly or indirectly) `C<U₁...Uₑ>`.
  - UnsafeMode: `V` is a function pointer type `delegate*<V2..Vk, V1>` and there is a function pointer type `delegate*<U2..Uk, U1>` such that `U` is identical to `delegate*<U2..Uk, U1>`, and the calling convention of `V` is identical to `U`, and the refness of `Vi` is identical to `Ui`. end UnsafeMode:
  - (The “uniqueness” restriction means that in the case interface `C<T>{} class U: C<X>, C<Y>{}`, then no inference is made when inferring from `U` to `C<T>` because `U₁` could be `X` or `Y`.)
- If any of these cases apply then an inference is made from each `Uᵢ` to the corresponding `Vᵢ` as follows:
  - If `Uᵢ` is not known to be a reference type then an exact inference is made; or alternatively, UnsafeMode: If `U` is not a function pointer type and `Ui` is not known to be a reference type, or if `U` is a function pointer type and `Ui` is not known to be a function pointer type or a reference type, then an exact inference is made end UnsafeMode:
  - Otherwise, if `U` is an array type then a lower-bound inference is made
  - Otherwise, if `V` is `C<V₁...Vₑ>` then inference depends on the `i-th` type parameter of `C`:
    - If it is covariant then a lower-bound inference is made.
    - If it is contravariant then an upper-bound inference is made.
    - If it is invariant then an exact inference is made.
  - UnsafeMode: Otherwise, if `V` is `delegate*<V2..Vk, V1>` then inference depends on the i-th parameter of `delegate*<V2..Vk, V1>`:
    - If V1:
      - If the return is by value, then a lower-bound inference is made.
      - If the return is by reference, then an exact inference is made.
    - If V2..Vk:
      - If the parameter is by value, then an upper-bound inference is made.
      - If the parameter is by reference, then an exact inference is made. end UnsafeMode:
Here is a pointer from the unsafe chapter back to the unsafe-specific text added to the core chapter for this topic:
23.6 Pointers in expressions
23.6.x Type inference
23.6.x.4 Lower-bound inferences
See §12.6.3.10 for the unsafe-context impact on this topic.
Looks good to me, although I'm not keen on "UnsafeMode" as the label. Let's spitball it and see if we can come up with anything better. (UnsafeSupport?)
Decision on 2024-05-15:
- Keep UnsafeMode as the name for now, but we'll revisit later (possibly to UnsafeContext, but it'll be easier to check that when the work has been done)
- We still want @MadsTorgersen sign-off before doing significantly more work on this
After a short discussion, we agreed to not have a new bracketing label, but, rather, to add a note to the end of the unsafe-code-specific text saying something like, “Note: This is only applicable in unsafe code. end note” And we’ll have a pointer to the core code changes from the unsafe code clause, as previously proposed.
Rex will revise PR #984 accordingly.
PR #984 (https://github.com/dotnet/csharpstandard/pull/984) has been revised to incorporate changes modeled on the resolution of this issue.