[SUGGESTION][FIX] Add narrowing 'as' cast and static assert failure instead of nonesuch…
The current implementation allows for writing below code
args : std::span<Arg> = (argv, argc as std::size_t);
c : std::ifstream = args.back(); // <= here it crashes
It fails as as<std::size_t>(argc) matches as(...) that returns nonesuch which is nullptr.
This change introduces the specialization of the as() function to match arithmetic types (it is based on the gsl::narrow() function). It also removes the version of the as(...) function that silently returns nullptr - now it fails at compile time thanks to static_assert.
After this change below code will compile and will accept some of the narrowing casts (forbidden casts will throw).
main: () -> int = {
max_uint8 := std::numeric_limits<std::uint8_t>::max();
max_int8 := std::numeric_limits<std::int8_t>::max();
max_double := std::numeric_limits<double>::max();
max_float := std::numeric_limits<float>::max();
v1 := max_uint8 as std::int8_t; // throws narrowing_error
v2 := -1 as std::uint8_t; // throws narrowing_error
v3 := -1.0 as std::uint8_t; // throws narrowing_error
v4 := max_double as float; // throws narrowing_error
v5 := -1.0 as long; // OK
v6 := -1 as double; // OK
v7 := max_float as double; // OK
v8 := max_uint8 as std::uint16_t; // OK
v9 := max_int8 as std::uint16_t; // OK
v10 := 10 as std::size_t; // OK
}
That will also fix below code:
args : std::span<Arg> = (argv, argc as std::size_t);
c : std::ifstream = args.back(); // no crash, works as expected
Closes https://github.com/hsutter/cppfront/issues/104
Working prototype: https://godbolt.org/z/Psh9MG1cY
v1 := max_uint8 as std::int8_t; // throws narrowing_error
That seems like a departure from https://wg21.link/P2392, which matches
Otherwise, if x can be bound to a refto(T,x), then x as T means a refto(T,x) bound to x.
I think you can confirm this with Circle.
Thanks! General timing note (maybe I should paste/link this in a few other PRs/issues if this isn't visible enough):
My responses will be delayed because of the Kona meeting next week, which will be the first face-to-face ISO C++ meeting in nearly three years, and the week before and after the meeting are generally also consumed with travel and committee-related work. I plan to be back to a normal schedule after that and will be catching up with PRs and issues then. Thanks!
But quick ack on this thread:
My current thought is to require narrow or narrow_cast (semantically or syntactically) for narrowing conversions, and is one of the things still to be added to the initial partial as implementation. Major design choices:
- Require the explicit "narrow" syntax because potentially-lossy conversions should stand out, and make
asnot support narrowing conversions. - The one being suggested here, which is to have
asdo anarrowwhich throws on data loss, but this adds a branch and makes the operation throwing. - Have
asdo anarrow_castwhich is a "trust me" operation.
Some considerations:
- (1) and (2) are safe by default.
- (2) and (3) may be friendlier for generic code, if experience shows that allowing
as Tis generally useful even when potentially narrowing.
That currently tilts things in favor of (2) in my mind, which is the direction of this PR, but I'd be happy to see usability evidence/examples arguing for any these options as we consider this PR.
@JohelEGP P2392 is a living document that will be updated based on this and other experience (and committee feedback) and can be expected to be a little out of sync sometimes. But for max_uint8 as std::int8_t I don't think the bullet you quoted relates because you can't bind a uint8_t to an int8_t& (in Cpp1 syntax)? But I'm writing this quickly and may be missing something.
Thanks for the feedback. I assumed that you are preparing for the Kona meeting and I am not expecting that you will process these PRs at the moment. I hope I will be able to join one of the committee meetings in the future.
Regarding this PR. I like the cleanliness of the x as T syntax (easy to learn and easy to read). I agree that narrowing operations shall be visible and shall stand out. I was thinking about making it
x narrowed_to T; // or something that you can easily read
The good thing is that we can have static_assert that can inform not to use the as keyword for narrowing cases and to use narrowed_to (similar case as with pointer arithmetic where we recommend using std::span or gsl::span). That approach will make writing generic code harder but will guide the programmer and will stand out like this:
args : std::span<Arg> = (argv, argc narrowed_to std::size_t);
c : std::ifstream = args.back();
x narrowed_to T;
Maybe x narrowed_as T; is better? or x lossy_as T;?
We can brainstorm on the keyword name to make it clear. Let's first agree on the general idea of how to tackle that cases.
Require the explicit "narrow" syntax because potentially-lossy conversions should stand out, and make
asnot support narrowing conversions.
The one being suggested here, which is to have
asdo anarrowwhich throws on data loss, but this adds a branch and makes the operation throwing.Have
asdo anarrow_castwhich is a "trust me" operation.
I have a broader language suggestion that may be useful in more cases than just narrow casts:
Maybe we could indicate whether we want to throw on mismatch, static assert, or return an optional instead by adding ? for optional and ! for static assert (leaving throwing as default)?
e.g. x as? T; would return optional which is empty if no match exists, x as! T; would fail at compile time if no match exists (or if x's value is not defined at compile time), and x as T; would throw an exception if no match exists.
The motivation for having built-in defaults for matching when no pattern applies is easily delegating control over what happens to the caller.
In the context of this problem, this would mean we just match on cases where there's no data loss, and have the caller decide if they want an optional, a thrown exception, or a static assertion in case there's data loss by selecting the type of match in the calling site.
I think it makes sense for
v1 := max_uint8 as std::int8_t; // throws narrowing_error
to use narrow instead. After all, it's safe, just like when casting std::variants and class hierarchies with as. Perhaps a difference is that the latter two the compiler can optimize when the as is preceded by a corresponding is check. Would the same work with v1? Looking at the paper, in a generic context, is would do something different, as it matches
Otherwise, if C(x) is valid and convertible to bool, then x is C means C(x).
@JohelEGP P2392 is a living document that will be updated based on this and other experience (and committee feedback) and can be expected to be a little out of sync sometimes. But for
max_uint8 as std::int8_tI don't think the bullet you quoted relates because you can't bind auint8_tto anint8_t&(in Cpp1 syntax)? But I'm writing this quickly and may be missing something.
You're right. I think it actually matches
Otherwise, if typeof(x) is implicitly convertible to T, then x as T means to convert x to an rvalue of type T (e.g., including the case where both are Pointer types and this is a static upcast).
Thanks to @feature-engineer I decided to play with exclamation marks and question marks. I don't like as! or as? - what I like with the pure as is that you can read it and it will make sense:
pass(x as std::uint16_t);
I decided to try ! and ? at the end.
So if you want to be strict about casting you can write code adding exclamation marks after the type:
main: () -> int = {
max_uint8 := std::numeric_limits<std::uint8_t>::max();
max_int8 := std::numeric_limits<std::int8_t>::max();
max_double := std::numeric_limits<double>::max();
max_float := std::numeric_limits<float>::max();
v := max_uint8 as std::uint16_t!; // OK
v1 := max_uint8 as std::int8_t!; // compile-time error
v2 := -1 as std::uint8_t!; // compile-time error
v3 := -1.0 as std::uint8_t!; // compile-time error
v4 := max_double as float!; // compile-time error
v5 := -1.0 as long!; // compile-time error
v6 := -1 as double!; // compile-time error
v7 := max_float as double!; // OK
v8 := max_uint8 as std::uint16_t!; // OK
v9 := max_int8 as std::uint16_t!; // compile-time error
v10 := 10 as std::size_t!; // compile-time error
v11 := max_uint8 as std::int16_t!; // OK
}
If you don't want static asserts but you don't want to have exceptions you can use question marks to get std::optional:
main: () -> int = {
max_uint8 := std::numeric_limits<std::uint8_t>::max();
max_int8 := std::numeric_limits<std::int8_t>::max();
max_double := std::numeric_limits<double>::max();
max_float := std::numeric_limits<float>::max();
v := max_uint8 as std::uint16_t?; // OK, 255
v1 := max_uint8 as std::int8_t?; // OK, (empty)
v2 := -1 as std::uint8_t?; // OK, (empty)
v3 := -1.0 as std::uint8_t?; // OK, (empty)
v4 := max_double as float?; // OK, (empty)
v5 := -1.0 as long?; // OK, -1
v6 := -1 as double?; // OK, -1.000000
v7 := max_float as double?; // OK, 340282346638528859811704183484516925440.000000
v8 := max_uint8 as std::uint16_t?; // OK, 255
v9 := max_int8 as std::uint16_t?; // OK, 127
v10 := 10 as std::size_t?; // OK, 10
v11 := max_uint8 as std::int16_t?; // OK, 255
std::cout << "v = (v)$" << std::endl;
std::cout << "v1 = (v1)$" << std::endl;
std::cout << "v2 = (v2)$" << std::endl;
std::cout << "v3 = (v3)$" << std::endl;
std::cout << "v4 = (v4)$" << std::endl;
std::cout << "v5 = (v5)$" << std::endl;
std::cout << "v6 = (v6)$" << std::endl;
std::cout << "v7 = (v7)$" << std::endl;
std::cout << "v8 = (v8)$" << std::endl;
std::cout << "v9 = (v9)$" << std::endl;
std::cout << "v10 = (v10)$" << std::endl;
std::cout << "v11 = (v11)$" << std::endl;
}
If you don't want to have static_assert checks nor use std::optional you can stay with just plain as and you will get a runtime check with an exception on narrowing error.
Currently, optional_as is using exceptions (captures them) - it is only a prototype to play with it. We shall use std::expected or something similar to handle narrowing error cases and throw only in specific cases.
My only doubt is that there is a very subtle difference between these cases:
pass(x as std::uint8_t?);
pass(x as std::uint8_t!);
pass(x as std::uint8_t);
probably a better option would be to have something like
pass(x as_optional std::uint8_t);
pass(x as_strict std::uint8_t);
pass(x as std::uint8_t);
or
pass(x as optional<std::uint8_t>);
pass(x as strict<std::uint8_t>);
pass(x as std::uint8_t);
PS: have you noticed how easy it is to print std::optional - it is awesome!
I don't like the ! and ? after the type, too easy to miss. I was thinking along the lines of x as optional int gives std::optional<int>, x as expected int gives std::expected<int>, x as narrow int allows narrowing conversions.
My understanding of the future of exceptions (per https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0709r4.pdf ) is that std::expected is not going to be a recommended tool. So enshrining it in the language itself might be counterproductive.
Ok. Will read that through.
Continuing my experiments with the syntax I have pushed the new update that allows using two syntaxes:
pass(x as std::uint8_t?);
pass(x as std::uint8_t!);
pass(x as std::uint8_t);
or
pass(x as std::optional<std::uint8_t>);
pass(x as cpp2::strict<std::uint8_t>);
pass(x as std::uint8_t);
Taking a look at the possible options and current experiments probably strict mode shall be the default one as it is not narrowing.
Possibly narrowing cast shall be emphasized instead of a strict case so maybe we should change it to something similar to:
pass(x as std::optional<std::uint8_t>);
pass(x as std::uint8_t); // static_assert failure if possibly dangerous
pass(x as cpp2::narrowing<std::uint8_t>);
My understanding of the future of exceptions (per https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0709r4.pdf ) is that
std::expectedis not going to be a recommended tool. So enshrining it in the language itself might be counterproductive.
I don't think your assessment is correct - according to https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/n4907.pdf, P0323R12 (std::expected) is going to be included in the language for C++23.
It seems that it'd be quite some time before p0709, or P2232, or some other yet to be proposed proposal are going to deprecate std::expected.
I guess it's just a matter of how forward you look. The paper I linked is clear about what you said:
In the committee, we are advancing the proposed std::experimental::expected<SuccessResult,ErrorResult> [P0323R3]. As noted in the paper: “C++ already supports exceptions and error codes, expected would be a third kind of error handling.”
But that paper is Herb writing 60 pages about how he dislikes adding (with std::expected) yet another way of handling errors, and would like instead a unification that covers all requirements.
Thanks to @feature-engineer I decided to play with exclamation marks and question marks. I don't like
as!oras?- what I like with the pureasis that you can read it and it will make sense:
Hey, very interesting discussion. Wondering why you don't like them? They are directly at the operator that does the action so they seem very clear and upfront to me. to me putting them at the end feels like trying to form normal language sentences which feels like a bit of an illusion: they are operators in a mathematical sense. I kinda find them very beautiful as they are very short and to the point yet very expressive. a bit in contrast to the second syntax proposal you have that seems to reintroduce the noise that these operators reduce.
Yes, you are right. I did the experiments and came to that conclusion as well.
To be honest I am waiting for a feedback from Herb as I don't know if he has something in mind regarding this feature.
Probably I can make all these syntaxes working. I am currently waiting to clarify the defaults and how to distinguish other cases from the default behavior.
One thing I have learned is that we probably want to have default version that is statically checked on compile time. If that is not working than we shall be able to choose if we want to have exception or e.g. optional.
I guess it's just a matter of how forward you look. The paper I linked is clear about what you said:
In the committee, we are advancing the proposed std::experimental::expected<SuccessResult,ErrorResult> [P0323R3]. As noted in the paper: “C++ already supports exceptions and error codes, expected would be a third kind of error handling.”
But that paper is Herb writing 60 pages about how he dislikes adding (with
std::expected) yet another way of handling errors, and would like instead a unification that covers all requirements.
From my understanding, std::expected is created as the "modern" way of handling error codes, instead of using a C-style out parameter. While it is an alternative to exceptions, i personally see a case for both types of error handling: exceptions being used for "unexpected" error conditions, while error codes being used when the error condition is "expected".
Ex. when connecting to a web socket, you expect to be returned some kind of a handle object, however you also may expect an error code. IMO error codes fit this use-case more since a network error is an actually expected (although erroneous) result. Normally, you would use a nullable return value or std::optional and an out parameter for the error code, but now you can handle both with std::expected.
Although, granted, out std::expected prevents RVO if you need to further work with the stored value.
The way I read that proposal I linked, "unexpected errors" as in "programmer errors" should be asserting, rather than throwing exceptions that propagate all the way out and terminate the program there.
If you're programmaticaly handling errors, they are by definition errors that you anticipate could happen. Like your socket example. I can't do what I wanted to do (normal control flow), but I can do something about the error. The proposal argues in several places, and I agree, that normal and error control flow in C++ should be separate. I think separable is a better way to call it (as nothing ever stops you from handling errors immediately next to the call site). It has to be that way for constructors, and constructors are central for RAII. And it has to be that way to let you do compositions as simple as chaining methods, f(g(x)).
That said, the new exceptions mechanism proposed is very much like "using std::expected behind the scenes", just with language support to act like exceptions. The details on that behind-the-scenes are in section 4.1.1. Section 4.1.6 shows the new mechanism side-by-side as a replacement for std::expected, and section 5 says one way to measure the success of the new exceptions would be if they actually replace the use of std::expected for handling errors.
So maybe the best course for cppfront is to enshrine std::expected as the way to return errors in the generated C++ code. And then eventually to have the new syntax hide it.
So maybe the best course for cppfront is to enshrine std::expected as the way to return errors in the generated C++ code. And then eventually to have the new syntax hide it.
I would definitely not want to have some kind of 3d way of handling runtime errors in addition to classic exceptions and std::expected, so this approach seems to be the most optimal as of now.
Perhaps opening a new issue would be best? @jcanizales
"unexpected errors" as in "programmer errors" should be asserting, rather than throwing exceptions that propagate all the way out and terminate the program there
Personally, I think classic exceptions are better in this case, since they handle RAII and can be used to "catch all non-fatal errors".
Ex. in a game engine's editor you could catch all non-fatal exceptions and log the error message, then pause the game and allow the user to fix it without crashing.
Assertions are great for debugging, but do not allow you to preform this kind of error handling and may cause resource leaks because of unhandled RAII destructors.
Classic exceptions are already forbidden in many types of codebases, for the range of issues the paper describes. The point of the proposal is to fix those issues (like std::expected does) but without losing the advantages they have.
When I wrote "programmer error" I meant "the code is buggy" errors. The user can't recover from that, and trying to recover programmatically is problematic because there's no knowing how much of your program state is already corrupted by the time you detect the issue. Assertion libraries like glog usually do print an error message and a stack trace before terminating the program. Chrome survives assertions in tabs without crashing by isolating each in its own OS process. That said, like with all guidance I'm sure there's no shortage of cases in which it's not the best. New exceptions are also better for this than old exceptions in any case.
I was not thinking about runtime assertions but static assertions that will make your build to fail. No impact on runtime error handling. It should prevent developer from writing potentially buggy code. Additionally it should provide guidance what to do instead.
I have done more experiments and tests around that topic. I like the idea of having default as as strict as possible that ends with a static assertion if it is a suspicious expression that may not work.
Currently, it works like that:
template< typename C, typename... Ts >
auto as(Ts... ts) -> auto {
static_assert(program_violates_type_safety_guarantee<Ts...>, "safe 'as' cast is not defined");
}
I realized that this interfered with the inspect implementation as there are used constexpr ifs and requires expressions that are not working when used on expression that ends up with static assertions.
In regression-tests/pure2-inspect-expression-in-generic-function-multiple-types.cpp2 file there is a code:
inspect x -> std::string {
is int = "integer " + std::to_string(x as int);
is std::string = '"' + x as std::string + '"';
is _ = "not an int or string";
}
is int = "integer " + std::to_string(x as int); line compiles to (the original code is oneliner I refortmated it for easier reading)
if (cpp2::is<int>(__expr)) {
if constexpr( requires{ "integer " + std::to_string(cpp2::as<int>(x)); } )
if constexpr( std::is_convertible_v<CPP2_TYPEOF( "integer " + std::to_string( cpp2::as<int>(x) ) ),
std::string> )
return "integer " + std::to_string(cpp2::as<int>(x));
else
return std::string{};
else
return std::string{};
}
If cpp2::as<int>(x) will end up with static assert requires expression will not return false and compilation will fail. I have made an workaround for it - I am preparing the clean commit to update this PR.
TLDR: I have renamed all cpp2::as function to cpp2::internal_as and added additional template argument that specify failure policy (static assertions or throwing expceptions). internal_as is called by one as function with default policy:
enum class cast_failure_policy {
static_assertions,
throws
};
// ...
template< typename C, typename X >
auto as(X const& x) -> decltype(auto) {
return internal_as<cast_failure_policy::static_assertions, C>(x);
}
cppfront emits statement to the string that is later printed using that string:
return_prefix = "{ if constexpr( requires{" + statement + ";} ) if constexpr( std::is_convertible_v<CPP2_TYPEOF((" + statement + "))," + result_type + "> ) return ";
In statement string I am replacing all cpp2::as to cpp2::internal_as<cpp2::cast_failure_policy::throws, - when expression throws inside requires call it will end up with false.
auto internal_statement = replace_all(statement, "cpp2::as<", "cpp2::internal_as<cpp2::cast_failure_policy::throws, ");
@hsutter I have made it a draft to emphasize that I am working on it.
rebased to main
Thanks, everyone. Now that I've had time to reread this thread, I see @filipsajdak narrowed (pun intended) this down to several suggestions, one of which is the one I was thinking of.
Briefly: What adding ? would entail
If we allow x as? T to mean an optional<T>, then
- If we use a
?it seems more natural for it to be on the type, i.e.,x as T?. That would make sense if we wanted to pursue the concept of an optional type as a language feature hardwired throughout the type system... but I'm not inclined to do that because I think would be a major invasive change. C# 2.0 did something similar when it added aNullablelanguage supported concept also spelledT?, and the experience I've heard from C# designers was that it was way more work than expected and complicated the type system and compiler, and in retrospect is not clear it was a good idea to addT?throughout the type system. Besides, ... - We already have a way to spell that, namely
x as optional<T>. Let's try that.
Which brings me to the same idea that @filipsajdak included as one of his options...
Spelling "safe as"
// I like this
x as std::uint8_t // static_assert failure if possibly dangerous
That makes sense to me, especially with a nice error message to "use fallible-as or lossy-as" if you really want to do this.
Spelling "fallible as"
// I like this
x as std::optional<std::uint8_t>
I like this direction. Note this doesn't require adding new ?/! grammar or a reporting mode enum configuration parameter or anything complex like that. It should work fine with the current compiler just by adding overloads of as<optional<T>>(x) in cpp2util.h.
Spelling "lossy as"
Note that this case doesn't just come up with narrowing conversions, but also with (other) slicing that loses part of a value. Derived-to-base slicing is the most well-known, and yes we tell people not to make base classes copyable, but (a) they do and (b) maybe it can even sometimes make sense to slice off part of a larger derived-type value.
As long as it's more visible and opt-in, I think that's fine... so perhaps if we have this:
// Hypothetical, but a bit inconsistent (see next paragraph)
x as narrow_to<std::uint8_t>
we could implement it as overloads of auto as<cpp2::narrow_to<T>>(x) -> T ... but note the expression's type should be T, which is different from all the other as cases where as<T> yields a T, and that's one reason at least for now I'd be happy to make this not be as at all but leave it as something uglier and more visibly different like GSL already has,
// I like this, and GSL has it already as "narrow_cast"
unsafe_narrow<std::uint8_t>(x)
Naming it differently also preserves "as is safe" as a stake in the ground.
Additional value constraints?
In some (many?) cases, the calling code actually has more information on the valid/expected value range than just "fits into a std::uint8_t"... there may be business logic limits/constraints, so perhaps we should allow optional min, max parameters that default to the full value range? Instead of writing an as preceded/followed by explicit tests for the business logic limits value range, perhaps allowing the programmer to express it in one shot would make it convenient and easy for programmers to do the right thing.
Example:
// let's say that x and y shouldn't only fit into a uint8_t,
// but x should be >=10 and y should be <=127
// without (min,max)
a := x as std::optional<uint8_t>; // fallible as
if a && a.value() >= 10 {
/*...*/
}
b := unsafe_narrow<uint8_t>(y); // lossy "as"
if b && b <= 127 {
/*...*/
}
// with (min,max)
a := x as std::optional<uint8_t>(10 /*, default max value*/ ); // fallible as
if a {
/*...*/
}
b := unsafe_narrow<uint8_t>(y,0,127); // lossy "as"
if b {
/*...*/
}
... and now the interface of unsafe_narrow converges with that of C++17 std::clamp which is interesting and makes a lot of sense to me, since narrowing really is (or heavily overlaps with) the concept of value clamping.
We already have a way to spell that, namely
x as optional<T>. Let's try that.
It should work fine with the current compiler just by adding overloads of
as<optional<T>>(x)incpp2util.h.
This is all about some fundamental types, right? What about UDTs? Rather than throwing constructors, some domains use named functions that report failure. In Cpp2, those could be spelled x as optional<T>.
But then I think about generic programming, and how std::optional isn't always enough to report failure.
This seems like it could create a generic programming pitfall by giving the as expression different meanings depending on the arguments.
Naming it differently also preserves "
asis safe" as a stake in the ground.
That can't work, right? is and as being overloadable by necessity, because it needs users to define semantics, means it can do anything within the constraints of the language and the feature themselves.
Spelling "lossy as"
Both this "lossy as" and the previous "fallible as" could be spelled x as fallible<T>, where the fallible overload returns a std::optional<T> (or a more appropriate failure type for program-defined specializations). Or, perhaps, make use of the support for "throwing values".
I'm not sure over-overloading as is a good idea, in face of the compile-times impact of operator<< and invoke_tag. Constructors are named after the type, which limits the overloads. @hsutter presented something like "all functions that return a T can be used to construct a T". So perhaps Cpp2 can also improve this use-case.
This seems like it could create a generic programming pitfall by giving the
asexpression different meanings depending on the arguments.
There are 2 ways the meaning could change. x as std::optional<T> is disallowed if T is not a fundamental type to which is given a special meaning, or rather than being disallowed, it has today's meaning. The former could result in the use of std::optional deep in a library to fail, while the latter would, in the same case, change the meaning from the intuitive std::optional<T>(x) to the special meaning.
A special tag like fallible could prevent this. It also sounds more like an error situation, making it more apparent that it is for the user to handle.
What would x is fallible<T> do?