variant [Optimization] Don't generate bad_variant_access logic for exhaustive visit using overloaded

Using clang from trunk with -std=c++17 -Oz, the following code uses overloaded with an exhaustive set of lambdas to visit all cases of the variant:

https://godbolt.org/z/k-r72L

#include <https://raw.githubusercontent.com/mpark/variant/single-header/master/variant.hpp>
#include <functional>
#include <string>
#include <stdio.h>

namespace {
template <class... Ts>
struct overloaded : Ts... {
using Ts::operator()...;
};
template <class... Ts>
overloaded(Ts...) -> overloaded<Ts...>;
}

enum class Foo { kValue };
enum class Bar { kValue };
using Callback = std::function<std::string(void)>;

using SumType = mpark::variant<Foo, Bar, Callback>;

void DoStuff(SumType s) {
  mpark::visit(overloaded{
      [](Foo f) { printf("Foo!\n");},
      [](Bar b) { printf("Bar!\n");},
      [](const Callback& c) { printf("Callback: %s\n", c().c_str()); },
  },
  s);
}

Even though the lambdas exhaustively handle all cases, MPark.Variant still generates error handling logic to throw bad_variant_access exceptions (which as far as I can tell should never happen).

I think it'd be a nice optimization to drop the error handling entirely if the visitation is exhaustive.

Here's the current state of the optimized assembly in case this gets fixed:

DoStuff(mpark::variant<Foo, Bar, std::function<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > ()> >): # @DoStuff(mpark::variant<Foo, Bar, std::function<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > ()> >)
        push    r14
        push    rbx
        sub     rsp, 40
        mov     rbx, rdi
        cmp     byte ptr [rdi + 32], -1
        lea     rdi, [rsp + 7]
        sete    byte ptr [rdi]
        push    1
        pop     rsi
        call    mpark::detail::any(std::initializer_list<bool>)
        test    al, al
        jne     .LBB0_8
        movzx   ecx, byte ptr [rbx + 32]
        cmp     rcx, 255
        push    -1
        pop     rax
        cmovne  rax, rcx
        cmp     rax, 2
        je      .LBB0_6
        cmp     rax, 1
        je      .LBB0_5
        mov     edi, offset .Lstr
        jmp     .LBB0_4
.LBB0_5:
        mov     edi, offset .Lstr.4
.LBB0_4:
        call    puts
        jmp     .LBB0_7
.LBB0_6:
        lea     r14, [rsp + 8]
        mov     rdi, r14
        mov     rsi, rbx
        call    std::function<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > ()>::operator()() const
        mov     rsi, qword ptr [r14]
        mov     edi, offset .L.str.3
        xor     eax, eax
        call    printf
        mov     rdi, r14
        call    std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() [base object destructor]
.LBB0_7:
        add     rsp, 40
        pop     rbx
        pop     r14
        ret
.LBB0_8:
        call    mpark::throw_bad_variant_access()
mpark::detail::any(std::initializer_list<bool>): # @mpark::detail::any(std::initializer_list<bool>)
        xor     eax, eax
.LBB1_1:                                # =>This Inner Loop Header: Depth=1
        cmp     rsi, rax
        je      .LBB1_4
        cmp     byte ptr [rdi + rax], 0
        lea     rax, [rax + 1]
        je      .LBB1_1
        mov     al, 1
        ret
.LBB1_4:
        xor     eax, eax
        ret
mpark::throw_bad_variant_access():  # @mpark::throw_bad_variant_access()
        push    rax
        push    8
        pop     rdi
        call    __cxa_allocate_exception
        mov     qword ptr [rax], offset vtable for mpark::bad_variant_access+16
        mov     esi, offset typeinfo for mpark::bad_variant_access
        mov     edx, offset std::exception::~exception() [base object destructor]
        mov     rdi, rax
        call    __cxa_throw
mpark::bad_variant_access::~bad_variant_access() [deleting destructor]:      # @mpark::bad_variant_access::~bad_variant_access() [deleting destructor]
        push    rbx
        mov     rbx, rdi
        call    std::exception::~exception() [base object destructor]
        mov     rdi, rbx
        pop     rbx
        jmp     operator delete(void*)                  # TAILCALL
mpark::bad_variant_access::what() const:  # @mpark::bad_variant_access::what() const
        mov     eax, offset .L.str
        ret
std::function<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > ()>::operator()() const: # @std::function<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > ()>::operator()() const
        push    rbx
        cmp     qword ptr [rsi + 16], 0
        je      .LBB5_2
        mov     rbx, rdi
        call    qword ptr [rsi + 24]
        mov     rax, rbx
        pop     rbx
        ret
.LBB5_2:
        call    std::__throw_bad_function_call()
typeinfo name for mpark::bad_variant_access:
        .asciz  "N5mpark18bad_variant_accessE"

typeinfo for mpark::bad_variant_access:
        .quad   vtable for __cxxabiv1::__si_class_type_info+16
        .quad   typeinfo name for mpark::bad_variant_access
        .quad   typeinfo for std::exception

vtable for mpark::bad_variant_access:
        .quad   0
        .quad   typeinfo for mpark::bad_variant_access
        .quad   std::exception::~exception() [base object destructor]
        .quad   mpark::bad_variant_access::~bad_variant_access() [deleting destructor]
        .quad   mpark::bad_variant_access::what() const

.L.str:
        .asciz  "bad_variant_access"

.L.str.3:
        .asciz  "Callback: %s\n"

.Lstr:
        .asciz  "Foo!"

.Lstr.4:
        .asciz  "Bar!"

Feb 28 '20 17:02 bhamiltoncx

I just learned about valueless_by_exception:

https://en.cppreference.com/w/cpp/utility/variant/valueless_by_exception

so I assume this logic is to implicitly deal with that case — if so, we can close this issue.

Feb 28 '20 20:02 bhamiltoncx

If it is possible to detect that these two cases from the spec are noexcept for all the types in the variant (maybe using the noexcept() operator — this would only work for C++17, since the noexcept specification is not part of the function type until then), we could probably elide the bad_variant_access logic in this case.

(guaranteed) an exception is thrown during the move initialization of the contained value from the temporary in copy assignment
(guaranteed) an exception is thrown during the move initialization of the contained value during move assignment

Feb 28 '20 20:02 bhamiltoncx

I believe the problem lies in the fact that any type that has a conversion operator to one of the types the variant holds can throw an exception during the conversion, and since that's something that happens independently from the visitation, I can't see how visit would possibly detect that.

One such examples is given by cppreference itself:

struct S {
    operator int() { throw 42; }
};
std::variant<float, int> v{12.f}; // OK
v.emplace<1>(S()); // v may be valueless

Feb 16 '21 17:02 falemagn

Could we only do this optimization when exceptions are disabled?

On Tue, Feb 16, 2021, 10:30 Fabio [email protected] wrote:

I believe the problem lies in the fact that any type that has a conversion operator to one of the types the variant holds can throw an exception during the conversion, and since that's something that happens independently from the variant, I can't see how the variant would possibly detect that.

One such examples is given by cppreference https://en.cppreference.com/w/cpp/utility/variant/valueless_by_exception itself:

struct S { operator int() { throw 42; } }; std::variant<float, int> v{12.f}; // OK v.emplace<1>(S()); // v may be valueless

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/mpark/variant/issues/74#issuecomment-779996793, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAFEAU2YFUCSCZGKK5OPLV3S7KTTFANCNFSM4K5V4TEQ .

Feb 16 '21 19:02 bhamiltoncx

Could we only do this optimization when exceptions are disabled?

Even though they're disabled in the current compilation unit, they might not haven been disabled in the compilation unit that handed you the variant you want to visit, so I guess that's not a check that can be performed.

However, a recent Twitter discussion brought to my attention an optimization that has been performed in libstdc++: trivially copyable types aren't able to now put the variant in a valueless state. If mpark's variant did the same thing, then mpark's visit could effectively perform the optimization you suggested, but only for those variants that are made of only trivially copyable types.

Here's the discussion: https://twitter.com/BarryRevzin/status/1361826113543110657

Feb 17 '21 17:02 falemagn

I mean, if -fno-exceptions is set in the compilation unit which #includes variant.hpp, it's not possible for variant.hpp to throw the bad_variant_access() exception, right? But it still tries to do so:

https://godbolt.org/z/PfPdhn

Feb 18 '21 20:02 bhamiltoncx

I guess what I'm saying is, if I know in my particular environment that it's not possible for the variant to be valueless, I expect an exhaustive std::visit to be equivalent to a switch on an enum (possibly with a __builtin_unreached() or equivalent afterwards).

On Thu, Feb 18, 2021 at 1:13 PM Ben Hamilton [email protected] wrote:

I mean, if -fno-exceptions is set in the compilation unit which #includes variant.hpp, it's not possible for variant.hpp to throw the bad_variant_access() exception, right? But it still tries to do so:

https://godbolt.org/z/PfPdhn

Feb 18 '21 20:02 bhamiltoncx

Huh! I just noticed that somewhere between clang 8 and clang 9, the std::variant implementation got a whole lot more optimized, and it somehow elides all the logic for std::bad_variant_access, even if I don't pass -fno-exceptions:

https://godbolt.org/z/8bT6Ed

So, I guess we can just use std::variant and be happy now. :)

On Thu, Feb 18, 2021 at 1:26 PM Ben Hamilton [email protected] wrote:

I guess what I'm saying is, if I know in my particular environment that it's not possible for the variant to be valueless, I expect an exhaustive std::visit to be equivalent to a switch on an enum (possibly with a __builtin_unreached() or equivalent afterwards).

On Thu, Feb 18, 2021 at 1:13 PM Ben Hamilton [email protected] wrote:

I mean, if -fno-exceptions is set in the compilation unit which #includes variant.hpp, it's not possible for variant.hpp to throw the bad_variant_access() exception, right? But it still tries to do so:

https://godbolt.org/z/PfPdhn

Feb 18 '21 21:02 bhamiltoncx