mp11 Improve compilation speed by taking advantage of memoization

I put in bulk the benchmarks and tests that can be found in each commit (issue #77)

(result of /usr/bin/time --format='%Es - %MK' $compiler ...)

`mp_transform_if`

compiler	gcc-12	clang-15
before	0:00.13s - 60432K	0:00.16s - 100956K
after	0:00.11s - 47300K	0:00.16s - 100484K

template<class T> using p = mp_bool<T::value & 1>;
template<class T> using f = mp_size_t<T::value + 1>;
template<class I> using test = mp_transform_if<p, f, mp_iota<I>>;

using r1 = mp_transform<test, mp_iota_c<50>>;

`mp_filter`

compiler	gcc-12	clang-15
before	0:00.16s - 72460K	0:00.16s - 102068K
after	0:00.14s - 61884K	0:00.16s - 101828K

template<class T> using p = mp_bool<T::value & 1>;
template<class I> using test = mp_filter<p, mp_iota<I>>;

using r1 = mp_transform<test, mp_iota_c<50>>;

`mp_drop`

compiler	gcc-12	clang-15
before	0:00.48s - 192464K	0:00.48s - 154480K
after	0:00.43s - 172472K	0:00.46s - 150404K

template<class L> struct f { template<class I> using g = mp_drop<L, I>; };
template<class I, class L = mp_iota<I>> using test = mp_transform<f<L>::template g, L>;

using r1 = mp_transform<test, mp_iota_c<50>>;

`mp_sort`

compiler	gcc-12	clang-15
before	0:00.46s - 198848K	0:00.49s - 132632K
after	0:00.42s - 183384K	0:00.48s - 131748K

template<class I> using test = mp_sort<mp_iota<I>, mp_less>;

using r1 = mp_transform<test, mp_iota_c<25>>;

`mp_replace_at`

compiler	gcc-12	clang-15
before	0:00.34s - 134428K	0:00.33s - 124720K
after	0:00.29s - 116920K	0:00.32s - 122568K

template<class I> using f = mp_replace_at<mp_iota<I>, I, void>;
template<class I> using test = mp_transform<f, mp_iota<I>>;

using r1 = mp_transform<test, mp_iota_c<50>>;

`mp_power_set`

compiler	gcc-12	clang-15
before	0:01.35s - 538732K	0:00.65s - 203924K
after	0:01.02s - 396032K	0:00.65s - 203080K

template<class I> using test = mp_power_set<mp_iota<I>>;

using r1 = mp_transform<test, mp_iota_c<15>>;

`mp_map_find`

compiler	gcc-12	clang-15
before	0:00.29s - 128600K	0:00.25s - 120604K
after	0:00.22s - 99908K	0:00.23s - 116084K

template<class L, class M = mp_transform<mp_list, L>>
struct f { template<class I> using g = mp_map_find<M, I>; };

template<class I, class L = mp_iota<I>> using test
  = mp_transform<f<L>::template g, L>;

using r1 = mp_transform<test, mp_iota_c<50>>;

`mp_map_update`

compiler	gcc-12	clang-15
before	0:00.73s - 270224K	0:00.54s - 156112K
after	0:00.19s - 85004K	0:00.29s - 119232K

template<class L, class M = mp_transform<mp_list, L>>
struct f { template<class I> using g = mp_map_update<M, mp_list<I>, mp_list>; };

template<class I, class L = mp_iota<I>> using test
  = mp_transform<f<L>::template g, L>;

using r1 = mp_transform<test, mp_iota_c<20>>;

`mp_map_erase`

compiler	gcc-12	clang-15
before	0:00.52s - 197128K	0:00.44s - 140780K
after	0:00.47s - 183968K	0:00.43s - 139336K

template<class L, class M = mp_transform<mp_list, L>>
struct f { template<class I> using g = mp_map_erase<M, I>; };

template<class I, class L = mp_iota<I>> using test
  = mp_transform<f<L>::template g, L>;

using r1 = mp_transform<test, mp_iota_c<30>>;

`mp_map_replace`

compiler	gcc-12	clang-15
before	0:00.46s - 187160K	0:00.38s - 129104K
after	0:00.21s - 80236K	0:00.33s - 123644K

template<class L, class M = mp_transform<mp_list, L>>
struct f { template<class I> using g = mp_map_replace<M, mp_list<I, I>>; };

template<class I, class L = mp_iota<I>> using test
  = mp_transform<f<L>::template g, L>;

using r1 = mp_transform<test, mp_iota_c<30>>;

Jun 20 '23 02:06 jonathanpoelen

@pdimov would you be able to take a look, please?

Nov 01 '25 22:11 jonathanpoelen

If you use fn as the name of the inner alias, instead of _f et al, these helpers will become quoted metafunctions and you'll be able to use the _q variants of the algorithms instead of spelling the ::template _f.

Nov 02 '25 00:11 pdimov

This looks good to me. I can only wonder why I failed to look at it two years ago when you submitted it originally; it probably came in an inopportune time when I was too busy.

I see a couple more things. It looks like you started to also optimize mp_map_replace_impl by moving the msvc workaround into an ifdef, but then didn't extract _f into a helper template.

Also, in the two cases where the _impl class is reduced to just having a single using type = ... (mp_transform_if_impl and mp_map_erase_impl), you can just remove the _impl class altogether and move the ... part into the primary alias.

The 1.90 beta cutoff is the day after tomorrow, so I won't be able to merge this now, but to make sure I don't drop the ball again (sorry), please remind me after 1.90 is released.

And thank you for this work.

Nov 03 '25 09:11 pdimov

I see a couple more things. It looks like you started to also optimize mp_map_replace_impl by moving the msvc workaround into an ifdef, but then didn't extract _f into a helper template.

I went back to msvc because it wasn't working: boostorg/mp11@87719f69d2eee3ae1e4dd686c06507f77b14fead. However, after two years, the logs are no longer available.

Also, in the two cases where the _impl class is reduced to just having a single using type = ... (mp_transform_if_impl and mp_map_erase_impl), you can just remove the _impl class altogether and move the ... part into the primary alias.

Done.

The 1.90 beta cutoff is the day after tomorrow, so I won't be able to merge this now, but to make sure I don't drop the ball again (sorry), please remind me after 1.90 is released.

I will try to remember :)

Nov 08 '25 18:11 jonathanpoelen

I took the opportunity to go over what I had missed and use a gcc builtin for make_integer_sequence.

`mp_nth_element`

compiler	gcc-15	clang-20
before	0:00.66s - 206316K	0:01.77s - 246832K
after	0:00.63s - 192944K	0:01.76s - 245956K

template<class I> using test
  = mp_nth_element<mp_iota_c<I::value+1>, I, mp_less>;

using r1 = mp_transform<test, mp_iota_c<30>>;

`mp_map_find_impl` workaround (GCC)

compiler	gcc-15
before	0:01.65s - 381364K
after	0:00.99s - 213064K

template<class L, class M = mp_transform<mp_list, L>>
struct f { template<class I> using g = mp_map_find<M, I>; };

template<class I, class L = mp_iota<I>> using test
  = mp_transform<f<L>::template g, L>;

using r1 = mp_transform<test, mp_iota_c<50>>;

`mp_transform` with 5 lists or more

compiler	gcc-15	clang-20
before	0:00.21s - 80328K	0:00.29s - 117756K
after	0:00.20s - 77408K	0:00.29s - 117468K

template<class I, class L = mp_iota<I>> using test
  = mp_transform<mp_list, L, L, L, L, L, L, L, L>;

using r1 = mp_transform<test, mp_iota_c<50>>;

`mp_fill`

compiler	gcc-15	clang-20
before	0:00.15s - 60740K	0:00.14s - 94144K
after	0:00.10s - 43504K	0:00.14s - 94188K

template<class I, class L = mp_iota<I>> using test
  = mp_fill<L, int>;

using r1 = mp_transform<test, mp_iota_c<100>>;

`mp_replace_if`

compiler	gcc-15	clang-20
before	0:00.23s - 76088K	0:00.37s - 116984K
after	0:00.22s - 75348K	0:00.35s - 114952K

template<class> using pred = mp_false;
template<class I, class L = mp_iota<I>> using test
  = mp_replace_if<L, pred, I>;

using r1 = mp_transform<test, mp_iota_c<120>>;

`mp_replace`

compiler	gcc-15	clang-20
before	0:00.30s - 90020K	0:00.48s - 129272K
after	0:00.29s - 89428K	0:00.46s - 126580K

template<class I, class L = mp_iota<I>> using test
  = mp_replace<L, I, I>;

using r1 = mp_transform<test, mp_iota_c<120>>;

`mp_remove`

compiler	gcc-15	clang-20
before	0:00.32s - 91276K	0:00.69s - 156512K
after	0:00.31s - 90684K	0:00.67s - 157560K

template<class I, class L = mp_iota<I>> using test
  = mp_remove<L, I>;

using r1 = mp_transform<test, mp_iota_c<120>>;

`mp_sliding_fold`

compiler	gcc-15	clang-20
before	0:00.23s - 79084K	0:00.48s - 135100K
after	0:00.22s - 77748K	0:00.46s - 132436K

template<class I, class L = mp_iota<I>> using test
  = mp_list<
      mp_sliding_fold<L, mp_int<2>, mp_plus>,
      mp_sliding_fold<L, mp_int<2>, mp_max>
    >;

using r1 = mp_transform<test, mp_iota_c<50>>;

`mp_copy_if`

compiler	gcc-15	clang-20
before	0:00.21s - 65820K	0:00.51s - 138048K
after	0:00.21s - 65428K	0:00.49s - 138540K

template<class> using pred = mp_false;
template<class I, class L = mp_iota<I>> using test
  = mp_copy_if<L, pred>;

using r1 = mp_transform<test, mp_iota_c<120>>;

`mp_remove_if`

compiler	gcc-15	clang-20
before	0:00.21s - 67308K	0:00.51s - 137744K
after	0:00.21s - 66956K	0:00.49s - 138492K

template<class> using pred = mp_true;
template<class I, class L = mp_iota<I>> using test
  = mp_remove_if<L, pred>;

using r1 = mp_transform<test, mp_iota_c<120>>;

`mp_map_replace`

compiler	gcc-15	clang-20
before	0:00.39s - 114524K	0:00.66s - 155888K
after	0:00.38s - 114140K	0:00.64s - 155184K

template<class I, class L = mp_transform<mp_list, mp_iota<I>>> using test
  = mp_map_replace<L, mp_list<I,I>>;

using r1 = mp_transform<test, mp_iota_c<120>>;

`integer_sequence` with gcc

Use the __integer_pack builtin

compiler	gcc-15
before	0:00.14s - 47284K
after	0:00.07s - 37572K

template<class I> using test
  = make_integer_sequence<int, I::value>;

using r1 = mp_transform<test, mp_iota_c<200>>;

`integer_sequence` when no builtin is available

The probability that this code will be used is very low since msvc and clang-based compilers use __make_integer_seq and gcc uses __integer_pack. But it is shorter and eliminates the concatenation phase.

compiler	gcc-15	clang-20
before	0:00.14s - 47220K	0:00.23s - 107632K
after	0:00.10s - 40944K	0:00.15s - 95144K

template<class I> using test
  = make_integer_sequence<int, I::value>;

using r1 = mp_transform<test, mp_iota_c<200>>;

Nov 09 '25 18:11 jonathanpoelen

@pdimov A quick reminder for this PR.

Jan 06 '26 23:01 jonathanpoelen

Improve compilation speed by taking advantage of memoization

mp_transform_if

mp_filter

mp_drop

mp_sort

mp_replace_at

mp_power_set

mp_map_find

mp_map_update

mp_map_erase

mp_map_replace

mp_nth_element

mp_map_find_impl workaround (GCC)

mp_transform with 5 lists or more

mp_fill

mp_replace_if

mp_replace

mp_remove

mp_sliding_fold

mp_copy_if

mp_remove_if

mp_map_replace

integer_sequence with gcc

integer_sequence when no builtin is available

`mp_transform_if`

`mp_filter`

`mp_drop`

`mp_sort`

`mp_replace_at`

`mp_power_set`

`mp_map_find`

`mp_map_update`

`mp_map_erase`

`mp_map_replace`

`mp_nth_element`

`mp_map_find_impl` workaround (GCC)

`mp_transform` with 5 lists or more

`mp_fill`

`mp_replace_if`

`mp_replace`

`mp_remove`

`mp_sliding_fold`

`mp_copy_if`

`mp_remove_if`

`mp_map_replace`

`integer_sequence` with gcc

`integer_sequence` when no builtin is available