bytestring icon indicating copy to clipboard operation
bytestring copied to clipboard

Revisit efficiency of `Integral` conversions in `stimes` implementation for `StrictByteString`

Open sjakobi opened this issue 3 years ago • 3 comments

The goal is to avoid allocating intermediate Integers when using stimes @Int, @Word or with other primitive types.

There's some ongoing work in GHC that should help us: https://gitlab.haskell.org/ghc/ghc/-/issues/20361. Once it is merged, we should check whether the goal has been achieved.

It may be useful to replace checkedIntegerToInt with Data.Bits.toIntegralSized then.

https://github.com/haskell/bytestring/blob/12563781f257f132f4da866047abe720623f9990/Data/ByteString/Internal.hs#L737-L787

https://github.com/haskell/bytestring/blob/12563781f257f132f4da866047abe720623f9990/Data/ByteString/Internal.hs#L868-L879

sjakobi avatar Mar 01 '22 00:03 sjakobi

Thank you. I think the part of the GHC Integer simplification changes that we need to avoid intermediate Integers in stimes @StrictByteString @Int without any rewrite rules was already merged via this upstream MR. It doesn't look like Word is quite so lucky just yet.

clyring avatar Mar 01 '22 01:03 clyring

As expected, with Int, the Core GHC-9.4.1-alpha1 generates looks great.

-- RHS size: {terms: 20, types: 9, coercions: 0, joins: 0/0}
testSpecInt :: Int -> S.ByteString -> S.ByteString
[GblId,
 Arity=2,
 Str=<1!P(SL)><1!P(L,L,L)>,
 Unf=Unf{Src=InlineStable, TopLvl=True, Value=True, ConLike=True,
         WorkFree=True, Expandable=True,
         Guidance=ALWAYS_IF(arity=2,unsat_ok=True,boring_ok=False)
         Tmpl= \ (nRaw_i21nd [Occ=Once1!] :: Int)
                 (bs_i21nf [Occ=Once1!] :: S.ByteString) ->
                 case bs_i21nf of
                 { Data.ByteString.Internal.BS ipv_i21nh [Occ=Once1]
                                               ipv1_i21ni [Occ=Once1] ipv2_i21nj [Occ=Once1] ->
                 case nRaw_i21nd of { GHC.Types.I# v1_B2 ->
                 case GHC.Prim.>=# v1_B2 0# of {
                   __DEFAULT -> Data.ByteString.Internal.stimesNegativeErr;
                   1# ->
                     Data.ByteString.Internal.$wstimesNonNegativeInt
                       v1_B2 ipv_i21nh ipv1_i21ni ipv2_i21nj
                 }
                 }
                 }}]
testSpecInt
  = \ (nRaw_i21nd :: Int) (bs_i21nf :: S.ByteString) ->
      case bs_i21nf of
      { Data.ByteString.Internal.BS ipv_i21nh ipv1_i21ni ipv2_i21nj ->
      case nRaw_i21nd of { GHC.Types.I# v1_B2 ->
      case GHC.Prim.>=# v1_B2 0# of {
        __DEFAULT -> Data.ByteString.Internal.stimesNegativeErr;
        1# ->
          Data.ByteString.Internal.$wstimesNonNegativeInt
            v1_B2 ipv_i21nh ipv1_i21ni ipv2_i21nj
      }
      }
      }

-- RHS size: {terms: 9, types: 5, coercions: 0, joins: 0/0}
testKnownInt :: S.ByteString -> S.ByteString
[GblId,
 Arity=1,
 Str=<1!P(L,L,L)>,
 Unf=Unf{Src=InlineStable, TopLvl=True, Value=True, ConLike=True,
         WorkFree=True, Expandable=True,
         Guidance=ALWAYS_IF(arity=1,unsat_ok=True,boring_ok=False)
         Tmpl= \ (bs_i21nf [Occ=Once1!] :: S.ByteString) ->
                 case bs_i21nf of
                 { Data.ByteString.Internal.BS ipv_i21nh [Occ=Once1]
                                               ipv1_i21ni [Occ=Once1] ipv2_i21nj [Occ=Once1] ->
                 Data.ByteString.Internal.$wstimesNonNegativeInt
                   123456# ipv_i21nh ipv1_i21ni ipv2_i21nj
                 }}]
testKnownInt
  = \ (bs_i21nf :: S.ByteString) ->
      case bs_i21nf of
      { Data.ByteString.Internal.BS ipv_i21nh ipv1_i21ni ipv2_i21nj ->
      Data.ByteString.Internal.$wstimesNonNegativeInt
        123456# ipv_i21nh ipv1_i21ni ipv2_i21nj
      }

For other types, things are not so great. Using Word without eta-expanding looks particularly ugly:

-- RHS size: {terms: 73, types: 27, coercions: 0, joins: 1/3}
testSpecWord :: Word -> S.ByteString -> S.ByteString
[GblId,
 Arity=1,
 Str=<ML>,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
         WorkFree=True, Expandable=True, Guidance=IF_ARGS [20] 339 60}]
testSpecWord
  = \ (nRaw_i21nd :: Word) ->
      let {
        n_s21nP :: Integer
        [LclId]
        n_s21nP
          = case nRaw_i21nd of { GHC.Types.W# x#_a4Jg ->
            GHC.Num.Integer.integerFromWord# x#_a4Jg
            } } in
      let {
        lvl_s21o5 :: Int
        [LclId]
        lvl_s21o5
          = case GHC.Num.Integer.integerToInt# n_s21nP of v_B2 { __DEFAULT ->
            GHC.Types.I# v_B2
            } } in
      \ (bs_i21nf :: S.ByteString) ->
        case bs_i21nf of
        { Data.ByteString.Internal.BS ipv_i21nh ipv1_i21ni ipv2_i21nj ->
        case lvl_s21o5 of { GHC.Types.I# v1_B2 ->
        join {
          $j_s21nO [Dmd=ML] :: S.ByteString
          [LclId[JoinId(0)(Nothing)]]
          $j_s21nO
            = case n_s21nP of {
                GHC.Num.Integer.IS x1_i21nq ->
                  case GHC.Prim.<# x1_i21nq 0# of {
                    __DEFAULT ->
                      case ipv2_i21nj of {
                        __DEFAULT -> Data.ByteString.Internal.stimesOverflowErr;
                        0# -> S.empty
                      };
                    1# -> Data.ByteString.Internal.stimesNegativeErr
                  };
                GHC.Num.Integer.IP x1_i21nt ->
                  case ipv2_i21nj of {
                    __DEFAULT -> Data.ByteString.Internal.stimesOverflowErr;
                    0# -> S.empty
                  };
                GHC.Num.Integer.IN x1_i21nH ->
                  Data.ByteString.Internal.stimesNegativeErr
              } } in
        case n_s21nP of {
          GHC.Num.Integer.IS x1_i21nw ->
            case GHC.Prim.==# x1_i21nw v1_B2 of {
              __DEFAULT -> jump $j_s21nO;
              1# ->
                case GHC.Prim.>=# v1_B2 0# of {
                  __DEFAULT -> Data.ByteString.Internal.stimesNegativeErr;
                  1# ->
                    Data.ByteString.Internal.$wstimesNonNegativeInt
                      v1_B2 ipv_i21nh ipv1_i21ni ipv2_i21nj
                }
            };
          GHC.Num.Integer.IP x1_i21nC -> jump $j_s21nO;
          GHC.Num.Integer.IN x1_i21nE -> jump $j_s21nO
        }
        }
        }

-- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
StimesSpecTest.testKnownWord1 :: Word
[GblId,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
         WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
StimesSpecTest.testKnownWord1 = GHC.Types.W# 456789##

-- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
testKnownWord :: S.ByteString -> S.ByteString
[GblId,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Value=False, ConLike=False,
         WorkFree=False, Expandable=False, Guidance=IF_ARGS [] 20 0}]
testKnownWord = testSpecWord StimesSpecTest.testKnownWord1

There are several problems:

  • The function actually has arity one, because it tries to share the intermediate Integer between calls. I don't think I noticed this when reducing the syntactic arity of stimesPolymorphic while working on #443, which is my bad. Although this can reduce allocations for something like map (stimes k) li, I think the trade-off is clearly unfavorable here:
    • When the function is called with arity two, allocations are actually increased because the closure has to be allocated.
    • This effectively prevents the Worker/Wrapper transformation from being applicable.
    • If the conversion to Integer actually is worth sharing for some use-case, a user can easily do so themselves. (Or, even better, convert to Int.)
  • GHC generates IN branches even though integerFromWord# cannot return any such value.
  • The Int# equality check in the IS branch is not optimized away, probably because integerToInt# is still opaque.
  • The join point $j_s21nO should really be inlined at its use sites.
  • The function body is large enough that it isn't inlined into testKnownWord, which prevents constant-folding from reducing the latter to something more like testKnownInt.

The first bullet should be fixed. Most of the other bullets are relatively easy for GHC to improve at, but would require ugly workarounds at the library level, so I think they warrant no action here.

clyring avatar May 18 '22 00:05 clyring

Thanks for investigating, @clyring! :)

I don't fully understand the tradeoffs of the arity issue, but I'd welcome a PR!

sjakobi avatar May 19 '22 16:05 sjakobi