Add the stream generation functions as well to the Data.Stream module
We did not include these earlier thinking that we would directly use unfolds. But on second thoughts I am thinking that we should only recommend using the stream API in most common cases and reach for unfolds only when nesting performance is desired. That will keep things simpler, and require minimal use of unfolds. See the unfold documentation updates in https://github.com/composewell/streamly/pull/1755 .
We can possibly implement all of these just in terms of unfolds i.e. x = Stream.unfold Unfold.x.
Possibly, we should prefer using direct implementations instead of unfolds until we fix the potential perf issues. Here is the perf comparison with (using the original stream implementations) and without (using unfolds in the new Data.Stream module) the use-prelude flag:
$ bin/bench-runner --targets Data.Stream --cabal-build-options "--flag use-prelude"
$ bin/bench-runner --targets Data.Stream --append
Data.Stream(cpuTime)
Benchmark default(0)(μs) default(1) - default(0)(%)
-------------------------------------------------------------------------- -------------- --------------------------
All.Data.Stream/o-1-space.mixedX4.filter-scan 80.29 +10022.91
All.Data.Stream/o-1-space.hoisting.generally 47.20 +9054.00
All.Data.Stream/o-1-space.elimination.length . generally 47.31 +9024.70
All.Data.Stream/o-1-space.mixedX4.scan-map 119.74 +6995.77
All.Data.Stream/o-1-space.mixedX4.take-scan 204.95 +5333.28
All.Data.Stream/o-1-space.mixedX4.drop-scan 431.84 +2120.44
All.Data.Stream/o-1-space.exceptions/serial.retryUnknown 1236.86 +177.85
All.Data.Stream/o-1-space.exceptions/serial.retryNoneSimple 1708.44 +126.99
All.Data.Stream/o-1-space.multi-stream-pure.eqBy 99.50 +126.57
All.Data.Stream/o-1-space.multi-stream-pure./= 99.47 +125.08
All.Data.Stream/o-1-space.multi-stream-pure.== 99.87 +124.30
All.Data.Stream/o-1-space.multi-stream-pure.< 111.84 +119.52
All.Data.Stream/o-1-space.multi-stream-pure.cmpBy 112.03 +106.80
All.Data.Stream/o-1-space.concat.concatMapRepl (sqrt n of sqrt n) 858.13 +72.29
All.Data.Stream/o-n-heap.buffered.reverse' 171.85 +65.65
All.Data.Stream/o-1-space.concat.concatMapWith (sqrt n of sqrt n) 3372.38 +63.86
All.Data.Stream/o-1-space.concat.concatMapWith (1 of n) 6680.00 +61.15
All.Data.Stream/o-1-space.mapping.foldrS 3130.24 +59.84
All.Data.Stream/o-1-space.mapping.foldrSMap 3218.47 +56.54
All.Data.Stream/o-n-heap.buffered.joinLeftMap 30347.50 +52.22
All.Data.Stream/o-1-space.mapping.foldrT 3844.15 +47.65
All.Data.Stream/o-n-heap.buffered.joinInnerMap 34305.20 +46.50
All.Data.Stream/o-1-space.concat.concatMapM (n of 1) 2222.53 +43.81
All.Data.Stream/o-1-space.concat.concatMap (n of 1) 2260.67 +42.65
All.Data.Stream/o-1-space.mapping.foldrTMap 4039.91 +42.60
All.Data.Stream/o-1-space.concat.concatMapWith (n of 1) 13404.10 +42.53
All.Data.Stream/o-1-space.elimination.uncons 1019.36 +37.51
All.Data.Stream/o-1-space.exceptions/serial.retryNone 1595.59 +36.48
All.Data.Stream/o-1-space.concat.concatMapPure (n of 1) 3379.96 +33.86
All.Data.Stream/o-1-space.concat.concatMap (1 of n) 1491.40 +33.01
All.Data.Stream/o-1-space.foldable.min (ord) 1063.01 +27.77
All.Data.Stream/o-n-space.concat.concatPairWithSerial (2 of n/2) 5024.80 +25.99
All.Data.Stream/o-1-space.concat.concatMapPure (sqrt n of sqrt n) 1148.65 +25.18
All.Data.Stream/o-1-space.elimination.build.Identity.foldrMToListLength 844.64 +23.69
All.Data.Stream/o-1-space.concat.concatMapM (sqrt n of sqrt n) 820.72 +23.07
All.Data.Stream/o-1-space.insertingX4.intersperse 3511.85 +22.93
All.Data.Stream/o-n-space.concat.concatPairWithSerial (sqrtVal of sqrtVal) 8048.72 +22.58
All.Data.Stream/o-1-space.foldable.or 47.43 +20.98
All.Data.Stream/o-n-space.Monad.(>>) (n times) 59757.10 +20.63
All.Data.Stream/o-1-space.joining.serial (2,x/2) 2862.17 +18.73
All.Data.Stream/o-n-space.foldl.maximumBy 16309.40 +18.41
All.Data.Stream/o-1-space.concat.concatMap (sqrt n of sqrt n) 787.69 +17.31
All.Data.Stream/o-1-space.filteringX4.takeWhileM-true 48.23 +15.97
All.Data.Stream/o-n-space.Applicative.liftA2 (n times) 73481.90 +15.36
All.Data.Stream/o-1-space.multi-stream.eqBy 99.41 +12.72
All.Data.Stream/o-n-space.foldr.foldrM/reduce/Identity (sum) 1439.85 +12.40
All.Data.Stream/o-1-space.concat.concatMapPure (1 of n) 2421.80 +12.25
All.Data.Stream/o-1-space.elimination.build.Identity.foldrToStreamLength 1003.02 +10.58
All.Data.Stream/o-1-space.Monad.(>>=) (sqrt n x sqrt n) (filterSome) 5506.83 +10.40
All.Data.Stream/o-1-space.concat.concatMapM (1 of n) 1708.86 +10.36
All.Data.Stream/o-1-space.joining.serial (2,2,x/4) 4153.33 +10.23
All.Data.Stream/o-n-space.concat.concatPairWithSerial (n of 1) 55187.80 +8.84
Data.Stream(Allocated)
Benchmark default(0)(KiB) default(1) - default(0)(%)
-------------------------------------------------------------------------- --------------- --------------------------
All.Data.Stream/o-1-space.exceptions/serial.retryUnknown 7809.37 +179.93
All.Data.Stream/o-1-space.exceptions/serial.retryNoneSimple 11710.86 +120.03
All.Data.Stream/o-1-space.concat.concatMapRepl (sqrt n of sqrt n) 5485.85 +71.36
All.Data.Stream/o-1-space.mapping.foldrS 16377.54 +47.57
All.Data.Stream/o-n-heap.buffered.joinLeftMap 62411.39 +43.44
All.Data.Stream/o-n-heap.buffered.joinInnerMap 63440.14 +42.43
All.Data.Stream/o-1-space.mapping.foldrSMap 18742.49 +41.50
All.Data.Stream/o-1-space.exceptions/serial.retryNone 9371.65 +41.49
All.Data.Stream/o-1-space.concat.concatMapM (sqrt n of sqrt n) 3925.56 +39.94
All.Data.Stream/o-1-space.concat.concatMap (sqrt n of sqrt n) 3925.56 +39.94
All.Data.Stream/o-1-space.concat.concatMap (1 of n) 7809.39 +39.87
All.Data.Stream/o-1-space.concat.concatMapM (1 of n) 7809.39 +39.86
All.Data.Stream/o-n-heap.buffered.reverse 3888.73 +39.28
All.Data.Stream/o-1-space.concat.concatMapWith (1 of n) 48387.72 +32.22
All.Data.Stream/o-1-space.concat.concatMapWith (sqrt n of sqrt n) 24466.23 +32.01
All.Data.Stream/o-1-space.mapping.foldrT 28098.12 +30.53
All.Data.Stream/o-1-space.mapping.foldrTMap 30420.11 +28.31
All.Data.Stream/o-1-space.foldable.min (ord) 6239.98 +25.15
All.Data.Stream/o-1-space.concat.concatMapWith (n of 1) 89061.44 +22.65
All.Data.Stream/o-1-space.insertingX4.intersperse 25753.97 +21.25
All.Data.Stream/o-1-space.concat.concatMapPure (sqrt n of sqrt n) 7858.93 +20.02
All.Data.Stream/o-1-space.elimination.uncons 7809.38 +20.00
All.Data.Stream/o-1-space.concat.concatMapPure (1 of n) 15618.76 +20.00
All.Data.Stream/o-n-heap.buffered.intersectBy (sqrtVal) 39.77 +19.67
All.Data.Stream/o-n-space.concat.concatPairWithSerial (2 of n/2) 35519.40 +18.66
All.Data.Stream/o-1-space.concat.concatMapM (n of 1) 9371.25 +16.55
All.Data.Stream/o-1-space.concat.concatMap (n of 1) 9371.25 +16.55
All.Data.Stream/o-n-space.foldl.minimumBy 13340.69 +11.89
All.Data.Stream/o-n-space.foldl.maximumBy 13382.39 +11.36
All.Data.Stream/o-n-space.concat.concatPairWithSerial (sqrtVal of sqrtVal) 77839.88 +9.49
All.Data.Stream/o-n-heap.buffered.joinInner (sqrtVal) 18853.88 +8.33
All.Data.Stream/o-1-space.concat.concatMapPure (n of 1) 18726.62 +8.17
All.Data.Stream/o-1-space.joining.serial (2,x/2) 17573.03 +6.62
All.Data.Stream/o-n-space.concat.concatPairWithSerial (n of 1) 323916.44 +5.97
Need to do this comparison again after fixing this issue.
I added the stream generation functions instead of using unfolds, but we still have the regressions:
Data.Stream(cpuTime)
Benchmark default(0)(μs) default(1) - default(0)(%)
-------------------------------------------------------------------------- -------------- --------------------------
All.Data.Stream/o-1-space.mixedX4.filter-scan 80.43 +9984.80
All.Data.Stream/o-1-space.hoisting.generally 49.82 +8250.66
All.Data.Stream/o-1-space.elimination.length . generally 49.83 +8137.05
All.Data.Stream/o-1-space.mixedX4.scan-map 119.94 +6813.65
All.Data.Stream/o-1-space.mixedX4.take-scan 204.99 +4993.00
All.Data.Stream/o-1-space.mixedX4.drop-scan 471.85 +1963.71
All.Data.Stream/o-1-space.mapping.foldrS 3091.60 +54.96
All.Data.Stream/o-1-space.concat.concatMapWith (sqrt n of sqrt n) 3326.12 +51.09
All.Data.Stream/o-1-space.concat.concatMapWith (1 of n) 6846.97 +50.36
All.Data.Stream/o-n-heap.buffered.joinLeftMap 30943.30 +50.26
All.Data.Stream/o-1-space.mapping.foldrTMap 3880.27 +47.73
All.Data.Stream/o-n-heap.buffered.joinInnerMap 35123.10 +46.59
All.Data.Stream/o-1-space.mapping.foldrSMap 3264.14 +44.18
All.Data.Stream/o-1-space.mapping.foldrT 4016.93 +38.90
All.Data.Stream/o-n-heap.buffered.reverse' 169.36 +35.30
All.Data.Stream/o-1-space.concat.concatMapWith (n of 1) 13319.60 +34.10
All.Data.Stream/o-n-heap.transformer.StateT Int IO (n times) (baseline) 8897.75 +28.84
All.Data.Stream/o-n-space.concat.concatPairWithSerial (n of 1) 58082.70 +22.69
All.Data.Stream/o-n-space.concat.concatPairWithSerial (2 of n/2) 5089.46 +17.36
All.Data.Stream/o-n-space.concat.concatPairWithSerial (sqrtVal of sqrtVal) 7929.69 +15.12
All.Data.Stream/o-1-space.filteringX4.uniq 373.69 +13.94
All.Data.Stream/o-1-space.elimination.length . IsList.toList 74.50 +13.74
All.Data.Stream/o-1-space.Monad.(>>) (sqrt n x sqrt n) 40.28 +9.58
All.Data.Stream/o-1-space.concat.concatMap (sqrt n of sqrt n) 784.13 +8.40