containers icon indicating copy to clipboard operation
containers copied to clipboard

Settle on a consistent strictness for `fromList` and `fromAscList`

Open gereeter opened this issue 8 years ago • 8 comments

(Originally described in https://github.com/haskell/containers/pull/340#issuecomment-247811863)

In both Map and IntMap, the strict versions of fromList and fromAscList show inconsistent and somewhat nonsensical behaviour:

Function Result on [(0, ⊥), (0, 0)] Result on [(0, 0), (0, ⊥), (0, 0)]
fromList
fromAscList [(0, 0)]
Prelude Data.IntMap.Strict> fromList [(0, undefined), (0, 0)]
fromList *** Exception: Prelude.undefined
CallStack (from HasCallStack):
  error, called at libraries/base/GHC/Err.hs:79:14 in base:GHC.Err
  undefined, called at <interactive>:2:15 in interactive:Ghci1
Prelude Data.IntMap.Strict> fromAscList [(0, undefined), (0, 0)]
fromList [(0,0)]
Prelude Data.IntMap.Strict> fromList [(0, 0), (0, undefined), (0, 0)]
fromList *** Exception: Prelude.undefined
CallStack (from HasCallStack):
  error, called at libraries/base/GHC/Err.hs:79:14 in base:GHC.Err
  undefined, called at <interactive>:4:23 in interactive:Ghci2
Prelude Data.IntMap.Strict> fromAscList [(0, 0), (0, undefined), (0, 0)]
fromList *** Exception: Prelude.undefined
CallStack (from HasCallStack):
  error, called at libraries/base/GHC/Err.hs:79:14 in base:GHC.Err
  undefined, called at <interactive>:5:26 in interactive:Ghci2

Map and IntMap show the same behaviour as each other.

gereeter avatar Jan 03 '18 19:01 gereeter

The result of fromList is fine, and should stay as it is. We probably can't do anything other than that without a potentially large performance penalty.

I think we obviously want the two example arguments to fromAscList to give the same result as each other, but which one? The easiest thing is to force each element immediately, which would give it the same strictness as fromList. The alternative is to find the last element with each key before forcing. I don't have a very strong opinion about which way is most reasonable.

treeowl avatar Jan 03 '18 19:01 treeowl

I think it makes the most sense for fromAscList to force everything and match fromList. It is simple to describe and reason about and it allows replacing fromAscList with fromList for safety without having to worry about bottom values.

gereeter avatar Jan 03 '18 20:01 gereeter

I think it makes the most sense for fromAscList to force everything and match fromList. It is simple to describe and reason about and it allows replacing fromAscList with fromList for safety without having to worry about bottom values.

@treeowl, @int-e, what do you think about this idea?

sjakobi avatar Aug 04 '20 03:08 sjakobi

Yeah, sure.

treeowl avatar Aug 04 '20 04:08 treeowl

Note that fromAscList is implemented in terms of fromAscListWithKey,

-- abridged from Data.Map.Strict.Internal
fromAscList :: Eq k => [(k,a)] -> Map k a
fromAscList xs  = fromAscListWithKey (\_ x _ -> x) xs

fromAscListWithKey :: Eq k => (k -> a -> a -> a) -> [(k,a)] -> Map k a

and that exhibits the same discrepancy:

fromListWithKey    (\_ x _ -> x) [(0,undefined),(0,())] = _|_
fromAscListWithKey (\_ x _ -> x) [(0,undefined),(0,())] = fromList [(0,())]

We should not make it strict in all given values though because

fromListWithKey    (\_ _ x -> x) [(0,()),(0,undefined),(0,undefined)] = fromList [(0,())]
fromAscListWithKey (\_ _ x -> x) [(0,()),(0,undefined),(0,undefined)] = fromList [(0,())]

looks useful enough for people to rely on (I would rely on the former (at least for performance; imagine expensive computations instead of the bottoms), and if, by accident, the keys turn out to be sorted, I would switch to fromAscList and expect it to work).

On the other hand it's consistent with fromListWithKey to make fromAscListWithKey strict in the first associated value for each key. What's the cost of doing that? The easy approach (forcing the value in the combineEq helper) will force the value twice in the (common) case of a unique key, once here, and once when constructing the map in fromDistinctAscList.

(The last time I thought about this I gave up around here; this fromAscList[With[Key]] laziness corner case is odd, but pretty harmless, and fixing it (with benchmarking to assess the impact, and potentially trying several approaches) didn't seem worh the effort.)

int-e avatar Aug 04 '20 08:08 int-e

The test added in 9765edd in fact fails for containers 0.6.2.1 (distributed with GHC 8.10.4). It wasn't immediately obvious to me which commit "fixed" it.

infinity0 avatar Apr 17 '21 16:04 infinity0

@infinity0: I see only one failing test and that was added in b64a103e428cb57bfe1837ca5fcdaf2d5efa4f8a, testing for extra thunks left behind by Data.IntMap.Lazy.fromAscList (the test group says IntMap.Strict, that's a typo). #658 is the relevant pull request that fixed it. If you see more than one failing strictness test case, that may happen due to lack of optimization when compiling containers.

int-e avatar Apr 18 '21 00:04 int-e

@int-e Thanks for the explanation, yes I only see that one failing test also. :)

infinity0 avatar Apr 19 '21 19:04 infinity0