Expand support for map streams

Open Kixiron opened this issue 3 years ago • 9 comments

Things like .recursive() don't support OrdIndexedZSet, which makes working with maps a lot harder.

Kixiron avatar Jul 11 '22 17:07 Kixiron

Joins also don't support producing maps.

Kixiron avatar Jul 11 '22 18:07 Kixiron

I was just thinking about the same. At the moment you have to use index or index_with to create an IndexedZSet. Changing the API this way is easy. The only drawback is that even if the output is a Z-set (non-indexed), the join function will have to return a 2-tuple with the unit value () as the value.
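A minimal sketch of that trade-off, using hypothetical stand-in types rather than the real DBSP API: `IndexedZSet`, `ZSet`, and `join_index` below are illustrative names built on `BTreeMap`, and a plain Z-set is modelled as an indexed Z-set whose value type is `()`. The join closure always returns a `(key, value)` pair, so a caller who wants a non-indexed result has to return `()` as the value.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-ins, not the DBSP types: an indexed Z-set maps
// key -> value -> weight, and a plain Z-set is the same thing with `()` values.
type IndexedZSet<K, V> = BTreeMap<K, BTreeMap<V, i64>>;
type ZSet<K> = IndexedZSet<K, ()>;

// Join two indexed Z-sets on their key. The closure must always return a
// (key, value) pair, so the output is itself an indexed Z-set.
// Weights of matching pairs are multiplied; zero weights are not pruned here.
fn join_index<K, V1, V2, OK, OV, F>(
    left: &IndexedZSet<K, V1>,
    right: &IndexedZSet<K, V2>,
    f: F,
) -> IndexedZSet<OK, OV>
where
    K: Ord,
    OK: Ord,
    OV: Ord,
    F: Fn(&K, &V1, &V2) -> (OK, OV),
{
    let mut out: IndexedZSet<OK, OV> = BTreeMap::new();
    for (k, left_vals) in left {
        if let Some(right_vals) = right.get(k) {
            for (v1, w1) in left_vals {
                for (v2, w2) in right_vals {
                    let (out_key, out_val) = f(k, v1, v2);
                    *out.entry(out_key).or_default().entry(out_val).or_default() += w1 * w2;
                }
            }
        }
    }
    out
}

// Producing a plain Z-set: the whole tuple becomes the key and the value is `()`.
fn join_to_set(
    left: &IndexedZSet<u32, &'static str>,
    right: &IndexedZSet<u32, &'static str>,
) -> ZSet<(u32, &'static str, &'static str)> {
    join_index(left, right, |k, l, r| ((*k, *l, *r), ()))
}
```

In this model the same operator serves both cases, at the cost of the `((..), ())` noise in every set-producing closure.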

ryzhyk avatar Jul 11 '22 18:07 ryzhyk

For recursive the extra complication is that distinct is only defined for Z-sets and not for indexed Z-sets.

ryzhyk avatar Jul 11 '22 18:07 ryzhyk

If you want groups to be sets - which is a reasonable choice - you could define distinct as mapping the standard distinct over all groups.
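One way to read this suggestion, sketched with hypothetical `BTreeMap`-based stand-ins (`ZSet`, `IndexedZSet`, `distinct`, and `distinct_indexed` are illustrative names, not the DBSP API): apply the ordinary Z-set distinct to each group and drop groups that end up empty.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-ins, not the DBSP types.
type ZSet<V> = BTreeMap<V, i64>;
type IndexedZSet<K, V> = BTreeMap<K, ZSet<V>>;

// Standard distinct on a Z-set: every element with positive weight is kept
// with weight 1; everything else is dropped.
fn distinct<V: Ord + Clone>(zset: &ZSet<V>) -> ZSet<V> {
    zset.iter()
        .filter(|(_, w)| **w > 0)
        .map(|(v, _)| (v.clone(), 1))
        .collect()
}

// Distinct on an indexed Z-set, defined by mapping `distinct` over the groups.
// Groups that become empty are removed from the result.
fn distinct_indexed<K, V>(indexed: &IndexedZSet<K, V>) -> IndexedZSet<K, V>
where
    K: Ord + Clone,
    V: Ord + Clone,
{
    indexed
        .iter()
        .map(|(k, group)| (k.clone(), distinct(group)))
        .filter(|(_, group)| !group.is_empty())
        .collect()
}
```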

mihaibudiu avatar Jul 11 '22 18:07 mihaibudiu

Oh, and also we won't be able to use OrdZSet as the default return type. It'd have to be OrdZSet for sets and OrdIndexedZSet for maps, which probably means we need two different methods.

ryzhyk avatar Jul 11 '22 18:07 ryzhyk

> if you want groups to be sets - which is a reasonable choice - you could define distinct as mapping the standard distinct over all groups.

Yes, but we'd need a separate distinct operator for that, which again means we need two implementations of recursive.

ryzhyk avatar Jul 11 '22 18:07 ryzhyk

Could there be a trait "distinctable"? But maybe that doesn't help if the trait has to specify the output type.

mihaibudiu avatar Jul 11 '22 18:07 mihaibudiu

Trouble is, stable Rust does not support specialization, so it would be impossible to have two implementations of that trait, one of which works for indexed Z-sets and the other for Z-sets (since Z-sets are a special case of indexed Z-sets). There are also no negative trait bounds, so you can't say "use this impl only if Val != ()".
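The overlap can be illustrated with a small sketch. The `Indexed` struct and `Distinctable` trait are hypothetical names for illustration only, assuming a single generic container in which a plain Z-set is the `V = ()` case; the real DBSP types are structured differently. The generic impl compiles, while a second impl for `V = ()` (left commented out) is rejected on stable Rust as a conflicting implementation.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in: one generic container, where a plain Z-set is
// just the special case `V = ()`.
struct Indexed<K, V>(BTreeMap<K, BTreeMap<V, i64>>);

// Hypothetical trait along the lines suggested in the thread.
trait Distinctable {
    fn distinct(&self) -> Self;
}

// General impl: apply distinct within each group, dropping empty groups.
impl<K: Ord + Clone, V: Ord + Clone> Distinctable for Indexed<K, V> {
    fn distinct(&self) -> Self {
        let groups: BTreeMap<K, BTreeMap<V, i64>> = self
            .0
            .iter()
            .map(|(k, group)| {
                let set: BTreeMap<V, i64> = group
                    .iter()
                    .filter(|(_, w)| **w > 0)
                    .map(|(v, _)| (v.clone(), 1))
                    .collect();
                (k.clone(), set)
            })
            .filter(|(_, set)| !set.is_empty())
            .collect();
        Indexed(groups)
    }
}

// A dedicated impl for the `V = ()` case overlaps with the generic one above.
// Uncommenting it fails with error[E0119] (conflicting implementations),
// and there is no way to restrict the generic impl to `V != ()`:
//
// impl<K: Ord + Clone> Distinctable for Indexed<K, ()> {
//     fn distinct(&self) -> Self { /* set-specific version */ }
// }
```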

ryzhyk avatar Jul 11 '22 18:07 ryzhyk

BTW, in DD (Differential Dataflow) most operators output "sets". Indexed collections are created by separate operators that combine the equivalent of DBSP's index and trace in one call (not that we have to replicate their design).

ryzhyk avatar Jul 11 '22 18:07 ryzhyk