Expose inflows/outflows?

Open • jc-harrison opened this issue 3 years ago • 4 comments

Inflows and outflows are not exposed through FlowAPI. While these can be calculated outside FlowKit by summing the results of a flows query, this approach will lead to inaccuracies because some of the results may be redacted - it would be preferable to calculate the inflows/outflows before redaction.
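To make the inaccuracy concrete, here is a minimal sketch in plain Python. It does not use FlowAPI or flowmachine: the location names, counts, helper functions and the threshold of 15 are all assumptions chosen for illustration.

```python
# Hypothetical flows result: (origin, destination) -> subscriber count.
REDACTION_THRESHOLD = 15  # assumed value, for illustration only

flows = {
    ("A", "B"): 120,
    ("A", "C"): 12,
    ("A", "D"): 9,
    ("B", "A"): 40,
}


def outflows(flow_counts):
    """Sum counts by origin location."""
    totals = {}
    for (origin, _dest), count in flow_counts.items():
        totals[origin] = totals.get(origin, 0) + count
    return totals


def redact(flow_counts, threshold=REDACTION_THRESHOLD):
    """Drop every count at or below the threshold."""
    return {pair: c for pair, c in flow_counts.items() if c > threshold}


true_outflows = outflows(flows)           # computed before redaction
post_redaction = outflows(redact(flows))  # what an API client could compute

print(true_outflows)   # {'A': 141, 'B': 40}
print(post_redaction)  # {'A': 120, 'B': 40}
```

In this toy data the client-side total for A is 21 subscribers short, even though the true outflow of 141 is well above the threshold and could have been released had it been computed before redaction.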

This is particularly an issue for labelled_flows - in that case, all counts for a location pair are redacted if any count for that pair is too small, so some of the redacted values may be large, leading to significant errors in post-redaction calculation of inflows/outflows.
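A sketch of the labelled case, again in plain Python with made-up data. The labels, counts, threshold and the `redact_labelled` helper are hypothetical; the helper only mirrors the rule described above (redact every count for a pair if any of them is too small) and is not FlowKit's implementation.

```python
REDACTION_THRESHOLD = 15  # assumed value, for illustration only

# Hypothetical labelled flows: (origin, destination, label) -> count.
labelled_flows = {
    ("A", "B", "commuter"): 500,
    ("A", "B", "visitor"): 3,
    ("A", "C", "commuter"): 80,
    ("A", "C", "visitor"): 60,
}


def redact_labelled(flow_counts, threshold=REDACTION_THRESHOLD):
    """Drop all counts for a location pair if any count for it is too small."""
    small_pairs = {
        (origin, dest)
        for (origin, dest, _label), count in flow_counts.items()
        if count <= threshold
    }
    return {
        key: count
        for key, count in flow_counts.items()
        if (key[0], key[1]) not in small_pairs
    }


def outflows_by_origin(flow_counts):
    """Sum counts over destinations and labels for each origin."""
    totals = {}
    for (origin, _dest, _label), count in flow_counts.items():
        totals[origin] = totals.get(origin, 0) + count
    return totals


true_total = outflows_by_origin(labelled_flows)
visible_total = outflows_by_origin(redact_labelled(labelled_flows))

print(true_total)     # {'A': 643}
print(visible_total)  # {'A': 140}
```

Here a single small count (3) hides the whole A to B pair, including the count of 500, so the post-redaction outflow for A is 140 rather than 643.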

jc-harrison, Feb 04 '22 17:02

It's worth noting that inflows and outflows are already implemented in flowmachine, in the InFlow and OutFlow classes (or, equivalently, the inflow and outflow methods of a Flows query); they're just not exposed through the API. But we may want to think about the nuances of the implementation - InFlow and OutFlow sum OD elements from/to all locations, including the diagonal (i.e. counts of subscribers who stayed in the same location) and flows from/to null locations (which could be counts of inactive or unlocatable subscribers). It may be more useful to sum only off-diagonal elements, with an option to exclude counts from/to null locations.
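The alternative summation could look something like the sketch below. This is not the flowmachine `OutFlow` code: `sum_outflows`, its two flags and the OD data are hypothetical names and values used only to show the difference between the variants.

```python
def sum_outflows(od_counts, include_diagonal=False, include_null=True):
    """Sum OD counts by origin, optionally skipping diagonal and null entries.

    od_counts maps (origin, destination) -> count; None stands for a null location.
    """
    totals = {}
    for (origin, dest), count in od_counts.items():
        if not include_diagonal and origin == dest:
            continue  # subscribers who stayed in the same location
        if not include_null and (origin is None or dest is None):
            continue  # inactive or unlocatable subscribers
        totals[origin] = totals.get(origin, 0) + count
    return totals


# Hypothetical OD matrix, with a diagonal entry and null locations.
od = {
    ("A", "A"): 900,
    ("A", "B"): 50,
    ("A", None): 20,
    (None, "A"): 30,
    ("B", "A"): 10,
}

# Everything summed, as described for the existing classes.
print(sum_outflows(od, include_diagonal=True))  # {'A': 970, None: 30, 'B': 10}
# Off-diagonal only.
print(sum_outflows(od))                         # {'A': 70, None: 30, 'B': 10}
# Off-diagonal, excluding null locations.
print(sum_outflows(od, include_null=False))     # {'A': 50, 'B': 10}
```

With this toy data the outflow from A ranges from 970 to 50 depending on the choice, which is why the default would need to be settled before exposing the query.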

jc-harrison, Apr 07 '22 13:04

Will we want to expand this for labelled_flows in this PR, or should we hold off?

Thingus, Apr 08 '22 14:04

> Will we want to expand this for labelled_flows in this PR, or should we hold off?

I'd consider this issue to cover labelled_flows as well as flows, but it's fine to handle those in separate PRs (with flows having higher priority).

jc-harrison, Apr 08 '22 16:04

> But we may want to think about the nuances of the implementation - InFlow and OutFlow sum OD elements from/to all locations, including the diagonal (i.e. counts of subscribers who stayed in the same location) and flows from/to null locations (which could be counts of inactive or unlocatable subscribers). It may be more useful to sum only off-diagonal elements, with an option to exclude counts from/to null locations.

This is now covered by issue https://github.com/Flowminder/FlowKit/issues/5128

jc-harrison, May 09 '22 17:05