eland
Implement `ed_df.groupby([...]).indices`
- Implement `.indices` similar to pandas. Example:
>>> pd_ecommerce.groupby(["currency","type"]).indices
{('EUR', 'order'): array([ 0, 1, 2, ..., 4672, 4673, 4674], dtype=int64)}
>>> pd_flights.groupby(["Cancelled"]).indices
{False: array([ 0, 1, 2, ..., 13056, 13057, 13058], dtype=int64), True: array([ 3, 8, 12, ..., 13037, 13048, 13051], dtype=int64)}
How this works in pandas: `.indices` returns a dictionary that maps each group key to an array of the positions of that group's rows (documents, in our case) within the entire index.
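For reference, here is a minimal sketch of the pandas behaviour on a small frame. The data and the `a`..`e` index labels are made up for illustration and are not taken from the ecommerce or flights datasets. The labels are there only to show that the returned values are 0-based positions, not index labels.

```python
import pandas as pd

# Hypothetical toy data; the string index shows that positions, not labels, are returned.
df = pd.DataFrame(
    {
        "currency": ["EUR", "USD", "EUR", "USD", "EUR"],
        "type": ["order", "order", "order", "refund", "order"],
    },
    index=list("abcde"),
)

# Dict of {group key tuple: numpy array of row positions}.
indices = df.groupby(["currency", "type"]).indices

# Convert the arrays to plain lists so they are easy to read and compare.
positions = {key: arr.tolist() for key, arr in indices.items()}
print(positions)
```

So an eland equivalent would need, for every bucket, the position of each matching document in the overall ordering of the index, which is the part that is unclear with a composite aggregation.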
@sethmlarson
Is there a way to fetch the `_id`, which is the position of the document, while using a composite aggregation?
Our main goal is to find the positions where the buckets are in a dataframe.
Not sure, might not be possible unfortunately.