eland

Implement `ed_df.groupby([...]).indices`

Open V1NAY8 opened this issue 5 years ago • 2 comments

  • Implement `.indices`, similar to pandas. Example:
>>> pd_ecommerce.groupby(["currency","type"]).indices 
{('EUR', 'order'): array([   0,    1,    2, ..., 4672, 4673, 4674], dtype=int64)}
>>> pd_flights.groupby(["Cancelled"]).indices
{False: array([    0,     1,     2, ..., 13056, 13057, 13058], dtype=int64), True: array([    3,     8,    12, ..., 13037, 13048, 13051], dtype=int64)}     

How this works in pandas: `.indices` returns a dictionary that maps each group key to an array of the integer positions of that group's rows in the DataFrame.
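To make the pandas behaviour concrete, here is a minimal sketch on a small, made-up DataFrame (the column names and values are hypothetical stand-ins for `pd_flights`, not the real dataset). It only illustrates what pandas returns; it is not a proposal for how eland should implement this.

```python
import pandas as pd

# Hypothetical toy data standing in for pd_flights.
df = pd.DataFrame(
    {
        "Cancelled": [False, False, False, True, False, True],
        "Carrier": ["A", "B", "A", "B", "A", "A"],
    }
)

# Single key: the dict maps each key to the row positions of that group.
indices = df.groupby("Cancelled").indices
# Rows 3 and 5 are the cancelled ones; rows 0, 1, 2 and 4 are not.

# Several keys: the dict keys become tuples, one element per grouping column.
multi = df.groupby(["Cancelled", "Carrier"]).indices

# The values are positional offsets, so iloc recovers a group's rows.
cancelled_rows = df.iloc[indices[True]]
```

Because the values are positions rather than labels, they are meant to be used with `iloc`, which is why the position of each document matters for an eland equivalent.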

V1NAY8 avatar Nov 15 '20 12:11 V1NAY8

@sethmlarson Is there a way to fetch the `_id` (which is the position of the document) while using a composite aggregation? Our main goal is to find the positions in the DataFrame of the documents that fall into each bucket.

V1NAY8 avatar Nov 16 '20 15:11 V1NAY8

Not sure, might not be possible unfortunately.

sethmlarson avatar Nov 16 '20 15:11 sethmlarson