Hassan Kibirige comments

Results: 240 comments of Hassan Kibirige

Filtering using boolean indexes?

You can do it with `query`, but not in one expression

```python
import re
from plydata import query

idx = df.logMessage.map(lambda x: bool(re.search('"success": true', x)))
df = df >> query('@idx')
```

`filter` is a python key...
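The boolean-index idea in the comment above can be illustrated without a data frame. This is a minimal standard-library sketch, not plydata code: the `log_messages` list and its contents are made up for illustration, and the list comprehension stands in for the `map` call on the `logMessage` column.

```python
import re

# Hypothetical log lines standing in for the `logMessage` column.
log_messages = [
    '{"success": true, "id": 1}',
    '{"success": false, "id": 2}',
    '{"id": 3, "success": true}',
]

# Same pattern as in the comment: a literal substring search.
pattern = re.compile(r'"success": true')

# Step 1: build the boolean mask, one entry per row.
mask = [bool(pattern.search(msg)) for msg in log_messages]

# Step 2: keep only the rows where the mask is True.
kept = [msg for msg, keep in zip(log_messages, mask) if keep]
```

The two steps mirror why the original needs two expressions: the mask has to exist as a named object before the filter can refer to it.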

Filtering using boolean indexes?

There isn't a way to filter using a regex.

> `df.query('col.str.contains("a")')` doesn't work.

Yes, it does not work. `query` is limited and will stay that way since it just passes...

Filtering using boolean indexes?

@antonio-yu, can you come up with a short specification/example of how you would expect regex filtering to work? Then we can start from there.

Filtering using boolean indexes?

> First, I wanna select these rows in which y contains the key word 'o'

There are the [query_all](https://plydata.readthedocs.io/en/stable/generated/plydata.helper_verbs.query_all.html), [query_at](https://plydata.readthedocs.io/en/stable/generated/plydata.helper_verbs.query_at.html) and [query_if](https://plydata.readthedocs.io/en/stable/generated/plydata.helper_verbs.query_if.html) helpers, but I admit they are not easy to...

Filtering using boolean indexes?

I think it merits a second function. Now, what to call it: `sift`, `sieve`, `query2`, ...?

Filtering using boolean indexes?

> But how to select columns that don't end with 'a'?

You can use a regular expression

```python
df >> select(matches='[^a]$')
```
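To see which names the pattern `[^a]$` picks out, here is a small standard-library sketch. It assumes the `matches` argument behaves like `re.search` applied to each column name, which the comment does not state; the column names are hypothetical.

```python
import re

# Hypothetical column names.
columns = ["alpha", "beta", "gamma", "x", "y_total", "a"]

# `[^a]` matches any single character except 'a'; `$` anchors it
# to the end of the name, so the last character must not be 'a'.
pattern = re.compile(r"[^a]$")

not_ending_in_a = [name for name in columns if pattern.search(name)]
```

Under that assumption, only `x` and `y_total` survive. Note that the negated class still needs one character to match, so an empty name would not be selected either.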

Memory efficiency

There is already a way to use a context manager with the [options](https://plydata.readthedocs.io/en/stable/generated/plydata.options.options.html#plydata-options-options)

```python
from plydata.options import options

with options(modify_input_data=True):
    stocks.copy() >> mutate(timestamp='timestamp.apply(truncate_to_hour)') >> group_by('timestamp') >> summarize(high = 'max(high)', low...
```
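For readers unfamiliar with option context managers, the general pattern looks roughly like the sketch below. This is not plydata's implementation: the `_OPTIONS` registry and the `temporary_options` function are hypothetical names used only to show how a setting can be switched on for the duration of a `with` block and restored afterwards.

```python
from contextlib import contextmanager

# Hypothetical global registry of settings (not plydata's own).
_OPTIONS = {"modify_input_data": False}


@contextmanager
def temporary_options(**overrides):
    """Apply option overrides inside a `with` block, then restore them."""
    unknown = set(overrides) - set(_OPTIONS)
    if unknown:
        raise KeyError(f"Unknown options: {sorted(unknown)}")

    # Remember the current values so they can be restored on exit.
    previous = {name: _OPTIONS[name] for name in overrides}
    _OPTIONS.update(overrides)
    try:
        yield
    finally:
        # Runs even if the block raises, so the override never leaks.
        _OPTIONS.update(previous)


with temporary_options(modify_input_data=True):
    inside = _OPTIONS["modify_input_data"]

outside = _OPTIONS["modify_input_data"]
```

Restoring in a `finally` clause is the design choice that matters here: the override stays scoped to the block even when the code inside fails.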