Hassan Kibirige
You can do it with `query`, but not in one expression:

```python
import re
from plydata import query

# Build a boolean mask first, then reference it inside query
idx = df.logMessage.map(lambda x: bool(re.search('"success": true', x)))
df = df >> query('@idx')
```

`filter` is a python key...
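For comparison, here is a minimal sketch of the same row filter done with plain pandas, outside plydata. The sample data is made up for illustration; only the `logMessage` column name and the pattern come from the snippet above. It relies on `Series.str.contains`, which applies a regex search to each value.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the comment above.
df = pd.DataFrame({
    "logMessage": [
        '{"success": true, "id": 1}',
        '{"success": false, "id": 2}',
        '{"success": true, "id": 3}',
    ]
})

# Boolean mask: True where the pattern is found anywhere in the string.
mask = df["logMessage"].str.contains(r'"success": true', regex=True)

# Plain boolean indexing keeps only the matching rows.
matched = df[mask]
print(matched["logMessage"].tolist())
```

This keeps the mask and the filter as two steps, mirroring the two-step `query('@idx')` approach rather than replacing it.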
There isn't a way to filter using a regex.

> `df.query('col.str.contains("a")')` doesn't work.

Yes, it does not work. `query` is limited and will stay that way since it just passes...
@antonio-yu, can you come up with a short specification/example of how you would expect regex filtering to work? Then we can start from there.
> First, I wanna select these rows in which y contains the keyword 'o'

There are the [query_all](https://plydata.readthedocs.io/en/stable/generated/plydata.helper_verbs.query_all.html), [query_at](https://plydata.readthedocs.io/en/stable/generated/plydata.helper_verbs.query_at.html) and [query_if](https://plydata.readthedocs.io/en/stable/generated/plydata.helper_verbs.query_if.html) helpers, but I admit they are not easy to...
I think it merits a second function. Now, what to call it: `sift`, `sieve`, `query2`, ...?
> But how to select columns that don't end with 'a'?

You can use a regular expression:

```python
df >> select(matches='[^a]$')
```
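To see what the pattern `[^a]$` does on its own, here is a small sketch using only the standard `re` module. The column names are hypothetical, and it assumes the pattern is applied with search semantics (a match anywhere in the name), which is an assumption about how `matches` treats it rather than something stated above.

```python
import re

# Hypothetical column names for illustration.
columns = ["alpha", "beta", "gamma", "count", "id"]

# [^a] matches any single character other than 'a'; $ anchors it to the end,
# so the pattern matches names whose last character is not 'a'.
pattern = re.compile(r"[^a]$")

kept = [name for name in columns if pattern.search(name)]
print(kept)  # ['count', 'id']
```

Note that the character class still needs one character to match, so an empty name would not be selected.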
There is already a way to use a context manager with the [options](https://plydata.readthedocs.io/en/stable/generated/plydata.options.options.html#plydata-options-options):

```python
from plydata.options import options

with options(modify_input_data=True):
    stocks.copy() >> mutate(timestamp='timestamp.apply(truncate_to_hour)') >> group_by('timestamp') >> summarize(high = 'max(high)', low...
```
The `ply()` method will not replace the `>>` operator; it will be an alternative, albeit with better performance. You have to wrap multi-line statements in parens, and when I do...
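As a rough illustration of the parens point, the sketch below builds a toy `>>` pipeline with a hypothetical `Verb` class. It is not plydata's implementation, and the names `Verb`, `double`, `keep_big` and `total` are invented for this example. It only shows how operator overloading makes `data >> verb` work and why a chain spread over several lines has to sit inside parentheses.

```python
class Verb:
    """Hypothetical stand-in for a pipeline verb; not plydata's actual class."""

    def __init__(self, func):
        self.func = func

    def __rrshift__(self, data):
        # Python calls this for `data >> verb` when `data` itself
        # does not implement the >> operator (lists do not).
        return self.func(data)


double = Verb(lambda xs: [x * 2 for x in xs])
keep_big = Verb(lambda xs: [x for x in xs if x > 4])
total = Verb(sum)

# Without the surrounding parentheses, the statement would end at the
# first newline and the following line starting with >> would be a
# syntax error.
result = (
    [1, 2, 3, 4]
    >> double
    >> keep_big
    >> total
)
print(result)  # 14
```

Because `>>` is left-associative, each step receives the output of the previous one, which is the same reading order as a method chain.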
[Benchmarks](https://github.com/mm-mansour/Fast-Pandas)
`ply` method (part 2 of the issue) has been added. Need to think about part 1.