Kyle Gilde

Results 10 issues of Kyle Gilde

The following code doesn't work. Thank you!

```python
@pandas_udf('string')
def as_set(x):
    return str(set(x))

spark.udf.register('as_set', as_set)

kdf = ks.DataFrame(
    {'a': [1, 2, 2, 4, 5, 6],
     'b': ["one", "one", "one", "two",...
```

bug

Hello! Would you consider adding the ability to self-reference DataFrames and Series using the lambda function inside of the .loc[] method? I think it's one of the most convenient features...

enhancement
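For context on the `.loc[]` request above, this is roughly how the feature behaves in plain pandas, where `.loc[]` accepts a callable that receives the object being indexed. This is only a sketch of the pandas behavior, not of the library the issue was filed against, and the column names and data are made up for illustration.

```python
import pandas as pd

# Illustrative data only (not taken from the issue).
df = pd.DataFrame({"x": [3, 1, 4, 1, 5], "y": list("abcde")})

# The lambda receives the intermediate DataFrame, so a column created
# earlier in the chain can be used for filtering without naming a
# temporary variable.
filtered = (
    df.assign(x2=lambda d: d["x"] * 2)
      .loc[lambda d: d["x2"] > 4]
)

print(filtered)
```

The appeal is that the filter can refer to `x2`, which does not exist on `df` itself, so the whole transformation stays in one method chain.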

This code still needs unit tests, but I wanted to get feedback on this newer implementation of the class first. Closes stale PR #359.

Implements the `DateTimeSubtraction` class. I don't know if the `_more_tags` method is needed. I copied and pasted it from `CombineWithReferenceFeature`.

Is there any chance that you would support the ability to subtract datetime columns?

new transformer
good first issue
easy
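As a point of reference for the datetime-subtraction request, the underlying operation in plain pandas looks like the sketch below. It is not the `DateTimeSubtraction` transformer mentioned above; the column names `start` and `end` and the output column `elapsed_days` are hypothetical.

```python
import pandas as pd

# Illustrative data only; column names are assumptions.
df = pd.DataFrame({
    "start": pd.to_datetime(["2021-01-01", "2021-03-15"]),
    "end": pd.to_datetime(["2021-01-31", "2021-03-16"]),
})

# Subtracting two datetime columns yields a timedelta column;
# .dt.days converts it to whole days.
df["elapsed_days"] = (df["end"] - df["start"]).dt.days

print(df)
```

A transformer would presumably wrap this subtraction for a configurable set of column pairs and output units, but that design is not confirmed by the snippets here.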

## Expected Behavior

In CatBoost, high-cardinality categorical features can be encoded in two ways using the `has_time` parameter:

1. with random permutation (`has_time=False`)
2. without random permutation (`has_time=True`)

https://catboost.ai/en/docs/concepts/parameter-tuning#internal-dataset-order...

enhancement
help wanted

Problem: I just finished reading [this blog post](https://towardsdatascience.com/tutorial-uncertainty-estimation-with-catboost-255805ff217e) and was wondering the following: How do we use the knowledge & data uncertainty to create a distribution and calculate a confidence...

### Missing functionality

I find staring at a correlation matrix or heatmap tedious. It contains the uninformative diagonal and all of the values and variable pairs are...

feature request 💬
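One possible reading of the correlation request above is a ranked list of unique variable pairs instead of a full matrix. The sketch below shows that idea in plain pandas and NumPy. It is an assumption about what was being asked for, and the output column names `var_1`, `var_2`, and `r` are made up.

```python
import numpy as np
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [2, 4, 6, 8.5],
    "c": [4, 3, 2, 1],
})

corr = df.corr()

# Keep only the strict upper triangle: this drops the diagonal
# and the mirrored duplicate of every pair.
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)

pairs = corr.where(mask).stack().dropna().reset_index()
pairs.columns = ["var_1", "var_2", "r"]

# Rank pairs by the strength of the correlation, ignoring sign.
pairs = pairs.sort_values("r", key=lambda s: s.abs(), ascending=False)
pairs = pairs.reset_index(drop=True)

print(pairs)
```

Each pair appears once, so the strongest relationships can be read from the top of the table.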

I think it would make a lot of sense to add a "type" argument to this step with the options of "centered" & "lagging". Ideally, I would really like to...

feature
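To illustrate the difference between the "centered" and "lagging" options requested above, here is a minimal pandas sketch. It is not the step the issue refers to; it only shows the two window alignments using pandas' `rolling`, with made-up data.

```python
import pandas as pd

# Illustrative data only.
s = pd.Series([1, 2, 3, 4, 5])

# Lagging (trailing) window: each value averages itself and the
# two preceding observations.
lagging = s.rolling(window=3).mean()

# Centered window: each value averages itself and one observation
# on either side.
centered = s.rolling(window=3, center=True).mean()

print(pd.DataFrame({"value": s, "lagging": lagging, "centered": centered}))
```

The lagging version leaves the first two positions empty and uses no future values, while the centered version leaves one empty position at each end.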