Niels comments

Results: 50 comments of Niels

nannyml can confuse months with days on some rows of a dataset

Hey, it seems like the date format in the data was a bit too exotic to be parsed correctly. If I manually parse the timestamps during preprocessing like `data['date'] =...
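A minimal sketch of this kind of manual timestamp parsing. The day-first format `%d/%m/%Y` and the sample values are assumptions for illustration only; the actual format in the reported dataset is not shown in this comment.

```python
import pandas as pd

# Hypothetical sample data: day-first dates that are ambiguous when the format is guessed.
data = pd.DataFrame({"date": ["01/02/2022", "13/02/2022", "03/04/2022"]})

# An explicit format means every row is read the same way (day first, then month),
# instead of letting the parser infer the field order.
data["date"] = pd.to_datetime(data["date"], format="%d/%m/%Y")

print(data["date"].dt.month.tolist())
```

Parsing the column once during preprocessing, before handing the data to the library, keeps the interpretation of the timestamps under your control.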

Add bootstrapping options to chunk methods

Hey Noah, thanks for the pointer! How do you envision the bootstrapping to fit within the library? As a replacement for the chunking, as something to perform on each chunk...

Add bootstrapping options to chunk methods

Ah yes, I see. Instead of just splitting a dataset into non-overlapping chunks (like you've mentioned) you would take that dataset as a whole and generate chunks of a fixed...

Add bootstrapping options to chunk methods

Hey Noah, just letting you know that I've created a quick implementation of the `BootstrapChunker` for you. You can find it in the [117-bootstrap-chunker](https://github.com/NannyML/nannyml/tree/117-bootstrap-chunker) branch. It uses the `pandas.DataFrame.sample()` method...
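To illustrate the general idea of building chunks with `pandas.DataFrame.sample()`, here is a rough sketch. It is not the code from the branch: the helper name `bootstrap_chunks`, its parameters, and the choice to sample with replacement are all assumptions made for this example.

```python
import pandas as pd


def bootstrap_chunks(data, chunk_count, n, random_state=None):
    """Hypothetical helper: draw `chunk_count` chunks of `n` rows each from `data`."""
    chunks = []
    for i in range(chunk_count):
        # Vary the seed per chunk so the chunks differ but the run stays reproducible.
        seed = None if random_state is None else random_state + i
        # Assumption: rows are drawn with replacement, as in a classic bootstrap.
        chunks.append(data.sample(n=n, replace=True, random_state=seed))
    return chunks


data = pd.DataFrame({"feature": range(100)})
chunks = bootstrap_chunks(data, chunk_count=10, n=50, random_state=0)
print(len(chunks), len(chunks[0]))
```

Unlike non-overlapping chunks, the same row can appear in several chunks here, or several times within one chunk.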

Add bootstrapping options to chunk methods

Original data reconstruction error with `SizeBasedChunker(size=5000)`:

![image](https://user-images.githubusercontent.com/94110348/193574494-937d2d93-19ea-44ce-97b9-0c35e7af4a6c.png)

Same data and calculator, `BootstrapChunker(chunk_count=10, n=5000)`:

![image](https://user-images.githubusercontent.com/94110348/193574688-c1fe50e6-e31c-4a98-bb59-f8f566b8f712.png)

`BootstrapChunker(chunk_count=20, n=10000)`:

![image](https://user-images.githubusercontent.com/94110348/193574825-da710807-8fa8-40ca-83e2-8d3f261cfa40.png)

`BootstrapChunker(chunk_count=30, n=10000)`:

![image](https://user-images.githubusercontent.com/94110348/193574915-4e806d7d-86ba-4cb3-a335-f910a7dc2ebd.png)

Error with the code when ran with chunk number = 9

Hey Kishan, sorry for being somewhat slow to reply.

### What went wrong

This exception is due to the "chunking" of the given datasets. The size of the combined reference...

Pandas data type 'string' not understood

Thanks for the extra information all. We'll try to include a fix in the next release, coming next week!

Pandas data type 'string' not understood

> Looks like the required numpy version (`>=1.14.0`) might be wrong. It seems like it should be closer to `>=1.21.0` (at least based on the current nannyml code).
>
> ...

bare-bones functions for CBPE

Hi Andrew, these functions do exist within the `nannyml.performance_estimation.confidence_based.cbpe` module, but they were kept protected up until now. I don't see any issues with making these public, people will just...

Automatic Binning estimation for ECE and Brier Score Metric

Update: we've received a great PR (#172) by @Jebq that incorporates the `numpy.histogram_bin_edges` functionality.
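For readers unfamiliar with `numpy.histogram_bin_edges`, a small sketch of automatic bin estimation on predicted probabilities follows. How the PR actually wires this into the ECE and Brier score metrics is not shown in this comment, so the variable names, the `bins="auto"` estimator, and the synthetic data are assumptions.

```python
import numpy as np

# Hypothetical predicted probabilities; real values would come from a model.
rng = np.random.default_rng(0)
predicted_proba = rng.uniform(0.0, 1.0, size=1000)

# Let NumPy choose the number of bins; the range is pinned to [0, 1]
# because the inputs are probabilities.
bin_edges = np.histogram_bin_edges(predicted_proba, bins="auto", range=(0.0, 1.0))

# Assign each prediction to a bin using only the inner edges,
# so the indices run from 0 to len(bin_edges) - 2.
bin_index = np.digitize(predicted_proba, bin_edges[1:-1])

print(len(bin_edges) - 1, "bins")
```

Other estimators accepted by the `bins` argument, such as `"fd"` or `"sturges"`, can be swapped in the same way.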