Podium: a framework agnostic Python NLP library for data loading and preprocessing

Results 15 podium issues

We need a way to automatically test the examples to ensure they keep working when the framework core changes. One solution would be to mark these tests with `slow` to avoid...

Closes #273. Draft: function calls and names are subject to change, but the gist is here.

Discussed this change with @mttk on Slack. Features:

* lazy module loading
* makes it possible to import `DiskBackedDataset`, `HFDatasetConverter` and `YAKE` from the top-level `__init__.py`, but only if...
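
For readers unfamiliar with lazy module loading, here is a minimal, generic sketch of the idea. It is not the code from this pull request: the `LazyModule` class is hypothetical, and the standard-library `json` module merely stands in for a heavy or optional dependency. The real import is deferred until an attribute is first accessed.

```python
import importlib
import types


class LazyModule(types.ModuleType):
    """Hypothetical proxy that imports the real module on first attribute access."""

    def __init__(self, name):
        super().__init__(name)
        # Placeholder for the real module; stays None until it is needed.
        self._real = None

    def __getattr__(self, attr):
        # __getattr__ runs only when normal attribute lookup fails,
        # i.e. for names that live in the real module.
        if self._real is None:
            self._real = importlib.import_module(self.__name__)
        return getattr(self._real, attr)


# `json` stands in for an optional dependency; nothing is imported here yet.
lazy_json = LazyModule("json")
loaded_before = lazy_json._real is not None

# The first real use triggers the import.
encoded = lazy_json.dumps([1, 2])
loaded_after = lazy_json._real is not None
```

With this pattern, a missing optional dependency would raise `ImportError` only when the name is actually used, not when the top-level package is imported.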

Currently, `get_dataset_splits()` in our datasets is a static method (`@staticmethod`), but it would be more appropriate to have it marked as a class method (`@classmethod`). The following example shows the...

dataset
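
To illustrate why a class method can suit `get_dataset_splits()` better than a static method, here is a small sketch using hypothetical classes (`BaseDataset`, `SubDataset`) that are not part of podium. It assumes the method builds one dataset instance per split. The static variant has to hard-code the class name, while the class-method variant receives `cls` and so works correctly for subclasses.

```python
class BaseDataset:
    """Hypothetical dataset class, used only for illustration."""

    splits = ("train", "test")

    def __init__(self, split):
        self.split = split

    @staticmethod
    def get_dataset_splits_static():
        # A static method has no reference to the calling class,
        # so the class name must be hard-coded.
        return [BaseDataset(name) for name in BaseDataset.splits]

    @classmethod
    def get_dataset_splits(cls):
        # A class method receives the calling class as `cls`,
        # so subclasses get their own type and their own splits.
        return [cls(name) for name in cls.splits]


class SubDataset(BaseDataset):
    """Hypothetical subclass that defines an additional split."""

    splits = ("train", "valid", "test")


static_result = SubDataset.get_dataset_splits_static()
class_result = SubDataset.get_dataset_splits()
```

Here `static_result` contains two `BaseDataset` instances even though the call was made on `SubDataset`, whereas `class_result` contains three `SubDataset` instances.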

## 🐛 Bug

This is a serious bug. If `ExampleFactory` is instantiated with fields in the dict format, calling `from_list` will throw an error. This line in `ExampleFactory.from_list` is the...

bug

At some point, we could implement this (and the creation of `data`) using views. Now is not the time though. Some performance metrics would be interesting to compare between this...

Refactors `sort`, `shuffle` and `filter` in DatasetBase/Dataset.

dataset

`ArrowDataset.from_tabular_file` is similar to TabularDataset's `__init__`. I think it's safe to remove this function. The same effect can be achieved with:

```python
ArrowDataset.from_dataset(TabularDataset(...), ...)
```

E.g. #267 introduced some changes...

dataset