dataenforce
Python package to enforce column names & data types of pandas DataFrames
Would be nice to have a feature to define a Dataset using a dataclass as a source. So, instead of ``` DUser = Dataset["id": int, "name": str] def process1(data: DUser):...
```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
in /opt/Anaconda3/envs/basic_ml/lib/python3.8/site-packages/dataenforce/__init__.py in wrapper(*args, **kwargs)
     45 dtypes = dict(value.dtypes)
     46 for colname, dt in hint.dtypes.items():
---> 47 if not np.issubdtype(dtypes[colname], np.dtype(dt)):
     48...
```
The library is not compatible with pylance 3.9
--which would be rather useful. Cool project. Thanks.
Is it possible to add shape information? e.g. `Dataset[(10, 3), ("a": int, "b": int, "c": int)]` would type a DataFrame with 10 rows and 3 columns.
I saw that your package has a `validate` decorator to check the data frame at run time. Is there a way for it to integrate with `mypy` for static code...
A NewType definition will fail the `inspect.isclass` check. Retrieving and using the `__supertype__` will potentially fix this. Example. ```python from typing import NewType, Dict UserId = NewType("UserId", int) DUserDataset =...
Support for dtype checking alone would be useful. e.g. ``` Dataset[int, str, str] ```
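Two of the requests above (using a dataclass as the source of a Dataset, and handling `NewType` definitions that fail the `inspect.isclass` check) can be sketched with the standard library alone. This is only a rough illustration, not the package's implementation: `resolve_hint_type` and `columns_from_dataclass` are hypothetical helper names, and the sketch assumes that following `__supertype__` is an acceptable way to recover the underlying class.

```python
import dataclasses
import inspect
from typing import NewType, get_type_hints

UserId = NewType("UserId", int)


@dataclasses.dataclass
class User:
    id: UserId
    name: str


def resolve_hint_type(tp):
    """Hypothetical helper: unwrap NewType definitions down to a real class."""
    # A NewType is not a class, so inspect.isclass() rejects it.
    # Follow __supertype__ (possibly several levels) until a class is reached.
    while not inspect.isclass(tp) and hasattr(tp, "__supertype__"):
        tp = tp.__supertype__
    return tp


def columns_from_dataclass(cls):
    """Hypothetical helper: map each dataclass field name to a plain class."""
    # get_type_hints() also resolves annotations that are stored as strings.
    hints = get_type_hints(cls)
    return {
        field.name: resolve_hint_type(hints[field.name])
        for field in dataclasses.fields(cls)
    }


columns = columns_from_dataclass(User)
print(columns)  # {'id': <class 'int'>, 'name': <class 'str'>}
```

The resulting name-to-type mapping holds the same information as the `"id": int, "name": str` pairs in the first request; how it would be turned into a `Dataset` hint is left open here.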