easydata
A flexible template for doing reproducible data science in Python.
For example, if I have 3 transformations in my `catalog/transformation_list.json`, `make data` will run all 3 of them in order every time. Even if I've only changed one of my...
Copied from my work in progress post (easier than re-writing it): Have you ever run someone else's notebook only to get stymied by a variable name that isn't defined? This...
Replace the awkward way that we add transformer functions to make them available via `workflow.available_transformers()`. Use a module and namespace instead.
Hi, when I run `sudo cookiecutter https://github.com/hackalog/cookiecutter-easydata.git --checkout bus_number`, I get the following error: File "/tmp/tmp1KbtQa.py", line 13: `src_path = config_dir / f'{template_name}.json'` SyntaxError: invalid syntax. ERROR: Stopping generation...
Listing all the raw files in a datasource would greatly help the Makefile do the right thing if one of them changes (or is missing).
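One way the raw-file listing could work is a manifest that maps each file to a content hash, so a changed or missing file is easy to detect. This is a minimal sketch, not easydata's implementation: `raw_file_manifest` and `changed_or_missing` are hypothetical names, and it assumes a datasource's raw files all live under a single directory.

```python
import hashlib
from pathlib import Path


def raw_file_manifest(raw_dir):
    """Map each file under raw_dir (as a relative POSIX path) to its SHA-1 hex digest."""
    raw_dir = Path(raw_dir)
    manifest = {}
    for path in sorted(raw_dir.rglob("*")):
        if path.is_file():
            rel = path.relative_to(raw_dir).as_posix()
            manifest[rel] = hashlib.sha1(path.read_bytes()).hexdigest()
    return manifest


def changed_or_missing(old, new):
    """Return the paths recorded in old that are absent from new or have a different hash."""
    return sorted(name for name, digest in old.items() if new.get(name) != digest)
```

A build step could save the manifest next to the datasource and compare it against a fresh one before deciding whether to re-run anything.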
It would be nice to have a helper function, for use when writing transformers that merge data, that combines the metadata of the datasets being merged.
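A rough sketch of what such a helper might look like, assuming each dataset's metadata is a plain dict. The name `combine_metadata` and the `"sources"` key are hypothetical, not part of easydata.

```python
def combine_metadata(metadata_by_name):
    """Merge per-dataset metadata dicts into one.

    Keys whose values agree across every dataset are kept at the top level.
    Each dataset's full metadata is preserved under the "sources" key
    (a hypothetical name), which overwrites any shared key called "sources".
    """
    all_meta = list(metadata_by_name.values())
    combined = {}
    if all_meta:
        first, rest = all_meta[0], all_meta[1:]
        for key, value in first.items():
            # Keep a key only if every other dataset has it with the same value.
            if all(key in meta and meta[key] == value for meta in rest):
                combined[key] = value
    combined["sources"] = {name: dict(meta) for name, meta in metadata_by_name.items()}
    return combined
```

Keeping the per-dataset metadata intact under one key means nothing is lost when fields such as descriptions disagree.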
All the `make data`-style commands in the notebooks are out of date.
You should be able to set up a dataset without having to run through, or refer to, the tutorial notebooks in the bus_number branch.
Check that your download is not empty when running `fetch()` from `add_url()`. Issue a warning or error.
I ran into this when trying to download Kaggle data. It would "successfully" download from the URL, but would result in an empty directory.
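The empty-download check could be sketched roughly as follows. The helper names `download_is_empty` and `warn_if_empty` are hypothetical and are not wired into easydata's `fetch()`; the sketch only assumes that a download ends up as a file or directory on disk.

```python
import warnings
from pathlib import Path


def download_is_empty(path):
    """Return True if path is missing, a zero-byte file, or a directory with no non-empty files."""
    path = Path(path)
    if not path.exists():
        return True
    if path.is_file():
        return path.stat().st_size == 0
    return not any(p.is_file() and p.stat().st_size > 0 for p in path.rglob("*"))


def warn_if_empty(path):
    """Emit a warning and return True when the download at path looks empty."""
    if download_is_empty(path):
        warnings.warn(f"Download at {path} is empty or missing", stacklevel=2)
        return True
    return False
```

Whether this should warn or raise is left open, as in the issue; swapping `warnings.warn` for an exception would make the failure fatal.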