easydata
A flexible template for doing reproducible data science in Python.
Similar question to the one here: https://github.com/drivendata/cookiecutter-data-science/issues/158 What are your thoughts on integration with `dvc`?
The tutorial gives the impression that it would be possible to create a notebook that both defines the data source (and integrates it in the workflow) and demonstrates its usage....
Move `parse_function` from being part of the DataSource to something added to a transformer. That is, `add_transformer(from_datasource="lvq-pak", parse_function=my_function)`
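To make the proposal concrete, here is a minimal sketch of what attaching the parse function at the transformer level could look like. It is not the easydata implementation: `TransformerRegistry`, its `run` method, and the body of `my_function` are hypothetical stand-ins; only the `add_transformer(from_datasource=..., parse_function=...)` call shape comes from the issue text.

```python
from typing import Any, Callable, Dict, List


class TransformerRegistry:
    """Hypothetical stand-in for the workflow object; not easydata's API."""

    def __init__(self) -> None:
        self.transformers: List[Dict[str, Any]] = []

    def add_transformer(
        self, from_datasource: str, parse_function: Callable[[str], Any]
    ) -> None:
        # The parse function is stored with the transformer,
        # so the datasource itself only describes the raw data.
        self.transformers.append(
            {"from_datasource": from_datasource, "parse_function": parse_function}
        )

    def run(self, raw_data: Dict[str, str]) -> Dict[str, Any]:
        # Apply each transformer's parse function to its datasource's raw data.
        return {
            t["from_datasource"]: t["parse_function"](raw_data[t["from_datasource"]])
            for t in self.transformers
        }


def my_function(raw_text: str) -> List[str]:
    # Hypothetical parser: split raw text on whitespace.
    return raw_text.split()


registry = TransformerRegistry()
registry.add_transformer(from_datasource="lvq-pak", parse_function=my_function)
parsed = registry.run({"lvq-pak": "a b c"})
```

With this shape, the same datasource could be paired with different parse functions without redefining it.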
Add a comment to the notebook in the "nuke it from orbit" section about needing to run in the terminal if you're running in JupyterHub
- [ ] Suggest doing `git init` from the terminal (in fact, all of the terminal commands from the terminal)
- [ ] Exercise 6: be explicit about needing to...
When calling `add_file`, check the filename and hash of the file against the current file list before adding it. (via mark)
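A rough sketch of the check described above, under stated assumptions: the file list is modeled as a plain list of dicts with `name` and `hash` keys, SHA-256 is an arbitrary choice of hash, and this standalone `add_file` is hypothetical rather than the actual easydata method. The behavior on a name match with a different hash (raising an error) is also an assumption; the issue does not say what should happen.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_hash(path: Path) -> str:
    # SHA-256 of the file contents (hash algorithm is an assumption).
    return hashlib.sha256(path.read_bytes()).hexdigest()


def add_file(file_list: List[Dict[str, str]], path: Path) -> bool:
    """Add `path` to `file_list` unless an identical entry already exists.

    Returns True if the file was added, False if it was already present.
    Raises ValueError if the name is already listed with a different hash.
    """
    entry = {"name": path.name, "hash": file_hash(path)}
    for existing in file_list:
        if existing["name"] == entry["name"]:
            if existing["hash"] == entry["hash"]:
                return False  # same name and contents: nothing to do
            raise ValueError(
                f"{path.name} is already listed with a different hash"
            )
    file_list.append(entry)
    return True
```

Calling `add_file` twice with the same unchanged file leaves the list with a single entry.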
and `make data` should go `data/processed` -> `data/processed`
Remove Annie, add the "who is this for" chart to the bus_number README.md (in the bus_number repo). Use the slides at the beginning of the data section.