Create Python package, add type hints and structured types for dataset
Hey there, thanks for your amazing work!
I've added a few things to your repo which I'd like to offer as a contribution:
- Reorganized the source code into a Python package that's easy to install
- Added some type hints to a bunch of functions
- Created a typed dict "schema" for the data so that the contents of the datasets can be easily understood directly in code
- Simplified the argument parsing logic using dataclasses and simple-parsing
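
To give a rough idea of what the last two points look like, here is a minimal sketch. All names in it (`Turn`, `Conversation`, `TrainArgs`, `parse_args`) and all fields are hypothetical and do not mirror the actual dataset or options. It also builds the parser with the standard library's `argparse` instead of simple-parsing so that it runs without extra dependencies.

```python
import argparse
from dataclasses import dataclass, fields
from typing import List, Optional, TypedDict


class Turn(TypedDict):
    """One utterance in a conversation (hypothetical fields)."""
    speaker: str
    text: str


class Conversation(TypedDict):
    """One dataset entry (hypothetical fields)."""
    convo_id: int
    turns: List[Turn]


@dataclass
class TrainArgs:
    """Command-line options declared once, with types and defaults."""
    learning_rate: float = 3e-5
    batch_size: int = 16
    epochs: int = 10


def parse_args(argv: Optional[List[str]] = None) -> TrainArgs:
    """Build an argparse parser from the dataclass fields and parse argv."""
    parser = argparse.ArgumentParser()
    for f in fields(TrainArgs):
        # Use the type of the default value as the converter for each flag.
        parser.add_argument(f"--{f.name}", type=type(f.default), default=f.default)
    return TrainArgs(**vars(parser.parse_args(argv)))


if __name__ == "__main__":
    args = parse_args(["--batch_size", "32"])
    print(args)
```

The typed dicts are plain dictionaries at runtime, so existing code that indexes into the loaded JSON keeps working; the annotations only help editors and type checkers.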
Let me know if you have any questions. Thanks again!
Hi Fabrice, thanks for the contributions, which all look really helpful! However, it seems you have also changed the folder structure of the repo by adding an abcd directory. Would it be possible to include your changes, but have the main structure remain the same? The updates themselves (such as adding better type hints) are great!
Hey @dchen-asapp, thanks for taking a look.
It's not possible to create a Python package without the folder structure changing slightly (at least for the code).
What's worrying about the folder structure changing, if you don't mind me asking? Are you worried it might break some of your data pipelines or automation scripts?
I'm mostly afraid that the main training script will break since these things are often path dependent.
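
One common way to reduce this kind of path dependence is to resolve data files relative to the package's own location instead of the current working directory. The following is only a sketch under assumptions: the helper names `repo_root` and `data_path` are hypothetical, and it assumes the package directory (for example `abcd`) sits directly under the repo root next to a `data` directory, which may not match the actual layout.

```python
from pathlib import Path
from typing import Union


def repo_root(module_file: Union[str, Path]) -> Path:
    """Return the repo root, assuming the module lives in <root>/<package>/."""
    return Path(module_file).resolve().parent.parent


def data_path(module_file: Union[str, Path], *parts: str) -> Path:
    """Build a path under <root>/data, independent of the working directory."""
    return repo_root(module_file).joinpath("data", *parts)


if __name__ == "__main__":
    # Inside a module of the package, pass __file__ as the anchor.
    print(data_path(__file__, "example.json"))
```

With paths built this way, the training script can be launched from any directory without relying on where it was started.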