Support YAML configs
🚀 Feature
Make classy_train.py work with YAML config files in addition to JSON.
Motivation
YAML is better than JSON, period.
Pitch
We already have experimental support for YAML by using the Hydra library (https://hydra.cc). We need to use it more, make sure it's stable and document it.
Alternatives
Support YAML without Hydra. There are many benefits to using Hydra, though.
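For comparison, the non-Hydra alternative could look roughly like the sketch below: a loader that picks a parser from the file extension. This is only an illustration, not code from ClassyVision. The function name `load_config` is hypothetical, and the YAML branch assumes the third-party PyYAML package is installed.

```python
import json
from pathlib import Path


def load_config(path):
    """Load a config dict from a .json, .yaml, or .yml file (hypothetical helper)."""
    path = Path(path)
    suffix = path.suffix.lower()
    text = path.read_text()
    if suffix == ".json":
        return json.loads(text)
    if suffix in (".yaml", ".yml"):
        # Assumption: PyYAML is available. It is imported lazily so that
        # JSON-only users do not need it installed.
        import yaml
        return yaml.safe_load(text)
    raise ValueError(f"Unsupported config format: {suffix!r}")
```

A loader like this covers file parsing only; it offers none of Hydra's composition, command-line override, or multi-run features.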
Additional context
https://hydra.cc https://github.com/facebookresearch/hydra
Has there been any progress on this? Even the support for hydra is undocumented and seems buggy or at least hard to get working.
Hi @benjamindkilleen - we indeed have undocumented support for Hydra available. What's missing is a tutorial for using Hydra. If you have hydra installed in your environment, we use our hydra based entry point, and otherwise we just use the regular JSON based config args. https://github.com/facebookresearch/ClassyVision/pull/536 has some examples about hydra usage.
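For readers following along, the fallback described above can be pictured with the sketch below. It is not the actual classy_train.py logic: the function name and the returned labels are hypothetical, and it only shows the general pattern of checking whether an optional dependency is importable.

```python
import importlib.util


def select_entry_point(optional_dependency="hydra"):
    """Return a label for the entry point to use (hypothetical helper).

    "hydra" is returned when the optional dependency can be imported,
    "json" otherwise.
    """
    # find_spec returns None when a top-level module cannot be found.
    if importlib.util.find_spec(optional_dependency) is not None:
        return "hydra"
    return "json"


if __name__ == "__main__":
    print(f"Selected entry point: {select_entry_point()}")
```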
"Even the support for hydra is undocumented and seems buggy"
Regarding the buggy part, are you running into any issues?
Yes, I had some issues when I installed, due to a "buggy extension," but I initially did not install from source. I re-installed from source and things are working more smoothly, although a classy-vision set of configs would probably help zero-to-sixty time for new users. (I had to spend a fair bit of time reading the hydra docs, which are very good but take time to translate into the context of classy vision.)
Got it. We do bundle a few hydra based configs which should provide an idea about the structure. Do the examples in the summary of https://github.com/facebookresearch/ClassyVision/pull/536 help?
The primary reason this issue is still open is the lack of documentation.
The configs in classy_vision/hydra/conf were helpful, yes, in setting up my own library of configs, although they would be tricky to find for someone getting started. From a new user's perspective, I would almost rather classy_vision required hydra-core as a dependency and started with multiple folders of pre-configured YAML files for different datasets, models, losses, etc. right off the bat, without the JSON option at all. As soon as I got my own hydra conf files set up, I got rid of the json directory. Given Hydra's extensive features for running experiments, such as multi-run, any user who understood it would be crazy not to use it.
If there are ongoing documentation efforts for classy_vision + hydra, I would suggest they include examples for multi-run experiments, such as hyperparameter search or using the same overall multi-domain dataset with different domains as the test set (my particular use-case). Happy to contribute also.
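To make the multi-run idea concrete: Hydra's multi-run mode (the `--multirun` / `-m` flag) launches one job per combination of comma-separated override values. The sketch below mimics that expansion in plain Python. It is not Hydra's implementation, and the override keys (`optimizer.lr`, `dataset.test_domain`) are made-up examples rather than real ClassyVision config keys.

```python
from itertools import product


def expand_sweep(overrides):
    """Expand "key=v1,v2" overrides into one override list per job."""
    keys, choices = [], []
    for item in overrides:
        key, _, values = item.partition("=")
        keys.append(key)
        choices.append(values.split(","))
    # Cartesian product: every combination of the swept values.
    return [
        [f"{key}={value}" for key, value in zip(keys, combo)]
        for combo in product(*choices)
    ]


# Hypothetical sweep: two learning rates times three held-out test domains.
jobs = expand_sweep(["optimizer.lr=0.1,0.01", "dataset.test_domain=a,b,c"])
for job in jobs:
    print(" ".join(job))
```

With these inputs the sweep yields six jobs, one per learning rate and test domain pair, which matches the "same dataset, different held-out domain" use case described above.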
Got it, thanks for the feedback! While hydra is definitely a better option than JSON, our plan is to allow a slow transition. I agree though that currently the support is hidden. Maybe we can think of a better way of enabling hydra (or making it the default).
Regarding examples of Hydra-specific features like hyperparameter search, I agree this would be really helpful, and it would be great to have these examples in a Hydra tutorial!