Add IRIS
What does this PR do?
This PR adds IRIS, a reinforcement-learning agent for sample-efficient RL.
Fixes #30882
Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Did you read the contributor guideline, Pull Request section?
- [x] Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
- [x] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- [x] Did you write any new necessary tests?
Who can review?
@amyeroberts @younesbelkada @NielsRogge @kashif
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.
Ready for review!!!
cc @amyeroberts It would be really helpful if you could take a look at the failing tests in CI, as the PR is ready. Thank you
@RUFFY-369 Regarding the CI tests:
- For `check_code_quality`, you'll need to run `make fixup` locally and push the changes. Make sure you have the library-compatible versions of the formatting libraries by running `pip install -e .[quality]` first.
- For the `check_repo_consistency` check, this is happening because the `torchvision` import isn't protected by `is_torchvision_available` (see the sketch below).
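For reference, a minimal sketch of how such imports are usually guarded in transformers (the exact module layout in the PR may differ):

```python
from transformers.utils import is_torchvision_available

# Guard the torchvision import so that environments without torchvision
# can still import the modeling file and the repo-consistency check passes.
if is_torchvision_available():
    import torchvision
```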
Regarding the PR itself, thanks for opening this PR and contributing to the library! Looking at the modeling file, I think the best way to share this implementation would be directly on the hub, here's a guide: https://huggingface.co/docs/transformers/custom_models. This is the recommended and fastest way to add models to transformers, and we offer as much support as possible there.
The main reason I say this is that the structure and nature of the model break a lot of patterns that we expect in transformers. For example, step methods and the handling of agents, envs, rewards, etc. Perhaps of interest to @Cadene @edbeeching
@amyeroberts, thank you for the feedback. As for `make fixup`, I have been getting this error from the beginning (even after running `pip install -e .[quality]` again):
make: /bin/sh: Argument list too long
make: *** [Makefile:12: modified_only_fixup] Error 127
The torchvision error is now fixed with your help, thanks.
As for the main reason you mentioned: there are only two reinforcement learning models available in transformers, so I thought it would be great to add another one to the library. Regarding usage, I prepared the modeling file so the model can be used the same way Decision Transformer is used, with `from_pretrained()`, while the rest is handled in the user's training script. That's why I jumped on putting time into making this model possible for transformers.
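For illustration, a minimal sketch of that intended usage (the class name comes from this PR; the checkpoint id is a placeholder):

```python
from transformers import IrisModel  # class added in this PR

# Load pretrained weights the same way Decision Transformer is loaded;
# "username/iris-breakout" is a placeholder checkpoint id.
model = IrisModel.from_pretrained("username/iris-breakout")

# The rest (environment loop, reward handling, training) stays in the
# user's own script, as described above.
```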
Also, does this reasoning apply to reinforcement learning models in general, or just this one?
Thanks
> As for the main reason you mentioned: there are only two reinforcement learning models available in transformers, so I thought it would be great to add another one to the library. Regarding usage, I prepared the modeling file so the model can be used the same way Decision Transformer is used, with `from_pretrained()`, while the rest is handled in the user's training script. That's why I jumped on putting time into making this model possible for transformers.
@RUFFY-369 Ah, I see, thanks for pointing that out! I wasn't aware of Decision Transformer in the library. In this case, these patterns are OK.
> I have been getting this error (even after running `pip install -e .[quality]` again):
Could you try running `make style` first? This sometimes works.
If not, you can:
- Take the list of modified files
- Run: `ruff check $(modified_py_files) --fix --exclude examples/research_projects`
- Run: `ruff format $(modified_py_files) --exclude examples/research_projects`
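For concreteness, a rough Python equivalent of those two steps (assuming a local git checkout with a `main` branch and `ruff` installed):

```python
import subprocess

# Python files modified relative to main (assumes a local git checkout)
modified = subprocess.run(
    ["git", "diff", "--name-only", "main", "--", "*.py"],
    capture_output=True, text=True, check=True,
).stdout.split()

if modified:
    # The two ruff invocations suggested above, limited to the modified files
    subprocess.run(["ruff", "check", *modified, "--fix", "--exclude", "examples/research_projects"])
    subprocess.run(["ruff", "format", *modified, "--exclude", "examples/research_projects"])
```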
> @RUFFY-369 Ah, I see, thanks for pointing that out! I wasn't aware of Decision Transformer in the library. In this case, these patterns are OK.
@amyeroberts Yeah, transformers currently has only two reinforcement learning models; the other one is Trajectory Transformer, and both were added in 2022. That's why I jumped on adding to this list for the community, since RL models are few in number compared to other model types. The 'OK' was reassuring, as I put some time into the commits :), thanks.
> Could you try running `make style` first? This sometimes works.
> If not, you can:
> - Take the list of modified files
> - Run: `ruff check $(modified_py_files) --fix --exclude examples/research_projects`
> - Run: `ruff format $(modified_py_files) --exclude examples/research_projects`
Okay, let me try your second solution. Since I pushed all the commits following the guide to adding a model, I have already run `make style` and `make quality`. I will post an update. Thanks
@amyeroberts All the checks have passed successfully
Hi @amyeroberts, I have made the suggested changes and some refactoring in the modeling file to enhance compatibility. Please review the updated files at your convenience. Thank you!
This is a huge piece of work, and it represents only the third model that integrates reinforcement learning (RL) with transformers. If this model is successfully ported, it will significantly benefit the transformers + RL community. It is SOTA in sample-efficient RL among methods without lookahead search on the Atari 100k benchmark, so its successful integration with transformers will provide considerable value for fine-tuning it or training it from scratch on various tasks with the Transformers Trainer. Moreover, achieving full compatibility and a successful port will serve as a blueprint for future RL models based on the transformers architecture to be added to the library, even for the papers' authors.
Following the successful porting, I will create and hyperlink a Colab notebook with a detailed guide on using the model with transformers, including training it from scratch, and I will be more than happy to maintain the model here. :+1:
How did you convert the model file into Hugging Face format? Usually, transformers requires a conversion script as well.
https://github.com/eloialonso/iris_pretrained_models/tree/main/pretrained_models
Maybe uploading a few more models would be good for other people :)
> How did you convert the model file into Hugging Face format? Usually, transformers requires a conversion script as well.
> https://github.com/eloialonso/iris_pretrained_models/tree/main/pretrained_models
@SangbumChoi Thank you for your comment. Basically, I converted the original iris-breakout model weights with a conversion script, run separately, to make them compatible with Hugging Face. The conversion script was experimental and quite simple, so I kept it in a separate folder. I will push it to this branch too, as it worked fine.
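For context, such a conversion script typically just loads the original checkpoint, renames its state-dict keys to the names used by the ported implementation, and saves in the Hugging Face format. A minimal sketch (the class names are taken from this PR, and the key mapping below is made up for illustration, not the actual one used):

```python
import torch
from transformers import IrisConfig, IrisModel  # classes from this PR; IrisConfig assumed

# Load the original IRIS checkpoint released by the authors
original_state_dict = torch.load("iris_breakout.pt", map_location="cpu")

# Rename original parameter names to those expected by the ported model
# (this particular mapping is illustrative only)
converted_state_dict = {
    name.replace("world_model.", "iris.world_model."): tensor
    for name, tensor in original_state_dict.items()
}

model = IrisModel(IrisConfig())
model.load_state_dict(converted_state_dict, strict=False)
model.save_pretrained("iris-breakout-hf")  # Hugging Face format output directory
```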
> Maybe uploading a few more models would be good for other people :)
I just pushed all the converted models to the hub. You can check them out here
soft cc @amyeroberts, thank you
soft ping @amyeroberts PS: I have also created a Colab notebook to perform training with the Hugging Face Trainer class using the ported model, in the same way the original code does :+1:
Hi @amyeroberts, here is the link to the Colab notebook containing a step-by-step guide to training IrisModel with the Hugging Face Trainer. Future users of the model can also refer to it for demo purposes.
Repository link: https://github.com/RUFFY-369/Transformers-tutorials/blob/main/IRIS/train_Iris_agent_on_atari.ipynb
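For anyone skimming, the core of that setup is the standard Trainer pattern; a minimal sketch (the checkpoint id and dataset here are placeholders, see the notebook for the real setup):

```python
from transformers import IrisModel, Trainer, TrainingArguments

model = IrisModel.from_pretrained("username/iris-breakout")  # placeholder checkpoint id
train_dataset = ...  # dataset of collected Atari episodes, built as shown in the notebook

training_args = TrainingArguments(
    output_dir="iris-atari-finetuned",
    per_device_train_batch_size=1,
    num_train_epochs=1,
)

trainer = Trainer(model=model, args=training_args, train_dataset=train_dataset)
trainer.train()
```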
Also @SangbumChoi, please feel free to try the notebook as well.
Cheers!
Thanks for working on adding this model @RUFFY-369 and apologies for the delay in reviewing.
I think this model should be added as a model on the hub. It breaks too many typical transformers patterns and, as such, is hard to review and incorporate into the library.
I've given a first-pass review highlighting things to make it more transformers-like - however, these comments are there to help with the code in general and won't be sufficient to make it ready for merge. Making sure the code follows standard Python formatting would help a lot with reading and understanding here. Please make sure to use clear, explicit variable names (no unnecessary acronyms or single-letter variables) and stick with standard transformers patterns, e.g. no device management inside configs.
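To illustrate that last point: configs in transformers hold only hyperparameters, and device placement is left to the caller. A minimal sketch of the expected shape (the class and attribute names are illustrative):

```python
from transformers import PretrainedConfig

class IrisConfig(PretrainedConfig):
    model_type = "iris"

    def __init__(self, hidden_size=512, num_layers=10, **kwargs):
        # Hyperparameters only - no torch.device objects and no .to(...) calls here
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        super().__init__(**kwargs)

# Device placement is left to the caller, e.g.:
# model = IrisModel(IrisConfig()).to("cuda")
```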
Hi @amyeroberts, thank you for your review and for all the comments on making the model more transformers-compatible so that people can use it on the hub.
Just had three questions:
- Are there any more changes that need to be made to put the model on the hub?
- Secondly, what will become of this PR and the Colab notebook for using this model that I attached above?
- Can any model of this type be added to transformers, for example this?
Otherwise, I will make the necessary changes and put the model on the hub so that it is easy for users to access.
> Are there any more changes that need to be made to put the model on the hub?
None of the comments in the PR review are necessary for putting the model on the hub - you can add the code immediately if you want. They just provide guidelines on making the code more transformers-like, so that it can be understood and is more likely to be compatible with any future features.
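As a rough sketch of that hub route from the custom-models guide (class names assumed from this PR, file names and repo id are placeholders):

```python
from configuration_iris import IrisConfig  # local file with the custom config code
from modeling_iris import IrisModel        # local file with the custom modeling code

# Register the custom classes so they can be loaded from the hub
# via the Auto classes with trust_remote_code=True
IrisConfig.register_for_auto_class()
IrisModel.register_for_auto_class("AutoModel")

model = IrisModel(IrisConfig())           # in practice, load the converted weights here
model.push_to_hub("username/iris-breakout")  # placeholder repo id
```

Loading then works with `AutoModel.from_pretrained("username/iris-breakout", trust_remote_code=True)`.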
> Secondly, what will become of this PR and the Colab notebook for using this model that I attached above?
We would close this PR. For the notebook, I would update it to use the model from the hub, and then link to the notebook on the checkpoint's model page for reference so people can see your good work!
> Can any model of this type be added to transformers, for example this?
Whether a model is integrated into transformers isn't so much about the "type" of model, but about factors such as:
- How easy it is to integrate into the library
- Whether the model offers new functionalities or capabilities to the library
- Potential impact of the model for the community e.g. is it highly requested?
We don't have many RL models, but that doesn't mean it's impossible. One of the most important things transformers provides is a (relatively) standardised API for using models and clean, easy-to-read modeling code. For example, we emphasise separate modeling files over modularity of layers. In the case of this model, the PR is so complex in its current state that it takes a very long time to review, and it deviates from transformers' patterns both for code and usage. Adding a new model requires a lot of resources - both in terms of review time for the maintainers and in terms of ongoing maintenance (tests, updates, etc.) - so we have to be selective about the models we integrate. This is why we've worked on enabling people to add models on the hub - it lets more people add more models without going through the selective (and slow) process of model PRs within the transformers repo.