Add swiftformer
What does this PR do?
Adds 'SwiftFormer' into huggingface/transformers
Fixes # (issue)
Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Did you read the contributor guideline, Pull Request section?
- [x] Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case. https://github.com/huggingface/transformers/issues/22685
- [x] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- [ ] Did you write any new necessary tests?
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.
@amyeroberts @NielsRogge @alaradirik
Are there plans for tensorflow version? I am interested in having the tf model available for use in a downstream task in my work/research
Hi @shehanmunasinghe, thanks for opening this PR!
Could you make sure to fill out all the necessary documentation for the model in the README and swiftformer.mdx file?
Quick question about the modeling code - it seems that all of the model components are copied from ViT i.e. their architecture and forward pass are exactly the same - is this correct?
@D-Roberts I don't know of anyone working on the TensorFlow version of this, looking through the open PRs or issues. @shehanmunasinghe - do you know of anyone who is working on a TF port?
Hi @amyeroberts, Thanks for your response.
Please note that this is a Work In Progress (WIP) pull request. The changes to the modeling code and the documentation will be reflected once I push them.
Hi @D-Roberts, currently I'm not aware of anyone working on the TensorFlow version of this.
@shehanmunasinghe OK - sounds good :) Let us know when the PR is ready to review. In the meantime, please don't hesitate if there are any questions.
@D-Roberts - would you be interested in porting this model once the pytorch version is merged in?
@amyeroberts I am still working on porting the Efficientformer; I am interested in having both in tf to train in some downstream tasks / research... I would like to do the port for swiftformer too but can't commit to it right now due to time constraints (I do this in my spare time).. I'll revisit after I am done with the efficientformer and after the swiftformer torch pr here is merged too.
@D-Roberts Of course, no worries, and thank you for your work on adding EfficientFormer :) I've opened an issue - #22771 - to add the TF version of this model where future discussions on how, who and when can be organised.
The documentation is not available anymore as the PR was closed or merged.
Hi @amyeroberts , this is now ready for your review.
Hi @amyeroberts ,
There is one test case failing (examples_torch). This happened only after I merged recent changes from the main branch. Could you please help me identify what's causing this issue?
https://app.circleci.com/pipelines/github/huggingface/transformers/62802/workflows/2558056a-de51-44b6-9c22-8ba3b67d127a/jobs/774714?invite=true#step-111-2647
Hi @amyeroberts , I have resolved the issues that were raised during the code review. Please take a look.
@shehanmunasinghe Great! I'm away for a few days, but will re-review when I'm back at my computer at the start of next week.
Hi @amyeroberts, thanks for your time and effort in reviewing this. This is my first pull request on this repo and I'm glad to hear your constructive comments. I have applied the suggestions you made and updated the code again.
Hi @amyeroberts , I have fixed those issues and pushed the updated code.
However, as indicated here one test is failing, though this has nothing to do with tests/models/whisper/test_modeling_whisper.py.
FAILED tests/models/whisper/test_modeling_whisper.py::WhisperModelTest::test_pt_tf_model_equivalence - AssertionError: 1.04904175e-05 not less than or equal to 1e-05 : outputs.encoder_hidden_states_0: Difference between PyTorch and TF is 1.049041748046875e-05 (>= 1e-05).
Hi @amyeroberts , I have fixed those issues and pushed the updated code.
However, as indicated here one test is failing, though this has nothing to do with
tests/models/whisper/test_modeling_whisper.py.
FAILED tests/models/whisper/test_modeling_whisper.py::WhisperModelTest::test_pt_tf_model_equivalence - AssertionError: 1.04904175e-05 not less than or equal to 1e-05 : outputs.encoder_hidden_states_0: Difference between PyTorch and TF is 1.049041748046875e-05 (>= 1e-05).
All checks are passing now.
Hi @amyeroberts , thanks for approving this PR. I have updated everything and now I think it can be merged.