What does this PR do?

Adds 'SwiftFormer' into huggingface/transformers

Fixes # (issue)

Before submitting

[ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
[x] Did you read the contributor guideline, Pull Request section?
[x] Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case. https://github.com/huggingface/transformers/issues/22685
[x] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
[ ] Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

@amyeroberts @NielsRogge @alaradirik

Apr 10 '23 10:04 shehanmunasinghe

Are there plans for tensorflow version? I am interested in having the tf model available for use in a downstream task in my work/research

Apr 10 '23 22:04 D-Roberts

Hi @shehanmunasinghe, thanks for opening this PR!

Could you make sure to fill out all the necessary documentation for the model in the README and swiftformer.mdx file?

Quick question about the modeling code - it seems that all of the model components are copied from ViT i.e. their architecture and forward pass are exactly the same - is this correct?

@D-Roberts I don't know of anyone working on the TensorFlow version of this, looking through the open PRs or issues. @shehanmunasinghe - do you know of anyone who is working on a TF port?

Apr 13 '23 16:04 amyeroberts

Hi @amyeroberts, Thanks for your response.

Please note that this is a Work In Progress (WIP) pull request. The changes to the modeling code and the documentation will be reflected once I push them.

Hi @D-Roberts, currently I'm not aware of anyone working on the TensorFlow version of this.

Apr 14 '23 05:04 shehanmunasinghe

@shehanmunasinghe OK - sounds good :) Let us know when the PR is ready to review. In the meantime, please don't hesitate if there are any questions.

@D-Roberts - would you be interested in porting this model once the pytorch version is merged in?

Apr 14 '23 10:04 amyeroberts

@amyeroberts I am still working on porting the Efficientformer; I am interested in having both in tf to train in some downstream tasks / research... I would like to do the port for swiftformer too but can't commit to it right now due to time constraints (I do this in my spare time).. I'll revisit after I am done with the efficientformer and after the swiftformer torch pr here is merged too.

Apr 14 '23 14:04 D-Roberts

@D-Roberts Of course, no worries, and thank you for your work on adding EfficientFormer :) I've opened an issue - #22771 - to add the TF version of this model where future discussions on how, who and when can be organised.

Apr 14 '23 14:04 amyeroberts

The documentation is not available anymore as the PR was closed or merged.

Apr 14 '23 17:04 HuggingFaceDocBuilderDev

Hi @amyeroberts , this is now ready for your review.

Apr 15 '23 07:04 shehanmunasinghe

Hi @amyeroberts ,

There is one test case failing (examples_torch). This happened only after I merged recent changes from the main branch. Could you please help me identify what's causing this issue?

https://app.circleci.com/pipelines/github/huggingface/transformers/62802/workflows/2558056a-de51-44b6-9c22-8ba3b67d127a/jobs/774714?invite=true#step-111-2647

Apr 23 '23 16:04 shehanmunasinghe

Hi @amyeroberts , I have resolved the issues that were raised during the code review. Please take a look.

Apr 27 '23 19:04 shehanmunasinghe

@shehanmunasinghe Great! I'm away for a few days, but will re-review when I'm back at my computer at the start of next week.

Apr 28 '23 13:04 amyeroberts

Hi @amyeroberts, thanks for your time and effort in reviewing this. This is my first pull request on this repo and I'm glad to hear your constructive comments. I have applied the suggestions you made and updated the code again.

May 06 '23 07:05 shehanmunasinghe

Hi @amyeroberts , I have fixed those issues and pushed the updated code.

However, as indicated here one test is failing, though this has nothing to do with tests/models/whisper/test_modeling_whisper.py.

FAILED tests/models/whisper/test_modeling_whisper.py::WhisperModelTest::test_pt_tf_model_equivalence - AssertionError: 1.04904175e-05 not less than or equal to 1e-05 : outputs.encoder_hidden_states_0: Difference between PyTorch and TF is 1.049041748046875e-05 (>= 1e-05).

May 10 '23 05:05 shehanmunasinghe

Hi @amyeroberts , I have fixed those issues and pushed the updated code.

However, as indicated here one test is failing, though this has nothing to do with tests/models/whisper/test_modeling_whisper.py.

FAILED tests/models/whisper/test_modeling_whisper.py::WhisperModelTest::test_pt_tf_model_equivalence - AssertionError: 1.04904175e-05 not less than or equal to 1e-05 : outputs.encoder_hidden_states_0: Difference between PyTorch and TF is 1.049041748046875e-05 (>= 1e-05).

All checks are passing now.

May 11 '23 05:05 shehanmunasinghe

Hi @amyeroberts , thanks for approving this PR. I have updated everything and now I think it can be merged.

May 12 '23 05:05 shehanmunasinghe

Add swiftformer

What does this PR do?

Before submitting

Who can review?