transformers icon indicating copy to clipboard operation
transformers copied to clipboard

Add swiftformer

Open shehanmunasinghe opened this issue 2 years ago • 11 comments

What does this PR do?

Adds 'SwiftFormer' into huggingface/transformers

Fixes # (issue)

Before submitting

  • [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [x] Did you read the contributor guideline, Pull Request section?
  • [x] Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case. https://github.com/huggingface/transformers/issues/22685
  • [x] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • [ ] Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

@amyeroberts @NielsRogge @alaradirik

shehanmunasinghe avatar Apr 10 '23 10:04 shehanmunasinghe

Are there plans for tensorflow version? I am interested in having the tf model available for use in a downstream task in my work/research

D-Roberts avatar Apr 10 '23 22:04 D-Roberts

Hi @shehanmunasinghe, thanks for opening this PR!

Could you make sure to fill out all the necessary documentation for the model in the README and swiftformer.mdx file?

Quick question about the modeling code - it seems that all of the model components are copied from ViT i.e. their architecture and forward pass are exactly the same - is this correct?

@D-Roberts I don't know of anyone working on the TensorFlow version of this, looking through the open PRs or issues. @shehanmunasinghe - do you know of anyone who is working on a TF port?

amyeroberts avatar Apr 13 '23 16:04 amyeroberts

Hi @amyeroberts, Thanks for your response.

Please note that this is a Work In Progress (WIP) pull request. The changes to the modeling code and the documentation will be reflected once I push them.

Hi @D-Roberts, currently I'm not aware of anyone working on the TensorFlow version of this.

shehanmunasinghe avatar Apr 14 '23 05:04 shehanmunasinghe

@shehanmunasinghe OK - sounds good :) Let us know when the PR is ready to review. In the meantime, please don't hesitate if there are any questions.

@D-Roberts - would you be interested in porting this model once the pytorch version is merged in?

amyeroberts avatar Apr 14 '23 10:04 amyeroberts

@amyeroberts I am still working on porting the Efficientformer; I am interested in having both in tf to train in some downstream tasks / research... I would like to do the port for swiftformer too but can't commit to it right now due to time constraints (I do this in my spare time).. I'll revisit after I am done with the efficientformer and after the swiftformer torch pr here is merged too.

D-Roberts avatar Apr 14 '23 14:04 D-Roberts

@D-Roberts Of course, no worries, and thank you for your work on adding EfficientFormer :) I've opened an issue - #22771 - to add the TF version of this model where future discussions on how, who and when can be organised.

amyeroberts avatar Apr 14 '23 14:04 amyeroberts

The documentation is not available anymore as the PR was closed or merged.

Hi @amyeroberts , this is now ready for your review.

shehanmunasinghe avatar Apr 15 '23 07:04 shehanmunasinghe

Hi @amyeroberts ,

There is one test case failing (examples_torch). This happened only after I merged recent changes from the main branch. Could you please help me identify what's causing this issue?

https://app.circleci.com/pipelines/github/huggingface/transformers/62802/workflows/2558056a-de51-44b6-9c22-8ba3b67d127a/jobs/774714?invite=true#step-111-2647

shehanmunasinghe avatar Apr 23 '23 16:04 shehanmunasinghe

Hi @amyeroberts , I have resolved the issues that were raised during the code review. Please take a look.

shehanmunasinghe avatar Apr 27 '23 19:04 shehanmunasinghe

@shehanmunasinghe Great! I'm away for a few days, but will re-review when I'm back at my computer at the start of next week.

amyeroberts avatar Apr 28 '23 13:04 amyeroberts

Hi @amyeroberts, thanks for your time and effort in reviewing this. This is my first pull request on this repo and I'm glad to hear your constructive comments. I have applied the suggestions you made and updated the code again.

shehanmunasinghe avatar May 06 '23 07:05 shehanmunasinghe

Hi @amyeroberts , I have fixed those issues and pushed the updated code.

However, as indicated here one test is failing, though this has nothing to do with tests/models/whisper/test_modeling_whisper.py.

FAILED tests/models/whisper/test_modeling_whisper.py::WhisperModelTest::test_pt_tf_model_equivalence - AssertionError: 1.04904175e-05 not less than or equal to 1e-05 : outputs.encoder_hidden_states_0: Difference between PyTorch and TF is 1.049041748046875e-05 (>= 1e-05).

shehanmunasinghe avatar May 10 '23 05:05 shehanmunasinghe

Hi @amyeroberts , I have fixed those issues and pushed the updated code.

However, as indicated here one test is failing, though this has nothing to do with tests/models/whisper/test_modeling_whisper.py.

FAILED tests/models/whisper/test_modeling_whisper.py::WhisperModelTest::test_pt_tf_model_equivalence - AssertionError: 1.04904175e-05 not less than or equal to 1e-05 : outputs.encoder_hidden_states_0: Difference between PyTorch and TF is 1.049041748046875e-05 (>= 1e-05).

All checks are passing now.

shehanmunasinghe avatar May 11 '23 05:05 shehanmunasinghe

Hi @amyeroberts , thanks for approving this PR. I have updated everything and now I think it can be merged.

shehanmunasinghe avatar May 12 '23 05:05 shehanmunasinghe