Add Seaformer model
What does this PR do?
Fixes #21668
SeaFormer is a two-branch architecture with a Squeeze-enhanced Axial Transformer for semantic segmentation on mobile devices.
Supersedes #21774
Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Did you read the contributor guideline, Pull Request section?
- [x] Was this discussed/approved via a Github issue or the forum? #21668
- [x] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- [ ] Did you write any new necessary tests?
Who can review?
@alaradirik thanks for offering help with this PR, please let me know about any changes required.
Hi @inderpreetsingh01, thank you! You can ping me once the PR is ready to be reviewed.
You can follow the official guidelines to learn how to prepare the configuration, image processor, and modeling files to replicate the original work, such that forward propagating an image through the HF implementation and the original implementation yields the same results.
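For reference, the parity check usually looks roughly like the sketch below (`original_model`, `hf_model`, and `hf_image_processor` are placeholders for whatever your conversion notebook produces, and the tolerance is an assumption):

```python
import torch
from PIL import Image

# Placeholders: load the original SeaFormer checkpoint and the ported HF model/processor beforehand.
image = Image.open("example.jpg")
pixel_values = hf_image_processor(images=image, return_tensors="pt").pixel_values

with torch.no_grad():
    original_logits = original_model(pixel_values)
    hf_logits = hf_model(pixel_values).logits

# The two implementations should agree up to numerical tolerance.
assert torch.allclose(original_logits, hf_logits, atol=1e-4)
```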
The PR is just initialized using SegFormer; I can do a review once the SeaFormer model is implemented.
Hi @alaradirik, I have added the SeaFormer implementation in the modeling file and updated the conversion and configuration scripts. I have run a forward pass in a notebook and the output is the same as the original SeaFormer model's. Can you please review it and let me know of any changes required? I am yet to do the testing part.
Hi @alaradirik, thanks for the detailed review :) I have uploaded the converted model to the hub here: Inderpreet01/seaformer-semantic-segmentation-large. I will work on your comments and update the PR. Thanks
Thank you! Feel free to ping me when you'd like me to do the final review
Hi @alaradirik, I have worked on the changes you mentioned; two tests are failing in test_modeling_seaformer.py:
- SeaformerModelTest::test_initialization - AssertionError: -6.169999778649071e-06 not found in [0.0, 1.0]. I initialize the parameters from a normal distribution, so negative values are expected.
- SeaformerModelTest::test_config - ValueError: The following keys were not properly set in the config: label2id and id2label have 150 items, but 1 item is expected. The config_common_kwargs dictionary in test_configuration_common.py has id2label and label2id keys whose values are dictionaries with a single item.
Can you please help me with them? Thanks.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Also, I have worked on the checks and most of them are successful; I will need your help with the remaining three checks. Thanks.
Hi @inderpreetsingh01, I'll be taking a look shortly!
Hi @inderpreetsingh01, I took a look at the code and failed tests and saw that some of the failures are due to unrelated models. Could you rebase to main by clicking on the Sync fork button on your branch?
The modeling test failure stemming from the label mapping is probably just due to setting a num_labels attribute within SeaformerConfig. All config classes inherit from the PretrainedConfig class, which computes the num_labels based on the id2label and label2id attributes, which are initialized to have 2 labels by default. You should remove the num_labels attribute and overwrite the default id2label and label2id attributes within the conversion script. You can take a look at the configuration, conversion and test scripts of MaskFormer and Mask2Former to see how that's done.
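As an illustration only (the label-map file and hub repo below are assumptions based on what other segmentation conversion scripts do for ADE20k), the conversion script could set the mappings roughly like this:

```python
import json

from huggingface_hub import hf_hub_download
from transformers import SeaformerConfig  # the config class added in this PR

config = SeaformerConfig()

# ADE20k has 150 classes; the label file below is the one other segmentation
# conversion scripts pull from the hub (an assumption for SeaFormer).
filepath = hf_hub_download("huggingface/label-files", "ade20k-id2label.json", repo_type="dataset")
id2label = {int(k): v for k, v in json.load(open(filepath)).items()}

config.id2label = id2label
config.label2id = {v: k for k, v in id2label.items()}
# num_labels is now derived from id2label by PretrainedConfig, so it is not set explicitly.
```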
Hope this helps!
Hi @alaradirik, thanks for your response. Removing num_labels from the config has resolved that test case; can you please help with this test case as well?
SeaformerModelTest::test_initialization - AssertionError: -6.169999778649071e-06 not found in [0.0, 1.0]
I initialize the parameters from a normal distribution, so negative values are expected.
I have looked at MaskFormer and SegFormer but am not able to figure this out.
Actually, this test is skipped for the SegFormer model, which also initializes weights from a normal distribution.
Hi @inderpreetsingh01, sorry for my late reply, I was off due to moving. You can overwrite the test by creating a test with the same name - test_initialization - since the weight initialization is in line with the original model. You can take a look at the common test functions defined over here to see what this test does.
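A minimal sketch of what the override could look like (shown as a standalone unittest.TestCase only to keep the snippet self-contained, and the skip reason wording is mine; you could also re-implement the check with bounds that match SeaFormer's init):

```python
import unittest


class SeaformerModelTest(unittest.TestCase):
    # Defining a method with the same name overrides the common test of the same name.
    @unittest.skip(
        reason="SeaFormer initializes weights from a normal distribution, "
        "so the common check that values fall in [0.0, 1.0] does not apply"
    )
    def test_initialization(self):
        pass
```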
Hi @alaradirik, thanks for the reply. Where should I create this test with the same name?
Hi @alaradirik, can you please do the final review? Thanks
@inderpreetsingh01 Thanks for adding this model! Ping me when the PR is ready for review (once all of @alaradirik's comments have been addressed and tests are passing).
@alaradirik thanks for the review. @amyeroberts sure, I will ping you once the model is ready.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.