Release 0.12.2 is broken due to #4480 and #4513
🐛 Bug
https://github.com/facebookresearch/fairseq/pull/4480 was merged on Jun 16th, 2022, but then reverted (by https://github.com/facebookresearch/fairseq/commit/956fcf495b2d5d696ba114520363f82148a8a649) on Jun 22nd.
However, it seems the 0.12.2 release was cut in between (despite the release date saying Jun 27th).
To Reproduce
E.g. pip install fairseq, then go to where the files have been installed and look at fairseq/modules/transformer_layer.py: the __init__() function of TransformerEncoderLayerBase contains lots of references to BT.
Also, models/transformer/transformer_layer.py has the load logic under forward_scriptable(). A quick way to check the installed wheel is sketched below.
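A minimal sketch for confirming what is in the installed wheel without hunting for the site-packages path by hand (assumes fairseq 0.12.2 is installed):

```python
# Print where pip put the file and whether the BT code is present.
import inspect
import fairseq.modules.transformer_layer as tl

src = inspect.getsource(tl.TransformerEncoderLayerBase.__init__)
print(tl.__file__)   # path of the installed module
print("BT" in src)   # True on the broken 0.12.2 wheel
```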
(Aside: shouldn't this kind of version-detection code be run once globally, with the result added to cfg, rather than living in the forward() function, which is going to get called a lot? See the sketch below.)
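For what it's worth, here is a minimal sketch of what I mean; the flag and class names are hypothetical, not fairseq's actual code:

```python
import torch

# Parsed once at import time; forward() then reads a cached boolean instead
# of re-inspecting torch.__version__ on every batch.
_TORCH_GE_1_12 = tuple(int(p) for p in torch.__version__.split(".")[:2]) >= (1, 12)

class MyLayer(torch.nn.Module):
    def forward(self, x):
        if _TORCH_GE_1_12:
            return x  # hypothetical BetterTransformer fast path
        return x      # hypothetical fallback path
```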
Additional context
The revert commit describes how this was causing a test failure. (So I think you need to check that your release procedure runs the tests on the release-candidate branch!)
I think it is also the cause of #4518, and possibly of #4740?
But beyond that, the code in transformer_layer.py was also buggy: when loading a big transformer model (created with a version of fairseq prior to this), that code added a number of variables to every encoder layer, which added 75 million float32 weights to the saved model. It has taken me a day and a half of troubleshooting to narrow it down to this "BT" code.
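If anyone else needs to check whether a checkpoint picked up the extra weights, this is roughly how I counted them (a sketch; the filename is a placeholder, and it assumes a standard fairseq checkpoint whose state dict lives under the "model" key):

```python
import torch

ckpt = torch.load("checkpoint_best.pt", map_location="cpu")
state = ckpt["model"]

# Total parameter count; compare against the same architecture saved by 0.12.1.
total = sum(t.numel() for t in state.values() if torch.is_tensor(t))
print(f"{total:,} float values across {len(state)} tensors")

# Listing the keys makes the unexpected per-layer entries easy to spot.
for name in sorted(state):
    print(name)
```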
(BTW, it was also not mentioned anywhere that "BT" refers to the BetterTransformer code added in PyTorch 1.12.)
I would suggest that 0.12.3 be released, based on the intended Jun 27th state of main?
Aha! It was re-landed on Jun 24th (see https://github.com/facebookresearch/fairseq/pull/4513) and then backed out a second time on Jul 14th. So that explains how it ended up in the 0.12.2 release.
Hi @DarrenCook, you are right, and loading 0.12.2 checkpoints with 0.12.1 is also buggy/risky. A 0.12.3 release does indeed seem to be the right action. Kind regards
Using pip install fairseq==0.12.1 has been working for me so far.
I had to throw away the models made with 0.12.2: the extra weights added to the encoder seem to be part of the model being trained, i.e. it has effectively created a different architecture. Which also means that supporting models built with 0.12.2 in 0.12.3 and later is not going to work.
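To see that the architectures really diverged, diffing the parameter names between an old and a new checkpoint makes it obvious (a sketch; the filenames are placeholders):

```python
import torch

old = set(torch.load("model_0.12.1.pt", map_location="cpu")["model"])
new = set(torch.load("model_0.12.2.pt", map_location="cpu")["model"])

# Keys present only in the 0.12.2 checkpoint are the extra BT weights.
print(sorted(new - old))
```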
@frank-wei can you investigate please? @dianaml0 @cbalioglu do we need to cut a new release to prevent people from working on the broken 0.12.2?
Yeah, we should have a 0.12.3 release and mark 0.12.2 as deprecated. Let me sync up internally.
Yes, the code has some problems (it brings in more variables and takes more memory) and we decided to back it out: https://github.com/facebookresearch/fairseq/pull/4568
@cbalioglu any news on 0.12.3 ?
Things like MMS have been added to the repo since the latest release; would it be possible to have a new release that includes the MMS things? I'm asking as I would like to package this for NixOS. Using an officially released version rather than the latest master would be great. Thank you for the software!
Hey all, any news on a new release? We are currently unable to use this repo because we have a dependency on numpy > 1.24, which requires a fix for the deprecated (now removed) np.float alias; that seems to have been taken care of here: https://github.com/ncoish/fairseq/commit/a24fdf2d1b36e699d8f4e3efd33b7b78d6a02e7e
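For anyone hitting the same wall: np.float was deprecated in NumPy 1.20 and removed in 1.24, so old code raises AttributeError there. The usual fix (which appears to be what that commit does) is to switch to the builtin or an explicit dtype:

```python
import numpy as np

# Fails on NumPy >= 1.24: AttributeError: module 'numpy' has no attribute 'float'
# a = np.zeros(3, dtype=np.float)

a = np.zeros(3, dtype=float)       # builtin float works on all NumPy versions
b = np.zeros(3, dtype=np.float64)  # or name the width explicitly
```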