parmeet
cc: @brianjo
During training, N needs to be greater than 1 because the model expects more than 1 value per channel. If you run the model in eval mode (`model.eval()`), it is...
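To illustrate why a single value per channel is a problem in training mode, here is a minimal pure-Python sketch of batch normalization over one channel. It is not PyTorch's implementation: the function name `batch_norm_channel`, its parameters, and the error message are all hypothetical, and it assumes the model contains a batch-norm-style layer that uses batch statistics during training and stored running statistics in eval mode.

```python
def batch_norm_channel(values, running_mean=0.0, running_var=1.0,
                       training=True, eps=1e-5):
    """Normalize the values of a single channel (simplified sketch).

    Hypothetical helper for illustration only, not a PyTorch API.
    """
    if training:
        # Batch statistics are meaningless for a single value: the variance
        # would be 0 and every output would collapse to 0.
        if len(values) <= 1:
            raise ValueError("expected more than 1 value per channel when training")
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)  # biased variance
    else:
        # Eval mode: rely on previously accumulated running statistics,
        # so a single value is fine.
        mean, var = running_mean, running_var
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


# Training with two values per channel works.
train_out = batch_norm_channel([1.0, 3.0], training=True)

# Eval with a single value works because no batch statistics are computed.
eval_out = batch_norm_channel([5.0], training=False)

# Training with a single value is rejected.
try:
    batch_norm_channel([5.0], training=True)
    single_value_rejected = False
except ValueError:
    single_value_rejected = True
```

In this sketch the eval path never looks at the batch size, which is why switching to eval mode sidesteps the restriction.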
In principle, the conditions under which the freeze option is meaningful are: 1) Part of the model is already pre-trained. 2) The pre-trained part is used in conjunction with additional layers that require...
> In terms of the implementation details, I'm thinking about the behaviour of `freeze`, besides setting `requires_grad=False`, we need to be careful about `model.eval()` as it changes the behaviour of...
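
A minimal sketch of the two concerns mentioned here, assuming PyTorch is installed. The `encoder`/`head` split and the `freeze` helper are hypothetical stand-ins, not the API under discussion. It shows that `requires_grad_(False)` stops gradient updates, while `eval()` separately controls mode-dependent layers such as `BatchNorm` (and `Dropout`), and that a later `model.train()` call puts the frozen part back into training mode unless `eval()` is re-applied.

```python
import torch
from torch import nn

# Hypothetical stand-ins: a "pre-trained" encoder followed by a new head.
encoder = nn.Sequential(nn.Linear(8, 8), nn.BatchNorm1d(8))
head = nn.Linear(8, 2)
model = nn.Sequential(encoder, head)


def freeze(module: nn.Module) -> None:
    """Hypothetical helper: stop gradient updates and fix layer behavior."""
    module.requires_grad_(False)  # no gradients for these parameters
    module.eval()                 # BatchNorm uses running stats, Dropout is off


freeze(encoder)

# model.train() recursively sets every submodule to training mode,
# including the frozen encoder, so eval() has to be re-applied afterwards.
model.train()
encoder.eval()

# Only pass trainable parameters to the optimizer.
optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=0.1
)

running_mean_before = encoder[1].running_mean.clone()

loss = model(torch.randn(4, 8)).sum()
loss.backward()
optimizer.step()

# The BatchNorm running statistics of the frozen encoder are untouched.
running_mean_unchanged = torch.equal(encoder[1].running_mean, running_mean_before)
```

Without the `encoder.eval()` call after `model.train()`, the `BatchNorm1d` layer would keep updating its running statistics even though its parameters have `requires_grad=False`.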
I am likely missing some context here, so just a few clarifying questions: 1) My understanding so far is that we typically freeze the lower layers (closer to the input) and train...
Thanks @nateanl and @mthrok for providing additional context and clarification. I don't think I would worry much about where to place the freeze options, and depending on the use...
Hi @mreso, I am not sure what the expected behavior should be when an implementation is not available on GPU, but to me it's undefined behavior. I wonder if we...
Thanks @brando90 for raising the issue. Yes, it looks like a problem with the Conda installation. Could you please try with pip using version 0.8.1 (the supported version for 3.9, I will...
> > Thanks @brando90 for raising the issue. Yes, looks like the problem with Conda installation. could you please try with pip using version 0.8.1 (the supported version for 3.9,...
https://github.com/pytorch/text/pull/1874 resolves this.