Rohan Varma
Thanks for raising this issue! I responded in PT: https://github.com/pytorch/pytorch/issues/82963. That said, I'm not sure whether HF uses PyTorch nightlies/latest or a stable version. If we can't get PyTorch updated in HF...
@pytorchbot merge
You can think of the generator and the discriminator as playing a game against each other in which they seek to get "better" (i.e., minimize their respective losses) at the...
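That opposing-objectives game can be sketched numerically with the standard non-saturating GAN losses. This is a minimal pure-Python illustration of the idea, not any particular library's implementation, and the logit values below are made up for the example:

```python
import math

def sigmoid(x):
    # Map a raw logit to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def d_loss(d_real_logit, d_fake_logit):
    # Discriminator wants real samples scored near 1 and fakes near 0
    # (binary cross-entropy on both).
    return -(math.log(sigmoid(d_real_logit))
             + math.log(1.0 - sigmoid(d_fake_logit)))

def g_loss(d_fake_logit):
    # Generator wants its fakes scored near 1 (non-saturating loss).
    return -math.log(sigmoid(d_fake_logit))

# As the discriminator's score on fakes rises (the generator is
# "winning"), the generator's loss falls while the discriminator's
# loss on those same fakes rises -- the two objectives pull in
# opposite directions.
better_fake, worse_fake = 2.0, -2.0
```

So each player's gradient step lowers its own loss at the other's expense, which is exactly the game described above.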
Is there any progress on this issue? Happy to help in any way.
@edenlightning Sounds good, I also pinged the Slack channel for any feedback/discussions.
The PR https://github.com/PyTorchLightning/pytorch-lightning/pull/5141 is ready for review, in case anyone wants to take a look.
This should be fixed in PyTorch nightly now: https://github.com/pytorch/pytorch/pull/83309
Is this with the default configs @kartikayk, or are you setting a higher batch size, which could contribute to activation memory?
@BedirT Thanks for filing this issue! As @RdoubleA mentioned, please run the `tune download` command with the `--ignore-patterns` flag added (this is mentioned in the [config](https://github.com/pytorch/torchtune/blob/main/recipes/configs/llama3/70B_lora.yaml#L6) as well...
@kartikayk To clarify, are we considering removing kv cache entirely or refactoring the implementation to be less intrusive? We would want to keep some implementation of kv cache for efficient...
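For context on what even a minimal KV cache buys during generation: cached keys/values let each decode step attend over all past positions while computing projections only for the newest token. A toy sketch (a hypothetical `KVCache` class with plain Python lists, not torchtune's actual implementation):

```python
import math

class KVCache:
    """Toy per-layer key/value cache for autoregressive decoding.

    Real implementations store tensors; plain float vectors keep
    the sketch self-contained.
    """
    def __init__(self):
        self.keys = []    # one entry per past position
        self.values = []

    def update(self, k, v):
        # Append this step's key/value and return the full history,
        # so attention for the new token sees every past position.
        self.keys.append(k)
        self.values.append(v)
        return self.keys, self.values

def attend(query, keys, values):
    # Softmax over dot-product scores, then a weighted sum of values.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(dim)]

# Decode loop: each step computes k/v only for the newest token,
# while attention reuses everything already in the cache.
cache = KVCache()
out = None
steps = [
    ([1.0, 0.0], [0.5, 0.5], [1.0, 0.0]),  # (key, value, query)
    ([0.0, 1.0], [0.2, 0.8], [0.0, 1.0]),
]
for step_k, step_v, q in steps:
    keys, values = cache.update(step_k, step_v)
    out = attend(q, keys, values)
```

Without the cache, every step would recompute keys/values for the entire prefix, so keeping some form of this around matters for generation throughput even if the current implementation gets refactored.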