Benjamin Lefaudeux

91 comments of Benjamin Lefaudeux

> > During validation, each worker sees a variable number of examples. This is okay in itself, but it is problematic (hang) if it results in any worker having extra batches...
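For context on the hang: collective ops like `all_reduce` block until every rank joins, so a rank with extra validation batches leaves the others waiting forever. A minimal sketch of the failure mode and the usual `DistributedSampler` equalization, assuming the process group is already initialized; `val_dataset` and `model` are hypothetical placeholders, not names from the thread:

```
import torch.distributed as dist
from torch.utils.data import DataLoader
from torch.utils.data.distributed import DistributedSampler

# DistributedSampler equalizes per-rank counts: by default it pads with
# repeated samples; drop_last=True trims the tail instead. Either way,
# every rank sees the same number of batches.
# (val_dataset and model are placeholders.)
sampler = DistributedSampler(val_dataset, shuffle=False, drop_last=True)
loader = DataLoader(val_dataset, batch_size=32, sampler=sampler)

for batch in loader:
    loss = model(batch)
    # Every rank must enter this collective the same number of times;
    # with uneven batch counts, ranks that finish early block forever.
    dist.all_reduce(loss)
```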

> Hi @MatthieuTPHR - this looks like a great improvement!
>
> Would it be possible to add a more optimised kernel for head-dim=40, which is the parameter used...

> I've tried running the code in this PR, but I'm getting the following error:
>
> ```
> AttributeError: module 'triton.language' has no attribute 'constexpr'
> ```
>
> ...
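For what it's worth, that AttributeError usually means the installed triton build predates `tl.constexpr`; a quick sanity check (the version interpretation is my reading, not something stated in the thread):

```
import triton
import triton.language as tl

print(triton.__version__)
# Older triton releases lack this attribute, which produces exactly the
# AttributeError quoted above when a kernel declares a tl.constexpr arg.
print(hasattr(tl, "constexpr"))
```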

> Does it require a GPU with tensor cores (RTX 20 Series and above)? Getting: `WARNING:root:Blocksparse is not available: the current GPU does not expose Tensor cores`
> ...
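For reference, the warning can be anticipated with a plain capability check; the (7, 0) cutoff is the usual Volta-and-newer threshold for tensor cores, my addition rather than something taken from the thread:

```
import torch

# Tensor cores require compute capability >= 7.0 (Volta and newer);
# RTX 20xx cards report 7.5, while a GTX 10xx reports 6.1, which is
# what triggers the warning above.
if torch.cuda.is_available():
    cc = torch.cuda.get_device_capability()
    print(cc, "tensor cores:", cc >= (7, 0))
```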

> @blefaudeux I'm using linux in wsl2, the problem might be related to the version of torch and torchvision

Really sorry about that... are you able to use conda there...

> Tested on GTX 1070ti: Without Memory efficient cross attention at 512x512: **1.78 it/s**
>
> With Memory efficient cross attention at 512x512: **2.34 it/s**
>
> ...

FYI, installing xformers should be easier now on Linux platforms (especially with Colab): just `pip install xformers` should give you this attention mechanism (the kernels come pre-built) as of a...
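As a sketch of what this looks like from the diffusers side, assuming a recent diffusers release that exposes the xformers hook (the model id is only an example):

```
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Swaps the pipeline's attention blocks for xformers' memory-efficient
# kernels; requires `pip install xformers` as noted above.
pipe.enable_xformers_memory_efficient_attention()

image = pipe("a photo of an astronaut riding a horse").images[0]
```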

cc @patrickvonplaten, not what we discussed, but this is an effective three-liner
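The preview cuts off before the three lines themselves, so this is only a hedged guess at what such a drop-in amounts to with `xformers.ops.memory_efficient_attention` (the op is real; the shapes and values are illustrative):

```
import torch
import xformers.ops as xops

# [batch, seq_len, heads, head_dim] layout; sizes are illustrative.
q = k = v = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)

# Stands in for softmax(q @ k^T / sqrt(d)) @ v without materializing
# the full attention matrix.
out = xops.memory_efficient_attention(q, k, v)
```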

> Hey @blefaudeux,
>
> How do you use this feature? I think it's only used in decoding if `"force_not_quantize"` is set to `True`, no? It's [in the...