Shiyan Deng

Results 7 issues of Shiyan Deng

This was the second enhancement in https://github.com/NVIDIA/FasterTransformer/issues/98: add support for a different hidden dimension size in BERT's FC layers. Currently, in the feed-forward part of the BERT encoder, the hidden dim of the...

Summary: X-link: https://github.com/pytorch/FBGEMM/pull/2310 ATT Differential Revision: D53364919

CLA Signed
fb-exported

Differential Revision: D53434630

CLA Signed
fb-exported

Summary: Allow autodeps to automatically add `fbcode//torchrec/inference:batching` to the Buck target when `torchrec/inference/Batching.h` is imported. Reviewed By: houseroad Differential Revision: D54284062

CLA Signed
fb-exported

Summary: The only difference is that in the new function, we don't pass in a weight placement tensor. Instead, we pass in only a weight placement variable, since weight placement...

fb-exported
cla signed

### Suggestion Description Hi, for NVIDIA GPUs we can use nvidia-smi to check the ECC mode and to turn ECC on or off. I couldn't find any documentation on AMD...

enhancement

## Purpose We have a draft model trained in a different format from the target model, so we need to use a different loader to load it. Adding support for this....

speculative-decoding
ready
v1
llama