YLGH
@Asanoci we have an example of a retrieval model here, PTAL https://github.com/pytorch/torchrec/tree/main/examples/retrieval I'm not familiar with the bruteForce prediction model, could you help link it? I don't think we'd have...
@Asanoci, does this help resolve your issue? If so, I will close the issue.
Hey @jiannanWang, thanks for reporting this issue; we'll double-check on that. Just curious, is there any reason you're not using EmbeddingBag*Collection*Sharder? Table-wise sharding should be supported for it. This...
@jiannanWang I think your diagnosis of the problem and your proposed fix are correct. Feel free to make a PR!
cc @mrshenli @rohan-varma @colin2328 @divchenko @dstaay-fb @xing-liu @zhaojuanmao @wangkuiyi
> What happens if I don't specify overlapped optimizers for some parameters within ebc?

It will allocate a dense kernel for that table, and you get the grads (as potentially a...
Hey @davidxiaozhi, I'm not too familiar with Horovod, but when using pure PyTorch you usually need to call `dist.init_process_group(backend=backend)`. Could you try calling this explicitly? Is this something that...
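For reference, here is a minimal sketch of what the explicit call could look like in pure PyTorch. It assumes a single process on the `gloo` backend with a file-based rendezvous, which is only meant for a quick local check. The helper name `init_single_process_group` is made up for illustration, and a real multi-process job would normally take its rank and world size from the launcher instead.

```python
import os
import tempfile

import torch.distributed as dist


def init_single_process_group(backend: str = "gloo") -> None:
    """Initialise a one-process group; does nothing if one already exists."""
    if dist.is_initialized():
        return
    # Assumption: file-based rendezvous, so MASTER_ADDR/MASTER_PORT are not needed.
    rendezvous = os.path.join(tempfile.mkdtemp(), "rendezvous")
    dist.init_process_group(
        backend=backend,
        init_method=f"file://{rendezvous}",
        rank=0,        # hard-coded for the single-process sketch
        world_size=1,  # hard-coded for the single-process sketch
    )


init_single_process_group()
world_size = dist.get_world_size()  # query the default process group
dist.destroy_process_group()        # clean up once the check is done
```

If the process group is already initialised by another framework, the `dist.is_initialized()` guard skips the call rather than initialising twice.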
Closing for now - agree that this is a feature gap, but not a common one. Feel free to open a PR and we'll get it landed :)
Landed as part of composability and per-parameter optimizers.