Bingnan Wang
I found that multimodal-toolkit does not use embeddings; it just uses one-hot encoding to handle categorical features, which leads to very high dimensionality and caused a memory error. If anyone can...
Hello ekzhu, may I join your program?
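To make the dimensionality concern concrete, here is a minimal pure-Python sketch comparing the feature width of one-hot encoding with that of a learned embedding lookup. The column names, cardinalities, and `embedding_dim` are hypothetical values chosen for illustration, and `EmbeddingTable` is a stand-in written for this example, not part of multimodal-toolkit.

```python
import random


def one_hot(index, cardinality):
    """Return a dense one-hot vector of length `cardinality`."""
    vec = [0.0] * cardinality
    vec[index] = 1.0
    return vec


class EmbeddingTable:
    """Illustrative lookup table: one small dense vector per category."""

    def __init__(self, cardinality, dim, seed=0):
        rng = random.Random(seed)
        # In a real model these weights would be trained; here they are random.
        self.weights = [
            [rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(cardinality)
        ]

    def __call__(self, index):
        return self.weights[index]


# Hypothetical categorical columns: name -> number of distinct categories.
cardinalities = {"city": 5000, "product_id": 20000, "device": 12}
embedding_dim = 16  # assumed hyperparameter

# One example row, given as category indices.
row = {"city": 42, "product_id": 1234, "device": 3}

# One-hot: the feature width is the sum of all cardinalities.
one_hot_features = []
for col, n in cardinalities.items():
    one_hot_features.extend(one_hot(row[col], n))

# Embedding: the feature width is (number of columns) * embedding_dim.
tables = {col: EmbeddingTable(n, embedding_dim) for col, n in cardinalities.items()}
embedded_features = []
for col in cardinalities:
    embedded_features.extend(tables[col](row[col]))

print(len(one_hot_features))   # 25012
print(len(embedded_features))  # 48
```

With these assumed cardinalities, each row shrinks from 25,012 dense values to 48. The embedding tables add parameters of their own, but that cost is paid once per model rather than once per row.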
## Description ## Checklist Please feel free to remove inapplicable items for your PR. - [ ] The PR title starts with [$CATEGORY] (such as [NN], [Model], [Doc], [Feature]) -...
I found that when I use a very large model like roberta-large, implementing gradient accumulation like this https://gist.github.com/innat/ba6740293e7b7b227829790686f2119c may be very expensive in terms of GPU memory because I need...
Hello, sir, I really appreciate your work! I am very curious about how to pretrain the TabTransformer. Is there an example?
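For readers unfamiliar with the technique, here is a minimal framework-free sketch of gradient accumulation on a toy linear model. It is not the code from the linked gist; the data, `grad_sse`, and the hyperparameters are assumptions made for illustration. It shows that summing gradients over micro-batches reproduces the full-batch gradient, and that the accumulator needs one slot per trainable parameter, which is the extra memory that grows with model size.

```python
def grad_sse(w, b, xs, ys):
    """Gradient of the summed squared error for the model y = w * x + b."""
    gw = gb = 0.0
    for x, y in zip(xs, ys):
        err = w * x + b - y
        gw += 2.0 * err * x
        gb += 2.0 * err
    return gw, gb


# Toy data generated from y = 2x + 1 (hypothetical).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0, 17.0]

w, b = 0.0, 0.0
micro_batch_size = 2  # what fits in memory at once (assumed)
accum_steps = 4       # micro-batches per optimizer step
lr = 0.01

# Accumulation buffer: one slot per trainable parameter (here just w and b).
acc_w = acc_b = 0.0
for step in range(accum_steps):
    lo = step * micro_batch_size
    hi = lo + micro_batch_size
    gw, gb = grad_sse(w, b, xs[lo:hi], ys[lo:hi])
    acc_w += gw
    acc_b += gb

# Average over all examples seen, so the result matches a full-batch mean gradient.
n = micro_batch_size * accum_steps
acc_w /= n
acc_b /= n

# Reference: the same gradient computed on the full batch in one pass.
full_gw, full_gb = grad_sse(w, b, xs, ys)
full_gw /= n
full_gb /= n

# A single optimizer step using the accumulated gradient.
w -= lr * acc_w
b -= lr * acc_b
```

For a model with many parameters, the buffer is as large as the full set of trainable weights, in addition to whatever state the optimizer keeps.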