Bingnan Wang issues

Results: 9 issues of Bingnan Wang

add dselect moe and fix tf version bug

to be solved

No category embedding?

I found that multimodal-toolkit does not have embeddings for categorical features. It just uses one-hot encoding to handle categories, which leads to a very high input dimension and caused a memory error. If anyone can...
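To illustrate the size difference the issue describes, here is a minimal pure-Python sketch comparing a one-hot vector with an embedding lookup. It does not use the multimodal-toolkit API; `one_hot`, `EmbeddingTable`, the cardinality, and the embedding width are all hypothetical names and values chosen for illustration.

```python
import random


def one_hot(index, cardinality):
    """Return a dense one-hot vector: one slot per category value."""
    vec = [0.0] * cardinality
    vec[index] = 1.0
    return vec


class EmbeddingTable:
    """Hypothetical lookup table: one small dense vector per category value."""

    def __init__(self, cardinality, dim, seed=0):
        rng = random.Random(seed)
        # In a real model these weights would be trained; here they are random.
        self.weights = [
            [rng.gauss(0.0, 0.01) for _ in range(dim)]
            for _ in range(cardinality)
        ]

    def lookup(self, index):
        return self.weights[index]


cardinality = 10_000  # assumed number of distinct category values
dim = 16              # assumed embedding width

table = EmbeddingTable(cardinality, dim)

row_one_hot = one_hot(123, cardinality)  # per-row width grows with cardinality
row_embedded = table.lookup(123)         # per-row width is fixed at `dim`

print(len(row_one_hot), len(row_embedded))
```

With one-hot encoding every row carries `cardinality` values, while the embedding keeps each row at `dim` values and pays a one-time cost of `cardinality * dim` for the table. In a deep learning framework the table would typically be a trainable layer such as `torch.nn.Embedding`.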

Add pytorch-lightning tutorial with notebook

## Description

## Checklist
Please feel free to remove inapplicable items for your PR.
- [ ] The PR title starts with [$CATEGORY] (such as [NN], [Model], [Doc], [Feature])
-...

Why can we accumulate the gradients like torch? Just cumsum the training loss?

I found that when I used a very large model like roberta-large, an implementation of gradient accumulation like this one, https://gist.github.com/innat/ba6740293e7b7b227829790686f2119c, may be very expensive in GPU memory because I need...
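As background for the question, the sketch below shows the idea behind gradient accumulation on a toy one-parameter model, in pure Python rather than any framework. It is not the linked gist's implementation; `grad_mse`, the data, and the batch sizes are assumptions made for illustration. It checks that averaging the gradients of equally sized micro-batches gives the same gradient as one pass over the full batch.

```python
def grad_mse(w, xs, ys):
    """Gradient of the mean squared error of y = w * x with respect to w."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


# Toy data (assumed), roughly following y = 2 * x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]

w = 0.5
micro_batch_size = 2
accum_steps = len(xs) // micro_batch_size

# Reference: gradient computed on the full batch at once.
full_grad = grad_mse(w, xs, ys)

# Accumulation: process one micro-batch at a time and add its scaled gradient.
accumulated = 0.0
for step in range(accum_steps):
    lo = step * micro_batch_size
    hi = lo + micro_batch_size
    accumulated += grad_mse(w, xs[lo:hi], ys[lo:hi]) / accum_steps

# A single optimizer step is applied only after all micro-batches are seen.
lr = 0.01
w_new = w - lr * accumulated

print(full_grad, accumulated, w_new)
```

Only one micro-batch is evaluated at a time, and the only extra state is a running buffer with the same shape as the parameters. In a real framework that buffer is one tensor per trainable weight, so its size scales with the model.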

how to plot by batches?

## ❓ Questions/Help/Support

question

How to pretrain?

Hello, sir, I really appreciate your work! I am very curious about how to pretrain the TabTransformer. Is there an example?
