Jinbin Fu
def dice_loss_func(input, target):
    smooth = 1.
    n = input.size(0)
    iflat = input.view(n, -1)
    tflat = target.view(n, -1)
    intersection = (iflat * tflat).sum(1)
    loss = 1 - ((2. * intersection +...
When I use this model on a custom dataset, training runs normally, but the evaluation phase always runs out of GPU memory. What's the possible reason...
### System Info
Versions and hardware were installed as instructed.
### Who can help?
@1049451037
### Information
- [X] The official example scripts
- [ ] My own...
### Feature request
How can a Hugging Face model be converted to a SAT model?
### Motivation
So that SAT can be used to train V2 faster.
### Your contribution /...
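In the code, transformer.word_embeddings converts token ids into embeddings at the start, and at the end it is used to compute the feature similarity for the output tokens. What about lm_head? I couldn't find where it is actually used.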
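
The two roles described for the embedding matrix (input lookup and output similarity) can be illustrated with a toy sketch. It assumes weight tying, where one matrix serves both purposes. The matrix values and the helper names `embed` and `output_logits` are hypothetical and not taken from the repository, and the sketch does not say how lm_head is used there.

```python
# Toy embedding table: one row per token id (values are made up for illustration).
WORD_EMBEDDINGS = [
    [1.0, 0.0],  # token 0
    [0.0, 1.0],  # token 1
    [1.0, 1.0],  # token 2
]


def embed(token_ids):
    """Input side: look up the embedding row for each token id."""
    return [WORD_EMBEDDINGS[t] for t in token_ids]


def output_logits(hidden):
    """Output side: dot product of a hidden vector with every embedding row."""
    return [sum(h * w for h, w in zip(hidden, row)) for row in WORD_EMBEDDINGS]


# Stand-in for the final hidden state; a real model would run its layers here.
hidden = embed([1])[0]
# One similarity score per vocabulary token, computed from the same matrix.
logits = output_logits(hidden)
```

When the matrix is shared this way, a separate output projection is not needed for the forward pass, which is one possible reason a second head could appear unused.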