Overfit
I tried running the code on a small dataset and found that pred_loss decreases quickly while avg_acc stays at 50%. This seems strange to me since decrease in...
I think it's reasonable to tie the input and output embeddings, especially the output embedding along each token. But I still can't find a way to do this. Any one...
I find that memory usage keeps increasing when I use this package. Would 'Wrapping ffmpeg/avconv inside a subprocess to reduce memory overhead' be helpful? And how would I do that?
**Environment:** - Python version 3.7 - Spark version 2.4 - TensorFlow version 2.5 - TensorFlowOnSpark version 2.2.3 - Cluster version hadoop **Describe the bug:** I found the evaluator node won't...
Judging from the demo, the result is the topic distribution of the whole document. What if I want the topic distribution of each sentence?
I ran the code and found that some of the regression losses aren't consistent, but the final detection result is good. I wonder what the problem is.
It seems that your implementation inspects the loss landscape by training multiple models with different learning rates. However, the original paper says, "we compute the gradient of...
### Your current environment I am currently utilizing vLLM serve to deploy the Qwen-0.5B model on an Nvidia H20 GPU. During this process, I've observed that the GPU utilization as...