LIU, Jiazhen
> @sycophant-stone I noticed that in Loss_ASoftmax.py the function ultimately returns the original logits rather than updated_logits. If it returned updated_logits, as the other implementations do, the per-batch classification accuracy looks very abnormal... So which one should actually be returned? In another implementation I saw, the author returns updated_logits, but updated_logits requires label data to be fed in; if I just want to run prediction, I don't know what to do about the labels. I tried changing that other code to return the original logits instead, and the accuracy was far lower than when returning updated_logits.
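For illustration, here is a minimal sketch of why `updated_logits` needs labels while plain prediction does not. It uses a simplified additive-margin stand-in for the angular margin rather than the exact Loss_ASoftmax.py code, and the function name, `margin`, and `scale` parameters are hypothetical: the margin is applied only to each sample's target-class logit, so without labels there is nothing to update and the original cosine logits are what you rank at prediction time.

```python
import torch
import torch.nn.functional as F


def margin_logits_sketch(features, weight, labels=None, margin=0.35, scale=30.0):
    """Illustrative margin-softmax logits (additive-margin stand-in, hypothetical names).

    features: (B, D) embeddings; weight: (C, D) class weights.
    With labels, the target-class cosine is penalized by `margin` (training).
    Without labels, the plain cosine logits are returned (prediction).
    """
    cos = F.linear(F.normalize(features), F.normalize(weight))  # (B, C) cosine similarities
    if labels is None:
        return scale * cos                        # prediction: no labels, no margin
    one_hot = F.one_hot(labels, cos.size(1)).to(cos.dtype)
    updated = cos - margin * one_hot              # margin applied to the target class only
    return scale * updated                        # "updated_logits": feed into cross-entropy


# Example: the training call needs labels, the prediction call does not.
feats, w = torch.randn(4, 128), torch.randn(10, 128)
train_logits = margin_logits_sketch(feats, w, labels=torch.tensor([1, 0, 3, 7]))
pred_logits = margin_logits_sketch(feats, w)      # argmax of these gives the prediction
```

In this sketch, accuracy and pure prediction would use the unmargined cosine logits; whether that matches a particular repo depends on exactly how that repo defines updated_logits.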
The warning comes from the ViT component, which is already frozen, so you can ignore it. It’s related to gradient checkpointing, which is used to save memory. You’ll notice that...
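If the message is noisy in your logs, here is a minimal sketch of the idea, assuming a HuggingFace-style model where the frozen ViT is reachable as `model.vision_tower` (a hypothetical attribute name) and exposes the standard `gradient_checkpointing_disable()` helper. Checkpointing only trades compute for memory when gradients actually flow through a module, so turning it off for a fully frozen part removes the warning without changing training:

```python
def disable_checkpointing_on_frozen_vit(model):
    """Sketch: turn off gradient checkpointing on a frozen vision tower.

    `vision_tower` is a hypothetical attribute name; adapt it to your model.
    """
    vit = getattr(model, "vision_tower", None)
    if vit is None:
        return
    # The ViT is frozen, so recomputing its activations in backward saves nothing
    # and only triggers the checkpointing warning about missing requires_grad.
    if all(not p.requires_grad for p in vit.parameters()):
        if hasattr(vit, "gradient_checkpointing_disable"):
            vit.gradient_checkpointing_disable()
```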
```python
conv_qwen_2 = Conversation(
    system="A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions.",
    roles=("USER", "ASSISTANT"),
    ...
```
I am currently trying to run KTransformers (DeepSeek-R1) with the Generic OpenAI Connector in AnythingLLM on an Apple M1 computer. While the server shows normal responses, the client keeps getting stuck...
The stop reason looks like this.
I'm sure it's because there is no finish_reason in the last chunk. I fixed it simply by adding the reason to the last chunk, as follows (in `ktransformers/server/schemas/assistants/streaming.py`):
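The original patch isn't included here, so the following is only a sketch of the shape of the change, assuming the server emits OpenAI-style `chat.completion.chunk` SSE events; the helper name and all fields other than `finish_reason` are illustrative, not the actual streaming.py code:

```python
import json
import time


def final_chunk(chunk_id: str, model: str) -> str:
    """Build the last SSE data line of an OpenAI-style chat.completion.chunk stream."""
    payload = {
        "id": chunk_id,
        "object": "chat.completion.chunk",
        "created": int(time.time()),
        "model": model,
        "choices": [
            {
                "index": 0,
                "delta": {},                # no further content in the final chunk
                "finish_reason": "stop",    # the field the stuck client waits for
            }
        ],
    }
    return f"data: {json.dumps(payload)}\n\n"


# Example: the final event before the usual "data: [DONE]" terminator.
print(final_chunk("chatcmpl-0", "deepseek-r1"))
```

Clients following the OpenAI streaming convention treat a non-null `finish_reason` as the end of the generation, which is why its absence can leave AnythingLLM waiting indefinitely.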
This issue occurs whenever a fixed seed is provided in the main process.