Leo Zhang
`grad = (1 / m) * sum((h - y) .* X)` can be used instead of `X' * (h - y) / m`. Maybe the J can also be optimized in...
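A minimal sketch of why the two gradient expressions agree, written in plain Python rather than Octave. It assumes `X` is an m-by-n matrix and `h` and `y` are length-m vectors; the function names and the toy data are hypothetical and not taken from the original issue. In Octave the element-wise form returns a 1-by-n row vector and the matrix form an n-by-1 column vector, so the two differ by a transpose even though the values match.

```python
def grad_elementwise(X, h, y):
    """Mirror of (1 / m) * sum((h - y) .* X): weight each row of X by its
    residual, then sum down each column."""
    m = len(X)
    n = len(X[0])
    return [sum((h[i] - y[i]) * X[i][j] for i in range(m)) / m for j in range(n)]


def grad_matrix(X, h, y):
    """Mirror of X' * (h - y) / m: multiply the transpose of X by the
    residual vector."""
    m = len(X)
    residual = [h[i] - y[i] for i in range(m)]
    X_t = list(zip(*X))  # columns of X, i.e. rows of X'
    return [sum(x * r for x, r in zip(col, residual)) / m for col in X_t]


# Hypothetical toy data: 3 examples, 2 features (first column is the bias term).
X = [[1.0, 2.0], [1.0, 3.0], [1.0, 5.0]]
h = [0.5, 0.75, 0.25]  # assumed predictions
y = [0.0, 1.0, 0.0]    # assumed labels

g1 = grad_elementwise(X, h, y)
g2 = grad_matrix(X, h, y)
print(g1, g2)
```

Both functions compute the same sum over examples for each feature; only the order in which the products are grouped differs.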
/!\ PLEASE INCLUDE THE FULL STACKTRACE AND CODE SNIPPET **Short description** When **Environment information** * Operating System: * Python version: * `tensorflow` version: * `tensorflow-datasets` version: * `tf-nightly` version: *...
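[/docs/media/16613396005977.jpg](https://github.com/THUDM/GLM-130B/blob/main/docs/media/16613396005977.jpg) What are the boundary values of the horizontal and vertical axes in this figure, and what does each axis represent? (Does 4096 on the horizontal axis denote the hidden layer size?)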
To make a long story short, I have found that the different **process_prompt** cause problems with the tokenization of **\n**, probably due to the std::string. The first process ``` if...
requirements.txt is missing, so how can I run pip install -r requirements.txt?
I see the support for llama.cpp, but I don't know how to run moondream2
Can you share the application link for these datasets?
add --single-pred-prompt
add_special_tokens for phi 2
fix eval script