vonjackustc
If I'm training a second-order FM, how can I fit the intercept? As far as I can tell, fit_lower='augment' with fit_linear=True does not give me an intercept. Thank you!
I set the parameters as follows: loss='logistic', fit_lower='augment', fit_linear=1, degree=2, n_components=2. My feature count is 29, len(fm.P_) == 29, and fm.P_.shape == (1, 2, 29). Is there anything...
Thank you for replying! I modified the _cd_direct_ho routine: when calling _cd_linear_epoch, it modifies X by adding a dummy feature, so the intercept is fit as _w[0].
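For anyone landing here, a minimal sketch of the same dummy-feature idea done outside the library, without patching _cd_direct_ho: prepend a constant column of ones to X so the linear term absorbs the intercept. This assumes polylearn's FactorizationMachineClassifier API; the `w_` attribute name for the linear weights is my assumption.

```python
# Minimal sketch (assumptions: polylearn's FactorizationMachineClassifier
# API; linear weights exposed as `w_`). Prepend an all-ones column so the
# linear term learns the intercept itself.
import numpy as np
from polylearn import FactorizationMachineClassifier

rng = np.random.RandomState(0)
X = rng.rand(100, 29)                         # 29 features, as above
y = np.where(rng.rand(100) > 0.5, 1, -1)      # toy binary labels

X_aug = np.hstack([np.ones((X.shape[0], 1)), X])  # dummy feature at column 0

fm = FactorizationMachineClassifier(degree=2, n_components=2,
                                    fit_linear=True, fit_lower='augment',
                                    loss='logistic')
fm.fit(X_aug, y)
print(fm.w_[0])  # the weight of the dummy column acts as the intercept
```

One caveat with this workaround: the all-ones column also enters the pairwise interactions, so it is not a pure bias term; patching the CD routine as described above avoids that.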
Tried to support it using BertModel and the SPM tokenizer: https://huggingface.co/vonjack/bge-m3-gguf
Tested cosine similarity between "中国" and "中华人民共和国":
bge-m3-f16: 0.9993230772798457
mxbai-embed-large-v1-f16: 0.7287733321223814
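A hypothetical way to reproduce that similarity check via llama-cpp-python; the model path is illustrative, and whether pooled embeddings work out of the box for this GGUF depends on the llama.cpp version in use.

```python
# Hypothetical reproduction of the cosine-similarity check with
# llama-cpp-python. Model path is illustrative, not from the original post.
import numpy as np
from llama_cpp import Llama

llm = Llama(model_path="bge-m3-f16.gguf", embedding=True)

def embed(text: str) -> np.ndarray:
    vec = llm.create_embedding(text)["data"][0]["embedding"]
    v = np.asarray(vec, dtype=np.float32)
    return v / np.linalg.norm(v)         # unit-normalize the embedding

a, b = embed("中国"), embed("中华人民共和国")
print(float(a @ b))                      # cosine similarity of unit vectors
```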
Do you know how to convert a .pth model to config.json/pytorch_model.bin for RWKV4neo?
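The thread doesn't answer this, but the generic shape of such a conversion is roughly the following sketch; the RWKV4neo-specific key renaming is not shown, and all paths and config fields here are illustrative assumptions.

```python
# Generic .pth -> HF-style checkpoint sketch (illustrative only: the
# RWKV4neo key renaming and real config fields are not shown here).
import json
import torch

state = torch.load("rwkv-model.pth", map_location="cpu")  # illustrative path
# ...rename state-dict keys here to match the target implementation...
torch.save(state, "pytorch_model.bin")
with open("config.json", "w") as f:
    json.dump({"model_type": "rwkv"}, f, indent=2)  # minimal placeholder config
```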
> @vonjackustc I missed these new extra `printf` statements in one of the recent rebases, just integrated your changes to the `tcp_server` branch, thanks for catching it. You can change...
Added cpy from fp16 to q8_0 and from q8_0 to fp16: https://github.com/ggerganov/llama.cpp/commit/3d92acfb8d41ca4d924743ffa6f7cfba105c23f5 Tested on M2 Pro (Metal backend). I'm not familiar with CUDA, so please check.
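For reference, a NumPy sketch of the q8_0 round-trip these cpy ops perform: blocks of 32 values, each with one fp16 scale d = amax/127. This mirrors ggml's reference quantizer, not the Metal kernel itself.

```python
# Reference sketch of the q8_0 round-trip (not the Metal/CUDA kernel):
# 32-value blocks, per-block fp16 scale d = amax / 127.
import numpy as np

QK8_0 = 32  # q8_0 block size in ggml

def quantize_q8_0(x: np.ndarray):
    """Quantize a 1-D float array (length a multiple of 32) to q8_0."""
    blocks = x.astype(np.float32).reshape(-1, QK8_0)
    amax = np.abs(blocks).max(axis=1, keepdims=True)
    d = (amax / 127.0).astype(np.float16)          # per-block fp16 scale
    scale = d.astype(np.float32)
    q = np.zeros(blocks.shape, dtype=np.int8)
    nz = scale[:, 0] > 0                           # avoid divide-by-zero blocks
    q[nz] = np.round(blocks[nz] / scale[nz]).astype(np.int8)
    return d, q

def dequantize_q8_0(d, q):
    return (q.astype(np.float32) * d.astype(np.float32)).ravel()

x = np.random.default_rng(0).standard_normal(64).astype(np.float32)
x2 = dequantize_q8_0(*quantize_q8_0(x))
print(np.abs(x - x2).max())  # small per-element round-trip error
```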
Can fused conv2d support the P40?