Basic RWKV6 support
Conversion, loading, and LoRA all work, but the computation graph is broken: I cannot get the code around line 348 of `rwkv_graph.inc` right no matter what I try. I have marked the spot with exclamation marks.
Hi!
Although I've stepped down as the maintainer of rwkv.cpp, I think my input will still be valuable:
- ~~`rwkv_att_v6` looks like a naive (as in "creates and operates on full-blown matrices") implementation of the attention. If it's true, then v6 models will be too slow and probably unusable on large context lengths. See comparisons for v5. I recommend pulling an optimized implementation from somewhere and adapting it into `rwkv.cpp`, as was done with v5.~~
- There are no tests for v6 models, no Tiny RWKV trained, etc. I would not merge v6 support without proper tests added, because it is asking for something to be unknowingly broken.
Edit: I may be mistaken about the performance issues, because `rwkv_att_v6` actually calls `rwkv_wkv_v5`. In any case, I recommend measuring latency and memory usage and comparing them with v5 models.
The second point still stands -- there must be quality assurance.
Hey there! Thanks for the pull request for the long-awaited RWKV V6 support. I'll soon get to reviewing the code, it's on my schedule! Unfortunately I cannot read Chinese, so I've had to get some help translating. As already mentioned, to make sure support is (and continues to be) functional, we need to include some tests.
Sorry, it's not convenient for me to reply while at school. This PR only supports the conversion and loading of RWKV6. The file `rwkv_graph.inc` has an issue near lines 342-347 that I cannot solve; no matter how I modify it, I hit errors such as `GGML_ASSERT: /media/yuchuxi/YuZi/Project/Mozi/rwkv.cpp/ggml/src/ggml.c:6499: ggml_is_contiguous(a)`.
I'll go find a small model to test with.
RWKV v6 support was merged in another PR.