llama-cpp-python
Fix multi-sequence embeddings
Fixes multi-sequence (batch) embeddings by handling n_seq_max and kv_unified flags. See discussion in #2051.
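
For readers unfamiliar with the limit involved, here is a minimal sketch of why `n_seq_max` matters when several inputs are embedded in one batch. It is not the code from this PR. The helper `pack_sequences` and its parameters are hypothetical, and it assumes only that a single decode call may carry at most `n_batch` tokens and at most `n_seq_max` distinct sequences. It does not model `kv_unified`.

```python
from typing import List


def pack_sequences(
    token_lists: List[List[int]], n_batch: int, n_seq_max: int
) -> List[List[List[int]]]:
    """Greedily group tokenized inputs into batches.

    Hypothetical helper for illustration: each batch holds at most
    `n_batch` tokens in total and at most `n_seq_max` sequences.
    """
    if n_batch < 1 or n_seq_max < 1:
        raise ValueError("n_batch and n_seq_max must be positive")

    batches: List[List[List[int]]] = []
    current: List[List[int]] = []
    current_tokens = 0

    for tokens in token_lists:
        if len(tokens) > n_batch:
            raise ValueError("a single input exceeds n_batch tokens")

        # Flush the current batch if adding this input would break either limit.
        too_many_tokens = current_tokens + len(tokens) > n_batch
        too_many_seqs = len(current) >= n_seq_max
        if current and (too_many_tokens or too_many_seqs):
            batches.append(current)
            current, current_tokens = [], 0

        current.append(tokens)
        current_tokens += len(tokens)

    # Emit whatever is left over after the last input.
    if current:
        batches.append(current)
    return batches
```

With `n_seq_max=1` every input ends up in its own batch, while a larger value lets several short inputs share one decode call, which is the multi-sequence case this PR is concerned with.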
@abetlen any updates yet?