TokenPacker

The code for "TokenPacker: Efficient Visual Projector for Multimodal LLM" (IJCV 2025).

8 TokenPacker issues

Hello, I would like to know whether the inference times reported in Figure 4 were measured without a KV cache, while the "TPS" results in Table 3 are prefill time...

Hello, great work. I encountered a problem in the core code: `File "/tmp/pycharm_project_858/m_llava/model/multimodal_projector/builder.py", line 112, in forward`, at `key = self.ln_k_1(self.k_proj_1(x_multi)).permute(1, 0, 2)`, raises `RuntimeError: mat1 and mat2 shapes cannot be...`

Since TokenPacker is a general visual projector, I'd like to ask whether you have conducted any experiments on other visual backbones. I extracted features from the SigLIP [17, 18, 26, 27] layers...

I appreciate your work. Would you release the checkpoints of TokenPacker-7b-36token and TokenPacker-7b-64token?

Could you provide an example of using TokenPacker on its own?

Hi, thank you for your great work on TokenPacker! I’m trying to reproduce the **TokenPacker-HD (7B, scale factor 2, patch number 9)** experiments, but I’m not getting results close to...

Hello author, I have reproduced your paper, but when using your 7B+144Token model on the VQAv2 dataset, I obtained a result of 77.12, which significantly differs from the 77.9 reported...