TokenPacker
The code for "TokenPacker: Efficient Visual Projector for Multimodal LLM", IJCV 2025
Hello, I would like to know whether the inference times reported in Figure 4 were measured without KV cache, while the "TPS" results in Table 3 reflect prefill time...
Hello, great work. I encountered a problem in the core code: `File "/tmp/pycharm_project_858/m_llava/model/multimodal_projector/builder.py", line 112, in forward: key = self.ln_k_1(self.k_proj_1(x_multi)).permute(1, 0, 2)` raising `RuntimeError: mat1 and mat2 shapes cannot be...`
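A quick way to narrow down a mismatch like the one above is to compare the last dimension of the input tensor with the projection layer's expected `in_features`. The snippet below is a generic diagnostic sketch; only `x_multi` and `k_proj_1` come from the traceback, and all dimensions are illustrative, not the repository's actual configuration.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions for illustration only; the real values come from the
# vision backbone and the projector config in builder.py.
hidden_dim = 1024
k_proj_1 = nn.Linear(hidden_dim, hidden_dim)
ln_k_1 = nn.LayerNorm(hidden_dim)

# x_multi is the multi-level visual feature fed to the projector.
x_multi = torch.randn(2, 576, hidden_dim)  # (batch, tokens, feature_dim) -- example shape

# "mat1 and mat2 shapes cannot be multiplied" usually means x_multi's last
# dimension does not match k_proj_1.in_features.
assert x_multi.shape[-1] == k_proj_1.in_features, (
    f"feature dim {x_multi.shape[-1]} != k_proj_1.in_features {k_proj_1.in_features}"
)

key = ln_k_1(k_proj_1(x_multi)).permute(1, 0, 2)
print(key.shape)
```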
Since TokenPacker is a general visual projector, I'd like to ask whether you have conducted any experiments with other visual backbones. I extracted features from SigLIP layers [17, 18, 26, 27]...
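For reference, multi-layer features can be pulled from a SigLIP vision tower through Hugging Face `transformers` roughly as below; the checkpoint name is an assumption, and the layer indices simply echo the ones mentioned in the question, not the repository's configuration.

```python
import torch
from PIL import Image
from transformers import SiglipImageProcessor, SiglipVisionModel

# Assumed checkpoint; the repo may use a different SigLIP variant.
ckpt = "google/siglip-so400m-patch14-384"
processor = SiglipImageProcessor.from_pretrained(ckpt)
model = SiglipVisionModel.from_pretrained(ckpt).eval()

image = Image.open("example.jpg").convert("RGB")
pixel_values = processor(images=image, return_tensors="pt").pixel_values

with torch.no_grad():
    out = model(pixel_values=pixel_values, output_hidden_states=True)

# hidden_states[0] is the embedding output; layer k is hidden_states[k].
layers = [17, 18, 26, 27]  # layer indices mentioned in the question
multi_level = torch.cat([out.hidden_states[i] for i in layers], dim=-1)
print(multi_level.shape)  # (batch, num_patches, 4 * hidden_dim)
```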
I appreciate your work. Would you release the checkpoints of TokenPacker-7b-36token and TokenPacker-7b-64token?
Could you provide an example of using TokenPacker on its own?
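As a rough illustration only (not the repository's implementation), a TokenPacker-style projector compresses visual tokens by letting a smaller set of downsampled queries cross-attend to the full-resolution features. Below is a minimal sketch assuming a single-level input, average-pooled coarse queries, and made-up dimensions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyTokenCompressor(nn.Module):
    """Illustrative stand-in for a TokenPacker-style projector: coarse queries
    (average-pooled tokens) attend over the original high-resolution tokens."""

    def __init__(self, vis_dim=1024, llm_dim=4096, scale=2, num_heads=8):
        super().__init__()
        self.scale = scale
        self.q_proj = nn.Linear(vis_dim, vis_dim)
        self.kv_proj = nn.Linear(vis_dim, vis_dim)
        self.attn = nn.MultiheadAttention(vis_dim, num_heads, batch_first=True)
        self.out_proj = nn.Linear(vis_dim, llm_dim)

    def forward(self, x):  # x: (batch, H*W, vis_dim), token grid assumed square
        b, n, c = x.shape
        hw = int(n ** 0.5)
        # Coarse queries: spatially average-pool the token grid by `scale`.
        grid = x.transpose(1, 2).reshape(b, c, hw, hw)
        q = F.avg_pool2d(grid, self.scale).flatten(2).transpose(1, 2)
        # Queries attend back to the full-resolution tokens.
        kv = self.kv_proj(x)
        packed, _ = self.attn(self.q_proj(q), kv, kv)
        return self.out_proj(packed)  # (batch, fewer tokens, llm_dim)

compressor = TinyTokenCompressor()
vis_tokens = torch.randn(1, 24 * 24, 1024)  # e.g. 576 ViT patch tokens
print(compressor(vis_tokens).shape)          # torch.Size([1, 144, 4096])
```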
Hi, thank you for your great work on TokenPacker! I’m trying to reproduce the **TokenPacker-HD (7B, scale factor 2, patch number 9)** experiments, but I’m not getting results close to...
Hello author, I have reproduced your paper, but when evaluating your 7B + 144-token model on the VQAv2 dataset, I obtained a result of 77.12, which differs significantly from the 77.9 reported...