BitNet
KBLaM and/or fine-tuning?
I am no expert, so please forgive the naive questions, but:
- Is there any way to integrate KBLaM into these models?
- Is it possible to fine-tune the models, as I understand fine-tuning is recommended practice for KBLaM?
Links or information would be greatly appreciated.
Thanks for any response!
I can provide a minimal implementation using TRL.
Install the necessary packages:
pip install trl
pip install git+https://github.com/shumingma/transformers.git
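Before training, it can help to confirm the forked transformers build loads the model correctly. A minimal sanity-check sketch, assuming the fork exposes the BitNet architecture through the standard AutoModelForCausalLM API as shown on the model card:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/bitnet-b1.58-2B-4T-bf16"

# Loading only works with the forked transformers build installed above,
# which adds support for the BitNet architecture.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

print(model.config.model_type)  # quick check that the architecture resolved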
A sample code snippet (note that this is only a minimal example; hyperparameters such as batch size and learning rate should be tuned for optimal performance):
from trl import SFTConfig, SFTTrainer
from datasets import load_dataset

# Load a small instruction-tuning dataset for supervised fine-tuning.
dataset = load_dataset("trl-lib/Capybara", split="train")

# Training configuration; these hyperparameters are a starting point only.
training_args = SFTConfig(
    max_length=2048,
    output_dir="/tmp",
    per_device_train_batch_size=4,
)

# SFTTrainer accepts a model name string and loads it internally.
trainer = SFTTrainer(
    model="microsoft/bitnet-b1.58-2B-4T-bf16",
    train_dataset=dataset,
    args=training_args,
)

trainer.train()
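After training finishes, you can save the checkpoint and spot-check it with the usual transformers generation API. A minimal sketch; the save path and prompt are placeholders, and for best results you would apply the model's chat template:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Persist the fine-tuned weights and tokenizer.
trainer.save_model("/tmp/bitnet-sft")

tokenizer = AutoTokenizer.from_pretrained("/tmp/bitnet-sft")
model = AutoModelForCausalLM.from_pretrained("/tmp/bitnet-sft", torch_dtype=torch.bfloat16)

# Generate a short completion to spot-check the fine-tuned model.
inputs = tokenizer("What is KBLaM?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))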