lit-llama

Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4-bit quantization, LoRA and LLaMA-Adapter fine-tuning, and pre-training. Apache 2.0-licensed.

107 lit-llama issues, sorted by recently updated

I want to train the 13B LLaMA with 8-bit quantization and LoRA. Right now it takes 70 GB of GPU RAM, which is quite a lot. I'm using 8xA100-80GB. `lora.py` ``` #...
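For context on figures like the 70 GB above, a back-of-the-envelope sketch of the weight memory alone for a 13B-parameter model at different precisions (illustrative numbers, not measurements from lit-llama):

```python
# Rough, illustrative estimate of weight memory for a 13B-parameter model.
# These are back-of-the-envelope numbers, not measurements from lit-llama.
n_params = 13e9

def weight_gib(n_params, bytes_per_param):
    """Memory for the raw weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30

fp16_gib = weight_gib(n_params, 2)   # 16-bit weights
int8_gib = weight_gib(n_params, 1)   # 8-bit quantized weights

print(f"fp16 weights: ~{fp16_gib:.1f} GiB")
print(f"int8 weights: ~{int8_gib:.1f} GiB")
```

Activations, the KV cache, and optimizer state for the LoRA parameters come on top of the weights, which is why observed usage can be far larger than these figures.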

Hi, I'm confused about where to find the tokenizer: --tokenizer_path checkpoints/lit-llama/tokenizer.model Referring here to the readme: ![image](https://github.com/Lightning-AI/lit-llama/assets/20338794/1ce080f0-be94-449f-b591-d63e1580071d) Where can I download it?

If I want to use LLaMA 3 through lit-llama, how can I modify it? I found that the model structure of LLaMA 3 has changed.

Hi, I have pretrained a model and have it in lit-llama format. How can I convert it to Hugging Face format? I need to load my pretrained model via Hugging Face for...
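The usual approach for such conversions is to load the checkpoint's state dict and rename its parameter keys to the target model's naming scheme. A generic sketch of that pattern (the key names below are illustrative, not the actual lit-llama or HF LLaMA layouts):

```python
# Generic state-dict key-renaming sketch for checkpoint conversion.
# The mapping entries here are illustrative; the real lit-llama and
# Hugging Face LLaMA parameter layouts must be checked against both repos.
def rename_keys(state_dict, mapping):
    """Return a new state dict with keys translated via `mapping`.
    Keys without a mapping entry are kept unchanged."""
    return {mapping.get(k, k): v for k, v in state_dict.items()}

mapping = {
    "transformer.wte.weight": "model.embed_tokens.weight",  # illustrative
}
src = {"transformer.wte.weight": [0.0], "lm_head.weight": [1.0]}
print(rename_keys(src, mapping))
```

Layers that are fused in one layout but split in the other (e.g. a combined QKV projection) additionally need their tensors split or concatenated, not just renamed.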

When I invoke the generate function twice with different idx, I get the error "RuntimeError: The expanded size of the tensor (181) must match the existing size (168) at...

Hello, Thanks for the great work! I have a pre-trained Lit-Llama checkpoint that I'd like to convert to a format supported by HF, so that I could use it as...

Upon every restart of fine-tuning I see: "train data seems to have changed. restarting shuffled epoch." I looked up where it happens, added a debugging line, and it turned out that...

I noticed that `PackedDatasetBuilder` does not separate the tokens with `sep_token`. To illustrate, referencing https://github.com/Lightning-AI/lit-llama/blob/da71adea0970d6d950fb966d365cfb428aef8298/scripts/prepare_redpajama.py#L71 ```py builder = packed_dataset.PackedDatasetBuilder( outdir=destination_path, prefix=prefix, chunk_size=chunk_size, sep_token=tokenizer.bos_id, dtype="auto", vocab_size=tokenizer.vocab_size, ) ``` and https://github.com/Lightning-AI/lit-llama/blob/da71adea0970d6d950fb966d365cfb428aef8298/scripts/prepare_redpajama.py#L85 ```py...
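For reference, a minimal sketch of what separator-aware packing is generally intended to do — a hypothetical `pack` helper, not the actual `PackedDatasetBuilder` implementation:

```python
# Hypothetical illustration of packing tokenized documents into fixed-size
# chunks with a separator token at each document boundary. This is NOT the
# real PackedDatasetBuilder; it only sketches the intended behavior.
SEP = 1  # e.g. tokenizer.bos_id, as passed via sep_token above

def pack(docs, chunk_size, sep=SEP, pad=0):
    """Concatenate docs, inserting `sep` before each one, then split the
    stream into fixed-size chunks, padding the last chunk."""
    stream = []
    for doc in docs:
        stream.append(sep)      # separator marks a document boundary
        stream.extend(doc)
    chunks = []
    for i in range(0, len(stream), chunk_size):
        chunk = stream[i:i + chunk_size]
        chunk += [pad] * (chunk_size - len(chunk))
        chunks.append(chunk)
    return chunks

print(pack([[5, 6, 7], [8, 9]], chunk_size=4))
# → [[1, 5, 6, 7], [1, 8, 9, 0]]
```

Without the separator, tokens from consecutive documents would run together in one chunk with no boundary the model can detect, which is what the issue is pointing out.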

When conducting generation for multiple consecutive inputs on a LoRA fine-tuned LLaMA, I noticed that using `reset_cache` after each generation for one input will affect the performance of generation on...

How can I train LLaMA using TPUs?