Hasan Arif

2 issues by Hasan Arif

Is there a script to run the benchmarks mentioned in the paper, especially for the Llama 7B and 13B models? Also, I cannot run llama-bench, and I am getting...

question

How do I reproduce?

```
import torch
from transformers import AutoModelForCausalLM, LlamaForCausalLM, AutoTokenizer, AutoConfig, LlamaTokenizer, LlamaConfig
from transformers.modeling_utils import load_sharded_checkpoint
from accelerate import init_empty_weights, load_checkpoint_in_model, load_checkpoint_and_dispatch
from utils_hh.modify_llama import convert_kvcache_llama_heavy_recent,...
```