hx507

Results 5 comments of hx507

> Looking further, it also slowly creeps up as prompt being read (batch size = 4)

To add another observation, the amount of memory increase per iteration seems to scale quadratically...

> `os.system(f"./quantize {os.path.join('models', sys.argv[1], i)} {os.path.join('models', sys.argv[1], i.replace('f16', 'q4_0'))} 2")`

Consider using something like `subprocess.call` to prevent security issues such as command injection through filenames.
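A minimal sketch of that suggestion, assuming the same `./quantize` binary and `models/<dir>/<file>` layout as the quoted line. The helper names `build_quantize_command` and `quantize` are hypothetical and only serve the illustration. The point is that passing an argument list to `subprocess.call` bypasses the shell, so a filename is never parsed as shell syntax.

```python
import os
import subprocess


def build_quantize_command(model_dir, filename):
    """Return the quantize invocation as an argument list (no shell involved).

    Hypothetical helper: the paths mirror the quoted os.system() call.
    """
    src = os.path.join("models", model_dir, filename)
    dst = os.path.join("models", model_dir, filename.replace("f16", "q4_0"))
    # Each list element reaches ./quantize as exactly one argument, so spaces,
    # semicolons, or quotes in a filename are never interpreted by a shell.
    return ["./quantize", src, dst, "2"]


def quantize(model_dir, filename):
    """Run ./quantize with the argument list and return its exit code."""
    return subprocess.call(build_quantize_command(model_dir, filename))
```

In the original loop this would probably be called as `quantize(sys.argv[1], i)`. A filename containing `; rm -rf ~` is then handed to `./quantize` as literal text instead of being executed.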

Also seeing the same issue, where llava from ollama performs significantly worse than other web-hosted versions.

> I loaded lava 7b with version 0.1.32 and I get a good...

Interestingly, restarting the ollama server makes the first image query work. For anything other than the first image query uploaded (even with a fresh client session), the model will just output...

Looking at the release notes for 0.1.34, I think this is already addressed:

> - Fixed issues with LLaVa models where they would respond incorrectly after the first request

Seems...