
A course on LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.

Results 9 tiny-llm issues

Fixes issue where `pdm run main --solution ref --loader week1` would fail with "ModuleNotFoundError: No module named 'tiny_llm_ref'"

When I run `pdm run test-refsol -- -- -k week_1`:

```
src/tiny_llm_ref/__init__.py:7: in <module>
    from .generate import *
E   File "/Users/gao/develop/tiny-llm/src/tiny_llm_ref/generate.py", line 134
E     print(f"+{progress} {text.replace('\n', ' ')[-80:]}")
E                                        ^
E...
```
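The SyntaxError above looks like the pre-3.12 f-string restriction: before Python 3.12 (PEP 701), backslashes were not allowed inside f-string expressions, so the `'\n'` in the `replace()` call fails at import time. A minimal sketch of a version-compatible rewrite (the helper name `format_progress` is my own, not from the repo):

```python
# Before Python 3.12, this line is a SyntaxError because of the backslash
# inside the f-string expression:
#   print(f"+{progress} {text.replace('\n', ' ')[-80:]}")
# Moving the replace() call outside the f-string works on all versions.

def format_progress(progress: int, text: str) -> str:
    flattened = text.replace("\n", " ")  # newline handled outside the f-string
    return f"+{progress} {flattened[-80:]}"
```

The same fix applies anywhere a backslash escape appears inside `{...}` in an f-string.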

- Add complete `quantized_matmul_impl_typed` template function for CPU (float16, float32, and bfloat16).
- Add fp32 test cases for `quantized_matmul`.
- Relax float32 tolerance in test utils.
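For context, a CPU reference for quantized matmul usually just dequantizes group-wise and does an ordinary matmul. Here is a NumPy sketch, assuming MLX-style affine group quantization (`w ≈ scale * q + bias` per group along the input dimension); the function name and shapes are my assumptions, not the repo's actual signature:

```python
import numpy as np

def quantized_matmul_ref(x, q, scales, biases, group_size=64):
    """Reference quantized matmul: dequantize, then matmul.

    x:      (batch, in_dim) activations
    q:      (out_dim, in_dim) integer weight codes
    scales: (out_dim, in_dim // group_size) per-group scales
    biases: (out_dim, in_dim // group_size) per-group biases
    """
    out_dim, in_dim = q.shape
    # Dequantize each group of `group_size` codes with its scale/bias.
    w = q.astype(np.float32).reshape(out_dim, in_dim // group_size, group_size)
    w = w * scales[..., None] + biases[..., None]
    w = w.reshape(out_dim, in_dim)
    return x @ w.T
```

A typed CPU kernel would fuse the dequantization into the inner product instead of materializing `w`, but a reference like this is handy for the fp32 test cases mentioned above.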

During my own testing and from people's feedback, it seems that some kernels on M1 have precision issues in RoPE, likely due to sin/cos. https://github.com/skyzh/tiny-llm/issues/27
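One plausible mechanism (an illustration, not the tiny-llm kernel itself): RoPE angles grow with the token position, and float16 cannot represent large angles exactly, so sin/cos evaluated after a float16 round-trip can drift well beyond half-precision epsilon. Computing the angles and sin/cos in float32 sidesteps this:

```python
import numpy as np

# A large RoPE rotation angle, e.g. position * inverse frequency.
angle = 1000.1
# float16 spacing near 1000 is 0.5, so the angle rounds to 1000.0.
angle_fp16 = np.float32(np.float16(angle))
# The rounding error feeds straight into sin/cos of the rotation.
drift = abs(np.sin(np.float32(angle)) - np.sin(angle_fp16))
```

Here `drift` is on the order of 0.05, which would easily fail an elementwise tolerance check against a float32 reference.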

PyTorch's test is okay, but I don't know whether the MLX part will work.

It would be great to add support for single-node, multi-process inference to Tiny LLM.

Thanks for creating this awesome tutorial. It's very helpful! Just curious: is there any timeline for updating the week 2+ tasks/tests/docs? Totally understand that the authors are busy, but just curious :)