Julian Lenz
Results: 2 comments by Julian Lenz
> Implement the basic micro-timing (without overlapping beat_res) in the microtiming-draft branch and make it work, then pull these changes into this branch and make the few adaptations that shouldn't...
For reference, an HF Llama 3-style model with roughly the same number of parameters generates the 500 tokens in ~15 seconds (about 33 tokens per second).