Julian Lenz comments

Results: 2 comments by Julian Lenz

PerTok Tokenizer

> implement the basic micro-timing (without overlapping beat_res) in the microtiming-draft branch and make it work, then pull these changes to this branch and make the few adaptations, that shouldn't...

Inference Time

For reference, an HF Llama3-style model with roughly the same parameter count generates the 500 tokens in ~15 seconds.
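
To make a comparison like this reproducible, here is a minimal timing sketch. It does not use the actual model or tokenizer from the issue: `time_generation`, `tokens_per_second`, and `dummy_generate` are hypothetical names, and `dummy_generate` is only a stand-in for whatever generation call is being measured.

```python
import time
from typing import Callable, Sequence, Tuple


def tokens_per_second(n_tokens: int, seconds: float) -> float:
    """Convert a token count and a wall-clock duration into throughput."""
    return n_tokens / seconds


def time_generation(
    generate: Callable[[int], Sequence[int]], n_tokens: int = 500
) -> Tuple[float, float]:
    """Time a single call to `generate` and return (elapsed seconds, tokens/s).

    `generate` is assumed to take the number of tokens to produce and to
    return the generated token sequence.
    """
    start = time.perf_counter()
    tokens = generate(n_tokens)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading from a very fast call on a coarse clock.
    rate = len(tokens) / elapsed if elapsed > 0 else float("inf")
    return elapsed, rate


def dummy_generate(n_tokens: int) -> list:
    """Hypothetical stand-in for a real model's generation call."""
    return list(range(n_tokens))


if __name__ == "__main__":
    # The figure quoted above: 500 tokens in ~15 seconds.
    print(f"reference: {tokens_per_second(500, 15.0):.1f} tokens/s")
    elapsed, rate = time_generation(dummy_generate, 500)
    print(f"dummy: {elapsed:.6f} s, {rate:.1f} tokens/s")
```

At 500 tokens in ~15 seconds, the reference works out to roughly 33 tokens per second. Swapping `dummy_generate` for the real generation call gives a number that can be compared against it directly.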