aria
Live player
- [x] Added a live player (borrowed a lot of code from Max)
- [x] Implemented a sliding window for generation
-roll .... It rolls the KV cache when the update position is larger than the length.
- [x] Improved some code; `greedy_sample` now returns an iterator with the option of streaming tokens

Problem:
- [x] The sliding window generation degenerates very quickly. This probably needs careful debugging to see whether it comes from a bug of mine.
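
For reviewers, here is a rough sketch of the two ideas above: rolling a fixed-length cache once it is full, and exposing generation as an iterator so tokens can be streamed. It is not the code in this PR. The names `roll_cache`, `greedy_sample_sketch`, and `toy_next_token` are hypothetical, the "cache" is a plain list of token ids standing in for real KV tensors, and the toy model exists only to make the example runnable.

```python
from typing import Callable, Iterator, List


def roll_cache(cache: List[int], new_entry: int, max_len: int) -> List[int]:
    """Append new_entry; if the cache is already full, drop the oldest entry first."""
    if len(cache) >= max_len:
        cache = cache[1:]  # shift everything left by one slot
    return cache + [new_entry]


def greedy_sample_sketch(
    next_token: Callable[[List[int]], int],
    prompt: List[int],
    max_len: int,
    num_tokens: int,
) -> Iterator[int]:
    """Yield tokens one at a time so the caller can stream them.

    next_token is a stand-in (assumption) for a model forward pass plus argmax.
    """
    cache = list(prompt[-max_len:])  # keep at most max_len entries of the prompt
    for _ in range(num_tokens):
        token = next_token(cache)
        cache = roll_cache(cache, token, max_len)
        yield token


def toy_next_token(window: List[int]) -> int:
    """Toy 'model': the next token is the sum of the window modulo 10."""
    return sum(window) % 10


if __name__ == "__main__":
    # Tokens are produced lazily, so they can be consumed as they arrive.
    for tok in greedy_sample_sketch(toy_next_token, [1, 2, 3], max_len=3, num_tokens=4):
        print(tok)  # prints 6, 1, 0, 7 on separate lines
```

Because the function is a generator, a caller that wants the whole sequence can still write `list(greedy_sample_sketch(...))`, while a live player can consume tokens one by one.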
Also, we can probably move Max's interactive session into a separate script (since it has no prompt, it doesn't fit the aria.run sample script). We can refactor Max's code #79 after this PR.