Paras Stefanopoulos
Once upon a time, in a land not so far away, there was a programmer named Paras who was responsible for maintaining a critical service using PM2. Paras was a...
I think a modifier + any-navigation keys applying to the current view would be good. `ctrl + ,` or `ctrl + .` applying to current view would be great. I...
What I did to get Qwen3 working with CP: 1. Added an alias property to the Qwen3 model class for `freqs_cis` -> `rope_cache` ``` @property def freqs_cis(self) -> torch.Tensor: """Alias...
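
The aliasing idea in step 1 can be sketched roughly as follows. This is a minimal illustration, not the actual Qwen3 model class: `RotaryModel` is a hypothetical stand-in, and a plain list takes the place of the `torch.Tensor` so the snippet runs without any dependencies. It only shows how a read-only property can expose `rope_cache` under the name `freqs_cis`.

```python
class RotaryModel:
    """Hypothetical stand-in for a model class that stores its rotary table as `rope_cache`."""

    def __init__(self, rope_cache):
        # In the real model this would be a torch.Tensor; a list is used here for illustration.
        self.rope_cache = rope_cache

    @property
    def freqs_cis(self):
        """Read-only alias: code that looks up `freqs_cis` gets `rope_cache`."""
        return self.rope_cache


model = RotaryModel(rope_cache=[0.0, 0.5, 1.0])
```

Because the property returns the same object rather than a copy, any code that reads `model.freqs_cis` sees whatever `model.rope_cache` currently holds, including after reassignment.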
I tried to create a minimal reproducible example and spent an hour with the debug-sized model, but couldn't reproduce it (I was only facing this getting qwen3-235b training...