Paras Stefanopoulos

Results: 4 comments by Paras Stefanopoulos

Once upon a time, in a land not so far away, there was a programmer named Paras who was responsible for maintaining a critical service using PM2. Paras was a...

I think a modifier + any navigation key applying to the current view would be good. `ctrl + ,` or `ctrl + .` applying to the current view would be great. I...

What I did to get Qwen3 working with CP:

1. Added an alias property to the Qwen3 model class for `freqs_cis` -> `rope_cache`

```
@property
def freqs_cis(self) -> torch.Tensor:
    """Alias...
```
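The aliasing idea in that first step can be sketched in plain Python. This is only an illustration of the pattern, not the actual Qwen3 code: the class name `Qwen3ModelSketch` and the list used in place of a tensor are assumptions made so the example runs without `torch`.

```python
class Qwen3ModelSketch:
    """Hypothetical stand-in for the model class; not the real Qwen3 implementation."""

    def __init__(self, rope_cache):
        # Assumption: the model keeps its rotary-embedding table under `rope_cache`.
        self.rope_cache = rope_cache

    @property
    def freqs_cis(self):
        """Read-only alias so callers that look up `freqs_cis` get `rope_cache`."""
        return self.rope_cache


# A plain list stands in for the tensor here.
model = Qwen3ModelSketch(rope_cache=[0.0, 0.5, 1.0])
assert model.freqs_cis is model.rope_cache
```

Because the property reads the attribute on every access, the alias keeps pointing at whatever `rope_cache` currently holds, and assigning to `freqs_cis` directly raises `AttributeError` since no setter is defined.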

I tried to create a minimal reproducible example and spent an hour with the debug-sized model, but couldn't reproduce it (I was only facing this getting qwen3-235b training...