Add qualitative performance validation
This helps find out which methods work best, which models they work with, and so on. For example, this shows that middle layers work best, that steering only works on models >4B, and that diff_pca is the best method.
it only works on models >4B
Just in case: have you checked whether the error message "Error at layer 27: 'Qwen3DecoderLayer' object has no attribute 'set_control'", repeated over and over at the bottom, is an issue here?
Oh yeah, that is just because those layers were not transformed into repeng control blocks, and I just used a try/except pattern over all layers.
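For anyone reading along, here is a rough sketch of the try/except pattern described above, written so that unwrapped layers are collected and reported once instead of printing an error per layer. It is not the code from this PR: `PlainLayer`, `ControlBlock`, and `apply_control` are hypothetical stand-ins, and the only thing taken from the thread is that wrapped layers expose `set_control` while plain decoder layers do not.

```python
# Hypothetical stand-ins, not repeng classes.
class PlainLayer:
    """Mimics an unwrapped layer such as Qwen3DecoderLayer (no set_control)."""


class ControlBlock:
    """Mimics a wrapped layer that accepts a steering vector."""

    def __init__(self):
        self.control = None

    def set_control(self, control):
        self.control = control


def apply_control(layers, control):
    """Set `control` on every layer that supports it.

    Returns (applied, skipped): indices that were set and indices that
    were skipped because the layer has no set_control attribute.
    """
    applied, skipped = [], []
    for idx, layer in enumerate(layers):
        try:
            layer.set_control(control)
        except AttributeError:
            # Layer was never wrapped, so there is nothing to steer here.
            skipped.append(idx)
        else:
            applied.append(idx)
    return applied, skipped


layers = [PlainLayer(), ControlBlock(), ControlBlock(), PlainLayer()]
applied, skipped = apply_control(layers, control=[0.1, -0.2])
print(f"applied to layers {applied}, skipped unwrapped layers {skipped}")
```

One caveat with this approach: `except AttributeError` also swallows an `AttributeError` raised inside `set_control` itself, so a `hasattr(layer, "set_control")` check may be the safer choice if that matters.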
Small models are kind of incoherent to start with, and I suspect they don't have well-developed inner concepts, which makes them harder to steer.