ZagButNoZig
Hello, I'm not entirely sure whether this is a bug or just me misusing the API. I am training a pretty simple fully connected layer with some data...
With the latest git driver and an Archer T3U Plus, I get pretty fast internet speeds (almost as fast as my phone over WiFi). But loading a new web...
# Bug report **Current behavior:** When playing offline against Stockfish, you can make a move and undo it while SF is still "thinking". This will lead to SF making am...
This is a draft for making the serialization of a world deterministic by adding the necessary information to the serialization. It works by saving the free list and all the generations...
The [reference implementation](https://github.com/KellerJordan/Muon) of Muon supports 4D parameters (like conv filters) through flattening [here](https://github.com/KellerJordan/Muon/blob/12dc5336852ec10fbe1a1119ad09c92d2d81505e/muon.py#L97). As far as I can see, the current optax implementation does not, because of the [partitioning...
First of all, thank you for your great library! ### The problem - Initialize a model with multiple batch norms (e.g. ResNet) - Set model to inference mode - Observe...
I'm playing around with the [U-Mamba](https://github.com/bowang-lab/U-Mamba/blob/main/umamba/nnunetv2/nets/UMambaEnc_3d.py) model and I get out-of-memory issues when replacing ```Python self.mamba = Mamba( d_model=dim, # Model dimension d_model d_state=d_state, # SSM state expansion...