JART synapse
Description
Support a new device type based on the JART V1b memristor model, together with test scripts for it. Details can be found at https://ieeexplore.ieee.org/document/10052010
Details
The code was written for the old backend, so some adaptations to the new version are needed. CUDA support is unreliable: sometimes the results are not transferred back to PyTorch. Cycle-to-cycle variability is currently not supported for CUDA, as we need additional fields to store more device-specific parameters.
@ZhenmingYu I tried to change it, but I think there are still a number of issues with how the synapse is simulated. The logic for adding the cycle-to-cycle noise is not correct: as far as I can see, the cycle-to-cycle noise should be computed before updating the model. Moreover, the "ratio" parameter is always zero, which cannot be correct.
@maljoras Thanks for checking the code. I just came back from vacation and have started looking at this.
You are right that "ratio" is 0 most of the time. It was carried over from the original model, which was fitted mostly with data collected in IV sweeps, and its purpose is to scale the c2c noise with the update: if the update is large, the filament area is disrupted heavily, so the c2c noise will be large; if the update is small, the conductive filament barely changes, so the c2c noise will be close to 0.
However, in our use case the update is small most of the time, so the "ratio" parameter is mostly close to 0, which renders the c2c noise useless. As shown in https://ieeexplore.ieee.org/document/10052010, Figure 3 (f) and (g), the multiplicative noise (i.e. the noise scaled by "ratio") has no impact on the network performance.
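To make the scaling behavior concrete, here is a minimal sketch of update-scaled multiplicative c2c noise. All names (`c2c_noise`, `w_range`, `sigma`) and values are hypothetical illustrations, not the actual PR code:

```python
import numpy as np

rng = np.random.default_rng(0)

def c2c_noise(delta_w, sigma=0.05, w_range=1.0):
    """Multiplicative cycle-to-cycle noise, scaled by the update size.

    'ratio' (mirroring the parameter discussed above) compares the
    update magnitude to the full weight range: a large update disrupts
    the filament strongly, so it gets substantial noise; a tiny update
    barely changes the filament, so the noise collapses toward zero.
    """
    ratio = abs(delta_w) / w_range
    return ratio * sigma * rng.standard_normal()

# A large IV-sweep-style update receives noticeable noise ...
large = c2c_noise(0.8)
# ... while a typical small training update receives almost none,
# which is why the scaled noise has no effect during training.
small = c2c_noise(1e-3)
```

With updates on the order of 1e-3 of the weight range, the effective noise amplitude is three orders of magnitude below the sweep case, matching the observation that "ratio" is mostly close to 0 in training.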
@maljoras I am also a bit lost on how we should bring this project forward, and your advice on this would be much appreciated.
I see that you are trying to simplify the computation by applying noise directly on dNdt. While this will work, it is no longer as physically accurate, which is where the JART model really shines.
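For comparison, the simplification being discussed might look roughly like the following. This is a heavily simplified, hypothetical sketch (the real JART V1b dynamics for the oxygen-vacancy concentration N are far more involved), just to illustrate where the noise enters:

```python
import numpy as np

rng = np.random.default_rng(1)

def step_N(N, dNdt, dt=1e-9, sigma=0.1):
    """One Euler step of a toy state update for the filament state N.

    Here the noise is injected directly on dNdt (the simplified scheme):
    it perturbs the *rate* of change rather than the device parameters
    that produce it, so the trajectory becomes noisy while the physical
    parameters themselves stay untouched.
    """
    noisy_dNdt = dNdt * (1.0 + sigma * rng.standard_normal())
    return N + noisy_dNdt * dt
```

Note that with this scheme a zero update produces exactly zero noise, whereas perturbing the device parameters before evaluating dNdt (the physically motivated route) can still produce variability at rest.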
On the other hand, over the past month I have been experimenting with a new mathematical fit of the model, which will include read noise and make everything more accurate. However, it will add even more global parameters, which will further slow down the already too slow CUDA initialization step.
Should we just drop the CUDA version instead?
@ZhenmingYu and @maljoras, there are build errors. Can you take a look?
Hi @kaoutar55, this PR is still WIP; there are still a number of open issues.