
How to transform the model into ONNX format?

Open EvayJacksonChen opened this issue 3 years ago • 9 comments

Hi, I see that we have ckpt and json files here, and I'm trying to transform the model into ONNX format, but I can't find the corresponding neural network definition file. So I'm wondering how I could implement this.

EvayJacksonChen avatar May 21 '22 09:05 EvayJacksonChen

So I'm trying to deploy the given model on my device. The tflite files certainly fit the device, but I think the ckpt and json should be transformed into ONNX format, or it won't suit the board.

EvayJacksonChen avatar May 21 '22 09:05 EvayJacksonChen

I successfully transformed the model into ONNX format, but its Flash and SRAM usage both exceed my board's limits. Is this because I'm not using TinyEngine? Or did I not compress the model properly?
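Roughly, the conversion looked like this (a minimal sketch; `build_model_from_config` is a hypothetical placeholder for however the network is rebuilt from the json config, and the input resolution is an assumption):

```python
import json
import torch

# Rebuild the network from the json config and load the checkpoint weights.
with open("mcunet-320kb-1mb.json") as f:
    config = json.load(f)
model = build_model_from_config(config)  # hypothetical helper, not a real repo API
model.load_state_dict(torch.load("mcunet-320kb-1mb.ckpt", map_location="cpu"))
model.eval()

# Export to ONNX with a dummy input (resolution assumed here).
dummy_input = torch.randn(1, 3, 176, 176)
torch.onnx.export(
    model, dummy_input, "mcunet.onnx",
    input_names=["input"], output_names=["output"],
    opset_version=11,
)
```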

EvayJacksonChen avatar May 21 '22 15:05 EvayJacksonChen

It spilled over like this: [screenshot: memory overflow]

EvayJacksonChen avatar May 21 '22 15:05 EvayJacksonChen

Hi, thanks for reaching out. Is the converted ONNX file quantized to int8? Quantization will significantly reduce memory usage. The tflite file should already be quantized, so maybe you can try it and see if it works.
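For full-integer quantization, something along these lines should work (a minimal sketch with the TFLite converter, assuming a SavedModel export; `representative_images()` is a placeholder for a small sample of real inputs used to calibrate activation ranges):

```python
import tensorflow as tf

def representative_dataset():
    # Placeholder generator yielding calibration inputs, one batch at a time.
    for image in representative_images():
        yield [image[None].astype("float32")]

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

with open("model_int8.tflite", "wb") as f:
    f.write(converter.convert())
```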

tonylins avatar May 21 '22 20:05 tonylins

Hi, thanks for your nice response. Actually, that's exactly what I missed: I didn't quantize the model. So today I tried a lot of ways to convert the ONNX file to int8, but unfortunately it couldn't fit STM32CubeMX (screenshot below). After searching the documentation, I was surprised to find that the platform seems to support only quantized Keras and TFLite models. But I believe that's what we need to do, so I'm gonna convert the TFLite files tomorrow, and I believe this will work. [screenshot]
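For what it's worth, the ONNX-side quantization I attempted was along these lines (a sketch using ONNX Runtime's post-training quantizer; dynamic quantization only quantizes weights, and static quantization would additionally need a calibration data reader; file paths are assumptions):

```python
from onnxruntime.quantization import quantize_dynamic, QuantType

# Weight-only int8 quantization of an existing ONNX model.
quantize_dynamic(
    "mcunet.onnx",
    "mcunet_int8.onnx",
    weight_type=QuantType.QInt8,
)
```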

EvayJacksonChen avatar May 22 '22 16:05 EvayJacksonChen

Hi, it worked great to use "generate_tflite.py" to convert the model, and I found that this procedure also quantizes the model along the way. What we got seemed similar to the provided tflite file. So we also tested the tflite files on the platform, and it was great to see that Flash was reduced to a proper scale, but the SRAM still overflowed a little. For example, the "mcunet-320kb-1mb" model overflows, but "mcunet-256kb-1mb" fits. However, our device is an STM32F746, which has 320KB SRAM and 1MB Flash, so I believe the former should fit. Since the tflite model is already quantized, what else can we do to reduce the excess SRAM? Or did we not quantize it enough? [screenshot] [screenshot]
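To double-check whether a .tflite file is actually fully int8-quantized, a quick inspection like this helps (a sketch using the TFLite interpreter; the file name is assumed):

```python
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="mcunet-320kb-1mb_imagenet.tflite")
interpreter.allocate_tensors()

# int8 dtypes with non-trivial (scale, zero_point) indicate full-integer quantization.
for detail in interpreter.get_input_details() + interpreter.get_output_details():
    print(detail["name"], detail["dtype"], detail["quantization"])
```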

EvayJacksonChen avatar May 24 '22 16:05 EvayJacksonChen

It's so weird that "mcunet-320kb-1mb" needs even more SRAM than "mcunet-512kb-2mb". We just tested the bigger one, "mcunet-512kb-2mb", and it surprised us that it only occupies 416.75KB, which is smaller than "mcunet-320kb-1mb"'s 467.03KB. [screenshot] Maybe there's something unusual with the "mcunet-320kb-1mb_imagenet.tflite" file.
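One rough way to compare the two files is to list their largest tensors (a heuristic sketch only: the real SRAM peak depends on how the runtime schedules and reuses buffers, so this won't reproduce Cube AI's numbers):

```python
import numpy as np
import tensorflow as tf

def largest_tensors(path, top=5):
    # Approximate each tensor's size in bytes; ignores buffer reuse entirely.
    interp = tf.lite.Interpreter(model_path=path)
    interp.allocate_tensors()
    sizes = [
        (int(np.prod(t["shape"])) * np.dtype(t["dtype"]).itemsize, t["name"])
        for t in interp.get_tensor_details()
    ]
    return sorted(sizes, reverse=True)[:top]

for path in ["mcunet-320kb-1mb_imagenet.tflite", "mcunet-512kb-2mb_imagenet.tflite"]:
    print(path, largest_tensors(path))
```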

EvayJacksonChen avatar May 24 '22 16:05 EvayJacksonChen

Hi, the memory usage depends on the system stack. We used TinyEngine in our experiments, which has a different memory usage from Cube AI, so it is normal if the peak memory numbers do not align. The 320KB model should fit the device with TinyEngine, but may not with Cube AI.

tonylins avatar May 25 '22 19:05 tonylins

Hi, that definitely makes sense. Thanks for your response. Next we're gonna try to deploy the adaptive model on our device and implement some functions on it, just like what you've shown in your demo video, which is really cool!

EvayJacksonChen avatar May 26 '22 16:05 EvayJacksonChen