Conversion error in yolov8n SKU-110K
Issue Type
Bug
Source
source
MCT Version
2.0.0
OS Platform and Distribution
Ubuntu 18.04
Python version
3.10.14
Describe the issue
Hello.
I tried to convert a yolov8n model trained on SKU-110K with MCT, with TensorBoard logging enabled.
When I called mct.ptq.keras_post_training_quantization, I got the following error:
Exception: The model cannot be quantized to meet the specified target resource utilization activation with the value 1638400.
Could you tell me how to address this error?
Expected behaviour
I expect the conversion to succeed.
Code to reproduce the issue
docker image: ultralytics/ultralytics:latest
git clone https://github.com/sony/model_optimization.git local_mct
cd local_mct
git checkout refs/tags/v2.0.0
pip install -r requirements.txt
[base code]
https://github.com/sony/model_optimization/blob/v2.0.0/tutorials/notebooks/keras/ptq/example_keras_yolov8n.ipynb
[add]
mct.set_log_folder('./loggerv2')
[modify]
I think tensorboard_writer.py doesn't support tf.image.combined_non_max_suppression, so I changed:
model = Model(model.input, outputs, name='yolov8n')
↓
model = Model(model.input, model.output, name='yolov8n')
I used an SKU-110K-trained model (epochs = 300):
https://docs.ultralytics.com/datasets/detect/sku-110k/#dataset-yaml
I used this pt file and loaded it as in
https://github.com/sony/model_optimization/blob/v1.11.0/tutorials/notebooks/example_keras_yolov8n.ipynb
I changed nc in yolov8n.yaml from 80 to 1.
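For context, the MCT calls follow the tutorial and look roughly like this (a minimal sketch, not the exact notebook code; the representative dataset generator is an illustrative placeholder, the real one iterates over SKU-110K images, and model is the converted Keras yolov8n):

import numpy as np
import model_compression_toolkit as mct

# Illustrative stand-in for the notebook's representative dataset generator.
def representative_data_gen():
    for _ in range(20):
        yield [np.random.rand(1, 640, 640, 3).astype(np.float32)]

# Enable the mixed precision search, as in the tutorial.
core_config = mct.core.CoreConfig(
    mixed_precision_config=mct.core.MixedPrecisionQuantizationConfig())

# Memory figures of the float model, used to derive the compression target.
resource_utilization_data = mct.core.keras_resource_utilization_data(
    model, representative_data_gen, core_config=core_config)

# Restrict quantized weights memory to 75% of the float baseline.
resource_utilization = mct.core.ResourceUtilization(
    resource_utilization_data.weights_memory * 0.75)

quant_model, quantization_info = mct.ptq.keras_post_training_quantization(
    model,
    representative_data_gen,
    target_resource_utilization=resource_utilization,
    core_config=core_config)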
Log output
CRITICAL:Model Compression Toolkit:The model cannot be quantized to meet the specified target resource utilization activation with the value 1638400.
Traceback (most recent call last):
File "/usr/src/ultralytics/myModelConv_product_02_kato_v2_board.py", line 187, in <module>
quant_model, _ = mct.ptq.keras_post_training_quantization(model,
File "/usr/src/ultralytics/../local_mct_v2/model_compression_toolkit/ptq/keras/quantization_facade.py", line 134, in keras_post_training_quantization
tg, bit_widths_config, _ = core_runner(in_model=in_model,
File "/usr/src/ultralytics/../local_mct_v2/model_compression_toolkit/core/runner.py", line 119, in core_runner
bit_widths_config = search_bit_width(tg,
File "/usr/src/ultralytics/../local_mct_v2/model_compression_toolkit/core/common/mixed_precision/mixed_precision_search_facade.py", line 126, in search_bit_width
result_bit_cfg = search_method_fn(search_manager,
File "/usr/src/ultralytics/../local_mct_v2/model_compression_toolkit/core/common/mixed_precision/search_methods/linear_programming.py", line 64, in mp_integer_programming_search
lp_problem = _formalize_problem(layer_to_indicator_vars_mapping,
File "/usr/src/ultralytics/../local_mct_v2/model_compression_toolkit/core/common/mixed_precision/search_methods/linear_programming.py", line 174, in _formalize_problem
_add_set_of_ru_constraints(search_manager=search_manager,
File "/usr/src/ultralytics/../local_mct_v2/model_compression_toolkit/core/common/mixed_precision/search_methods/linear_programming.py", line 231, in _add_set_of_ru_constraints
Logger.critical(
File "/usr/src/ultralytics/../local_mct_v2/model_compression_toolkit/logger.py", line 117, in critical
raise Exception(msg)
Exception: The model cannot be quantized to meet the specified target resource utilization activation with the value 1638400.
This error doesn't occur in MCT v1.11.0, but with v1.11.0 I hit the error from #1055 instead.
linear_programming.py
ipdb> aggr_ru
[8400.0, 33600.0, 8400.0, 33600.0, 8400.0, 8400.0, 8400.0, 8400.0, 204800.0, 819200.0, 8400.0, 8400.0, 33600.0, 25600.0, 6400.0, 1600.0, 25600.0, 204800.0, 819200.0, 1600.0, 6400.0, 25600.0, 102400.0, 204800.0, 409600.0, 204800.0, 819200.0, 25600.0, 204800.0, 819200.0, 153600.0, 51200.0, 51200.0, 102400.0, 153600.0, 51200.0, 102400.0, 307200.0, 102400.0, 204800.0, 204800.0, 307200.0, 102400.0, 204800.0, 614400.0, 204800.0, 409600.0, 409600.0, 1228800.0, 819200.0, 409600.0, 307200.0, 102400.0, 204800.0, 204800.0, 614400.0, 409600.0, 204800.0, 204800.0, 51200.0, 102400.0, 153600.0, 51200.0, 51200.0, 51200.0, 102400.0, 102400.0, 409600.0, 102400.0, 102400.0, 102400.0, 102400.0, 102400.0, 204800.0, 204800.0, 204800.0, 819200.0, 204800.0, 204800.0, 204800.0, 204800.0, 204800.0, 409600.0, 409600.0, 819200.0, 819200.0, 1228800.0, 409600.0, 409600.0, 819200.0, 819200.0, 1638400.0, 3276800.0, 1228800.0, 102400.0, 409600.0, 102400.0, 409600.0, 102400.0, 204800.0, 102400.0, 102400.0, 204800.0, 409600.0, 409600.0, 819200.0, 1638400.0, 400.0, 1600.0, 6400.0, 25600.0, 6400.0, 1600.0, 25600.0, 102400.0, 409600.0, 25600.0, 102400.0, 409600.0, 25600.0, 102400.0, 409600.0, 25600.0, 102400.0, 409600.0]
ipdb> p v
3276800.0
(The maximal value in aggr_ru is 3276800, which is larger than the 1638400 target.)
Hi @YoshikiKato0220 ,
According to the error message you are getting, it seems that you are trying to run mixed precision quantization to quantize the model's activations to a specific target memory size. The tutorial you are running is not set up for this type of quantization, so I need to know whether you made any changes to the code that calls MCT, in order to figure out what the problem could be.
If you did try to run activation mixed precision, the problem may be that you provided MCT with a memory restriction that is too low, such that the maximal activation tensor's memory cannot be reduced to the specified target. If that is the case, I suggest trying a looser restriction on the activation memory size, for example:
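(A sketch of what I mean; keyword names follow the MCT 2.0.0 API, and np.inf means no activation restriction at all, which is also the default.)

import numpy as np

# Keep the tutorial's weights restriction but drop / loosen the activation one.
resource_utilization = mct.core.ResourceUtilization(
    weights_memory=resource_utilization_data.weights_memory * 0.75,
    activation_memory=np.inf)  # or any value large enough for the maximal activation tensor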
Let us know if this is helpful and if you need any other help with this issue.
Hi @ofirgo , thanks for your comment. I only changed the code for SKU-110K (instead of COCO).
Then I changed the code in accordance with your advice:
resource_utilization = mct.core.ResourceUtilization(resource_utilization_data.weights_memory * 0.75)
↓
resource_utilization = mct.core.ResourceUtilization(resource_utilization_data.weights_memory * 0.75, 3276800)
It worked. Thank you.
Could you tell me the effect of increasing activation_memory? Does the quantized model end up with a lower compression ratio or lower accuracy?
I'm glad to hear this solves the issue for you. Still, I'm not sure what caused the original problem, since you say you changed nothing in the tutorial code besides the dataset. Can you explain where the activation memory restriction of 1638400 in your original code came from? If you didn't provide it, the expected behavior is that activation memory is not restricted, and you shouldn't have seen the error.
What you did by modifying the resource_utilization call is add a restriction of 3276800 on the memory of the maximal activation tensor during model inference. Since this is the size of the actual maximal activation tensor (according to your previous message), the modification didn't affect the accuracy or the memory of the quantized model.
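To make the effect concrete (a sketch; the 0.75 factor below is arbitrary, and whether any target below the maximal tensor size is feasible depends on the lowest activation bit-width candidates, which is why 1638400 failed in your log):

# No effect: the target equals the actual maximal activation tensor size,
# so it imposes no effective restriction.
ru_noop = mct.core.ResourceUtilization(
    resource_utilization_data.weights_memory * 0.75, 3276800)

# A lower target forces activation mixed precision to pick smaller bit-widths for
# some tensors, reducing activation memory at a potential cost in accuracy; if even
# the minimal bit-widths cannot reach it, MCT raises the error from your log.
ru_tight = mct.core.ResourceUtilization(
    resource_utilization_data.weights_memory * 0.75, 3276800 * 0.75)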
I'm keeping the issue open for now. I would appreciate more details on how you originally called MCT (the call that caused the issue).
Thank you for raising this issue and for your help.
Hi @ofirgo , I'm sorry to have troubled you; my code was wrong.
I had passed resource_utilization_data instead of resource_utilization as the third argument of keras_post_training_quantization.
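In other words, I had effectively written (a sketch from memory, not my exact script; argument names follow the MCT 2.0.0 facade):

# Wrong: the measured utilization data itself became the target, so its activation
# figure (1638400) acted as a hard activation memory restriction.
quant_model, _ = mct.ptq.keras_post_training_quantization(
    model,
    representative_data_gen,
    target_resource_utilization=resource_utilization_data)

# Right: pass the ResourceUtilization target built from that data.
quant_model, _ = mct.ptq.keras_post_training_quantization(
    model,
    representative_data_gen,
    target_resource_utilization=resource_utilization)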