
Quantizing model at runtime during training results in non-learned quantization

Open · FSet89 opened this issue 4 years ago · 1 comment

Prior to filing: check that this should be a bug instead of a feature request. Everything supported, including the compatible versions of TensorFlow, is listed in the overview page of each technique. For example, the overview page of quantization-aware training is here. An issue for anything not supported should be a feature request.

Describe the bug
I want to perform quantization-aware training. I start training with a non-quantized model and apply quantization at a certain epoch. The training goes well, with good validation accuracy. However, when I load the trained model to run some tests, it appears that the quantization parameters have not been learned and the quantization is not properly applied to the input, resulting in black images and very bad results.
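For reference, this is roughly how I load a checkpoint for testing (a minimal sketch; the file name is a placeholder). As far as I know, the quantize wrappers need quantize_scope() to deserialize:

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# quantize_scope() lets Keras deserialize the QAT wrapper layers;
# "step-2000.h5" is a placeholder checkpoint name
with tfmot.quantization.keras.quantize_scope():
    loaded_model = tf.keras.models.load_model("step-2000.h5")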

System information

TensorFlow version (installed from source or binary): 2.4.1 binary

TensorFlow Model Optimization version (installed from source or binary): 0.5.0

Python version: 3.6.9

Describe the expected behavior
If the quantization is applied at runtime, the quantization parameters should be learned just as if the model had been quantized from the start of training.

Describe the current behavior
When the quantization is applied at runtime, the training goes well, but the saved model does not quantize the input properly.

Code to reproduce the issue
Relevant code attached.

import os

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# hyperparameters and paths (values here are placeholders)
epochs = 50
quantization_epoch_start = 10
hparam_ckpt_path = "checkpoints"

@tf.function
def train_step(x, y):
    with tf.GradientTape() as tape:
        probs = model(x, training=True)
        loss_value = loss_fn(y, probs)

    grads = tape.gradient(loss_value, model.trainable_weights)
    optimizer.apply_gradients(zip(grads, model.trainable_weights))
    train_acc_metric.update_state(y, probs)
    return loss_value, probs

def custom_quantization(layer):
    # annotate every layer for quantization-aware training
    return tfmot.quantization.keras.quantize_annotate_layer(layer)


# load datasets (get_datasets and get_model are my own helpers)
train_dataset, val_dataset = get_datasets()

# define new model or load saved model
model = get_model()
loss_fn = tf.keras.losses.CategoricalCrossentropy()
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
train_acc_metric = tf.keras.metrics.CategoricalAccuracy()

total_step = 0
for epoch in range(1, epochs):
    if epoch == quantization_epoch_start:
        print("Quantizing model...")
        annotated_model = tf.keras.models.clone_model(model, clone_function=custom_quantization)

        with tfmot.quantization.keras.quantize_scope():
            model = tfmot.quantization.keras.quantize_apply(annotated_model)

    for step, (x_batch_train, y_batch_train) in enumerate(train_dataset):
        total_step += 1
        loss_value, scores = train_step(x_batch_train, y_batch_train)

        if total_step % 2000 == 0:
            print("saving model...")
            model.save(os.path.join(hparam_ckpt_path, "step-%d.h5" % total_step), save_format='h5')

FSet89 commented Oct 26 '21, 09:10

Hi @FSet89, sorry for the inconvenience. I just tried out your code on a very simple convolutional model and it seems to run fine. Are you running in eager mode?
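One thing worth checking (just a guess on my part): train_step is a tf.function that closes over the global model. A tf.function will not retrace simply because a closed-over Python variable is rebound, so after quantize_apply the previously traced graph may still be driving the old, non-quantized model. A sketch of one way around this is to rebuild the function whenever the model is swapped:

import tensorflow as tf

def make_train_step(model, optimizer, loss_fn, metric):
    # returns a fresh tf.function bound to this specific model, so swapping
    # models mid-training forces a new trace against the new variables
    @tf.function
    def train_step(x, y):
        with tf.GradientTape() as tape:
            probs = model(x, training=True)
            loss_value = loss_fn(y, probs)
        grads = tape.gradient(loss_value, model.trainable_weights)
        optimizer.apply_gradients(zip(grads, model.trainable_weights))
        metric.update_state(y, probs)
        return loss_value, probs
    return train_step

# after quantize_apply replaces the model:
# train_step = make_train_step(model, optimizer, loss_fn, train_acc_metric)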

The quantization variables should be set by assignment during execution of the model, i.e. during probs = model(x, training=True). There were some issues with multi-input calls before tfmot==0.6.0; could you try upgrading to 0.7.0, or show the output of the model summary before and after quantization?
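For example, something along these lines (a minimal sketch reusing the custom_quantization helper from your script) would show whether the quantize wrappers are actually present:

model.summary()  # layer names before quantization

annotated = tf.keras.models.clone_model(model, clone_function=custom_quantization)
with tfmot.quantization.keras.quantize_scope():
    quantized = tfmot.quantization.keras.quantize_apply(annotated)

quantized.summary()  # should now list QuantizeWrapper-wrapped layers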

Or if you could post the h5 models as well, that would be even more helpful.

daverim commented Nov 01 '21, 04:11