Bugfix: dynamic-shape backedge for static input shape in the Loop operator (Intel GPU)
Fixed:
This PR fixes the Loop operator for the case where inputs with different shapes are fed, on different iterations, to a loop whose input shape is static.
Cause:
The offending commit added shape-predictor awareness to the function `set_memory_in_body_network`: when the memory buffer has been over-allocated by the shape predictor, the memory layout may have an unexpected shape, so when handling the backedge memory copy for the next iteration, the memory layout is re-interpreted according to the original layout.
But in this scenario in TF_Faster_RCNN_Inception_ResNet_v2, when the batch size is 2, the loop is not unrolled; each iteration handles one batch. As shown in the picture below, a broadcast is used to create an array, and each iteration writes one part of that array. With the `set_memory_in_body_network` function, the 2nd iteration's input carrying the generated array is cut off, which loses the first batch of data.
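A minimal NumPy sketch of this pattern (shapes and values are made up; this only mirrors the data flow, not the GPU plugin code): a broadcast creates the full two-batch buffer up front, each iteration writes one batch slice through the backedge, and a re-interpretation of the buffer back to the static single-batch layout would drop batch 0.

```python
import numpy as np

batch, feat = 2, 4
# Broadcast creates the full output array up front (the backedge buffer).
acc = np.broadcast_to(np.zeros(feat, dtype=np.float32), (batch, feat)).copy()

for i in range(batch):
    acc[i] = np.full(feat, i + 1, dtype=np.float32)  # iteration i writes batch i

# Correct backedge view keeps both batches:
assert acc.shape == (batch, feat)

# If the backedge copy re-interprets the buffer with the original static
# (single-batch) layout, only one slice survives and the first batch is lost:
truncated = acc[-1:]
assert truncated.shape == (1, feat)
```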
Solution:
The bugfix works in two places: first in graph generation and second at runtime.
In the graph-generation phase, the shape of the input primitive takes the backedge's from-node into account and is marked dynamic according to the from-node's shape. At runtime, `set_memory_in_body_network` proceeds according to the shapes on both sides of the backedge and compares them against the pre-allocated memory of SINGLE_SHARED type.
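The runtime decision above can be sketched as a tiny predicate (the function name and string-typed allocation tag are illustrative, not the actual GPU plugin API): a buffer is only re-interpreted in place when it is a SINGLE_SHARED pre-allocation and the shapes on both sides of the backedge agree; otherwise the from-node's shape is kept so data written by earlier iterations survives.

```python
def can_reinterpret(from_shape, to_shape, allocation_type):
    """Hedged sketch of the fixed set_memory_in_body_network check.

    Only a SINGLE_SHARED pre-allocation whose shapes agree on both sides
    of the backedge may be re-interpreted in place; any mismatch means the
    producer's (from-node) shape must be preserved.
    """
    return allocation_type == "SINGLE_SHARED" and from_shape == to_shape

# Matching shapes on a shared buffer: safe to re-interpret.
assert can_reinterpret((2, 4), (2, 4), "SINGLE_SHARED")
# Shape mismatch (the buggy truncation case): keep the from-node shape.
assert not can_reinterpret((2, 4), (1, 4), "SINGLE_SHARED")
```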
A test case is added to cover this behavior.
This test case fails on the offending commit (236e1062b290e2d2345f1d1c319e78f15e0a311d) and passes with the change in this PR.
Tickets:
- CVS-143684
1. functional test works after merge
When merged onto master (5a119fb2498f798571d58b0cb21bb8ede8bcf271), the functional test case added above passes.
2. existing issues on master commit:
However, a new error occurs in the e2e test for Faster_RCNN.
In the FP32 test:
```
    status = comparators.report_statuses()
>   assert status, "inferred model results != reference results"
E   AssertionError: inferred model results != reference results
E   assert False

test.py:234: AssertionError
```
In the FP16 test:
```python
    def apply(self, data):
        """Parse object detection data."""
        predictions = {}
        postprocessed = False
        target_layers = self.target_layers if self.target_layers else data.keys()
        dict_keys = ['class', 'prob', 'xmin', 'ymin', 'xmax', 'ymax']
        for layer in target_layers:
            predictions[layer] = []
            layer_data = np.squeeze(data[layer])
            # 1 detection leads to 0-d array after squeeze, which is not iterable
            if layer_data.ndim == 1:
                layer_data = np.expand_dims(layer_data, axis=0)
            assert len(layer_data.shape) <= 2, "Wrong data for postprocessing! Data length must be equal 2."
            for obj in layer_data:
                if type(obj) == np.float64:
                    log.debug(f" {obj} has type np.float64")
                    break
>               elif obj[0] == -1:
E               IndexError: index 0 is out of bounds for axis 0 with size 0

../utils/e2e/postprocessors/object_detection.py:63: IndexError
```
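A minimal repro of that IndexError (the `(1, 0)` input shape is an assumption chosen to reproduce the message, not taken from the actual model output): when a detection layer is empty, `np.squeeze` leaves a size-0 array, `expand_dims` turns it into rows of length 0, and `obj[0]` then raises exactly this error.

```python
import numpy as np

layer_data = np.zeros((1, 0))        # hypothetical empty detection layer
layer_data = np.squeeze(layer_data)  # -> shape (0,)
if layer_data.ndim == 1:
    layer_data = np.expand_dims(layer_data, axis=0)  # -> shape (1, 0)

err = ""
try:
    for obj in layer_data:           # one row of shape (0,)
        _ = obj[0]                   # same access as `elif obj[0] == -1`
except IndexError as e:
    err = str(e)

assert "size 0" in err               # "...out of bounds for axis 0 with size 0"
```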
3. e2e test on release/2024.3
Some other tests were run on other commits: the FP32 e2e test passes on release/2024.3, but the FP16 test fails the accuracy check on release/2024.3.
The code has been cleaned up, and both the featured e2e test and the added test case pass.
This PR will be closed in a week because of 2 weeks of no activity.
Please don't close the PR; a merge to the latest master, including a new bug fix, is in progress.
@timxu826 , please rebase code
Sorry for the late reply, will do ASAP thanks.
branch has been updated
This PR will be closed in a week because of 2 weeks of no activity.
The functional test fails because, when b_data_broadcast is set to a static shape (a scalar), the transformation pipeline optimizes the input of b_mul2 (a Multiply op) to a scalar, which truncates the changed shape (a vector) and produces a wrong result. A Reshape op is inserted to force the Multiply op to a broadcasted shape, bypassing the optimization (a diagram illustrating the flow has been drawn). The test now passes.
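The intent of that workaround can be sketched in NumPy terms (the names b_data_broadcast and b_mul2 come from the test graph, but the values and shapes here are made up; the real fix edits the OpenVINO test model, not NumPy code): a bare scalar constant can be folded away, while an explicitly reshaped rank-1 constant carries a real shape that must be broadcast at the Multiply.

```python
import numpy as np

b_data_broadcast = np.float32(2.0)             # scalar constant the pipeline would fold
reshaped = np.reshape(b_data_broadcast, (1,))  # the inserted Reshape: rank-1, shape (1,)

vec = np.arange(4, dtype=np.float32)           # vector flowing through the backedge
out = vec * reshaped                           # Multiply now broadcasts (1,) against (4,)

assert reshaped.shape == (1,)                  # explicit shape survives, unlike a scalar
assert out.shape == vec.shape                  # result keeps the vector shape
```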
build_jenkins