MediaPipePyTorch
Bug fix: incorrect pose results for horizontal hands
Issue
Hand pose estimation yields incorrect results that deviate from the true hand pose.
Reason
The HPE process follows a two-stage pipeline:
- The palm detector yields a bounding box for the palm, which serves as the basis for pose regression;
- The palm bbox is expanded to a hand bbox, and pose regression is run on that crop.
The problem lies in how the bbox is expanded.
In detection2roi(), the center point (the pink circle) of the hand bbox is the center of the palm bbox (the red box) plus a bias along the y-axis only. As a result, a vertical bias is applied to horizontal hands, where a horizontal bias is needed.
```python
def detection2roi(self, detection):
    if self.detection2roi_method == 'box':
        # compute box center and scale
        # use mediapipe/calculators/util/detections_to_rects_calculator.cc
        xc = (detection[:,1] + detection[:,3]) / 2
        yc = (detection[:,0] + detection[:,2]) / 2
        scale = (detection[:,3] - detection[:,1])  # assumes square boxes
        ...
    yc += self.dy * scale
    scale *= self.dscale
    ...
```
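To see the failure concretely, here is a small scalar sketch with hypothetical numbers (the real code operates on batched torch tensors) of what the y-only shift does to a horizontal hand:

```python
# Hypothetical numbers: a unit palm box centered at (0.5, 0.5) for a hand
# lying horizontally (fingers pointing along +x), with dy = 0.5.
dy, scale = 0.5, 1.0
xc, yc = 0.5, 0.5

# What the unmodified code does: shift the crop center along y only.
yc_shifted = yc + dy * scale   # -> 1.0: the center moves *below* the palm,
                               # but the fingers are to its *right*, so the
                               # expanded hand bbox misses most of the hand.
print(xc, yc_shifted)
```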
Solution
Use the hand's orientation to decide the direction of the bias. This change may affect other models, so be careful to apply it only to the hand model.
Replace detection2roi() with the code below:
```python
def detection2roi(self, detection):
    """Convert detections from the detector to an oriented bounding box.

    Adapted from:
    mediapipe/modules/face_landmark/face_detection_front_detection_to_roi.pbtxt
    The center and size of the box are calculated from the center
    of the detected box. Rotation is calculated from the vector
    between kp1 and kp2 relative to theta0. The box is scaled
    and shifted by dscale and dy.
    """
    # compute box center and scale
    # use mediapipe/calculators/util/detections_to_rects_calculator.cc
    xc = (detection[:,1] + detection[:,3]) / 2
    yc = (detection[:,0] + detection[:,2]) / 2
    scale = (detection[:,3] - detection[:,1])  # assumes square boxes

    # compute box rotation
    # for the palm model: kp1 = 0, kp2 = 2, i.e. keypoints 0 and 2
    x0 = detection[:,4+2*self.kp1]
    y0 = detection[:,4+2*self.kp1+1]
    x1 = detection[:,4+2*self.kp2]
    y1 = detection[:,4+2*self.kp2+1]
    theta = torch.atan2(y0-y1, x0-x1) - self.theta0

    # modified: shift the center along the hand's own orientation
    # instead of along the y-axis only
    dl = self.dy * scale
    dl_x = dl * torch.sin(-theta)
    dl_y = dl * torch.cos(-theta)
    xc += dl_x
    yc += dl_y
    scale *= self.dscale * 1.04
    return xc, yc, scale, theta
```
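A quick sanity check on the modified shift, written as a scalar sketch with `math` in place of torch tensors: for an upright hand (theta = 0) it reduces exactly to the original `yc += self.dy * scale` behavior, and for a hand rotated by 90 degrees the entire shift moves onto the x-axis:

```python
import math

def rotated_shift(dy, scale, theta):
    # scalar version of the fix: dl_x = dl*sin(-theta), dl_y = dl*cos(-theta)
    dl = dy * scale
    return dl * math.sin(-theta), dl * math.cos(-theta)

# upright hand: identical to the original y-only shift
dx, dy_ = rotated_shift(0.5, 1.0, 0.0)       # dx == 0.0, dy_ == 0.5

# hand lying on its side (rotated by -90 degrees): shift moves to x
dx2, dy2 = rotated_shift(0.5, 1.0, -math.pi / 2)  # dx2 == 0.5, dy2 ~ 0.0
```

So the fix is backward-compatible for upright hands while correcting the bias direction for rotated ones.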
Here is the corrected HPE result:
brilliant!