Weiting Huang1,2, Pengfei Ren1,2*, Jingyu Wang1,2, Qi Qi1,2, Haifeng Sun1,2
1State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, 2EBUPT Information Technology Co., Ltd.

We introduce the adaptive weighting regression (AWR) method. Under the guidance of joint supervision, the weight distribution in the weight maps adjusts adaptively to achieve more accurate and robust performance. Top row: when the target joint is visible and easy to distinguish, the weight distribution of AWR tends to focus on the pixels around it, as standard detection-based methods do. Middle and bottom rows: when depth values around the target joint are heavily missing due to occlusion, or when the fingers are severely self-similar, the weight distribution spreads out to capture information from adjacent joints.
Overview

We propose an adaptive weighting regression (AWR) method that leverages the advantages of both detection-based and regression-based methods.

Guided by adaptive weight maps, AWR aggregates different regions of the dense representation through discrete integration over all of its pixels. This operation is differentiable, so it can be embedded into the network for end-to-end training and allows direct supervision on joint coordinates, which aligns the network’s supervision with its output.
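The aggregation step can be illustrated with a minimal NumPy sketch. It is not the released implementation: the function name `adaptive_weighted_regression`, the array shapes, and the use of a softmax to normalize the weight map are assumptions made for illustration. Each pixel carries its own estimate of the joint position, and the joint coordinate is the weighted sum of those estimates over all pixels.

```python
import numpy as np

def adaptive_weighted_regression(per_pixel_estimates, weight_logits):
    """Aggregate per-pixel joint estimates into a single 3D coordinate.

    per_pixel_estimates: (H, W, 3) array, each pixel's estimate of the joint position
                         (hypothetical layout of the dense representation).
    weight_logits:       (H, W) array of unnormalized weights for this joint.
    """
    h, w, _ = per_pixel_estimates.shape
    flat_logits = weight_logits.reshape(-1)
    # Softmax normalization (an assumption here) so the weights sum to 1.
    flat_logits = flat_logits - flat_logits.max()  # for numerical stability
    weights = np.exp(flat_logits)
    weights /= weights.sum()
    # Discrete integration: weighted sum of the estimates over all pixels.
    joint = weights @ per_pixel_estimates.reshape(h * w, 3)
    return joint, weights.reshape(h, w)

# Toy input: every pixel agrees on the joint position (1, 2, 3),
# so the result does not depend on how the weights are distributed.
estimates = np.tile(np.array([1.0, 2.0, 3.0]), (4, 4, 1))
logits = np.random.default_rng(0).normal(size=(4, 4))
joint, weights = adaptive_weighted_regression(estimates, logits)
```

Because the sketch uses only exponentiation, normalization, and a weighted sum, the same computation written in an automatic differentiation framework would pass gradients from a loss on `joint` back to both the weight map and the per-pixel estimates.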
Comparison with SOTA

On the HANDS 2017 dataset, our ResNet-18-based method already exceeds previous state-of-the-art methods by a large margin, and our ResNet-50-based method further improves the average mean joint error by 0.36 mm.

On the NYU, ICVL and MSRA 3D hand pose estimation datasets, our method outperforms all existing methods under either metric: the per-joint and all-joint mean error, or the proportion of good frames.
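For readers unfamiliar with these metrics, the sketch below shows how they are commonly computed. It assumes the usual definitions rather than the exact evaluation code behind the reported numbers: joint error is the Euclidean distance between predicted and ground-truth joints in millimetres, and a frame counts as good when its worst joint error is within a threshold. The function names and the `(frames, joints, 3)` array layout are hypothetical.

```python
import numpy as np

def joint_errors(pred, gt):
    """Euclidean error per frame and joint; pred and gt are (frames, joints, 3) in mm."""
    return np.linalg.norm(pred - gt, axis=-1)

def per_joint_mean_error(pred, gt):
    """Mean error of each joint, averaged over frames."""
    return joint_errors(pred, gt).mean(axis=0)

def all_joint_mean_error(pred, gt):
    """Mean error over all joints and all frames."""
    return joint_errors(pred, gt).mean()

def proportion_of_good_frames(pred, gt, threshold_mm):
    """Fraction of frames whose worst joint error is within the threshold."""
    worst_per_frame = joint_errors(pred, gt).max(axis=1)
    return float((worst_per_frame <= threshold_mm).mean())

# Toy data: 2 frames, 2 joints; only joint 0 in frame 0 is off, by 5 mm.
gt = np.zeros((2, 2, 3))
pred = gt.copy()
pred[0, 0] = [3.0, 4.0, 0.0]
```

On this toy data, the per-joint mean errors are 2.5 mm and 0 mm, the all-joint mean error is 1.25 mm, and half of the frames are good at a 4 mm threshold.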
Conclusion
1️⃣ We present an adaptive weighting regression (AWR) method that aggregates the dense representation through discrete integration.
2️⃣ AWR unifies the dense representation and hand joint regression to enable direct supervision on joint coordinates, narrowing the gap between training and inference.
3️⃣ Comprehensive experiments validate the improvement in the network’s accuracy and robustness brought by AWR, as well as its generality across various experimental settings.
4️⃣ The overall network is simple yet effective and achieves state-of-the-art performance on four publicly available datasets.
Bibtex
@inproceedings{awr,
title={AWR: Adaptive Weighting Regression for 3D Hand Pose Estimation},
author={Weiting Huang and Pengfei Ren and Jingyu Wang and Qi Qi and Haifeng Sun},
booktitle={AAAI Conference on Artificial Intelligence (AAAI)},
year={2020}
}