Alignment Regression Hand Pose Estimation Network Based on Focused Attention Mechanism

doi:10.12146/j.issn.2095-3135.20241030001


DOI: 10.12146/j.issn.2095-3135.20241030001
CSTR: 32239.14.j.issn.2095-3135.20241030001
Author: DOU Mingyang, GENG Yanjuan, YANG Jiabin
CLC Number: TP399

Abstract:

Hand pose estimation from RGB images has broad applications in dynamic gesture recognition and human-computer interaction. However, hands exhibit high self-similarity and densely distributed keypoints, which makes it difficult for existing methods to achieve high-precision predictions at low computational cost and limits their performance in complex scenarios. To address these challenges, this paper proposes FAR-HandNet, a 2D hand pose estimation model built on the YOLOv8 network. The model integrates a focused linear attention module, a keypoint alignment strategy, and a regression residual fitting module. Together, these components strengthen feature capture in small target regions such as hands and mitigate the adverse effect of self-similarity on keypoint localization accuracy. The regression residual fitting module uses a flow-based generative model to fit the residual distribution of the keypoints, significantly improving regression precision. Experiments were conducted on the Carnegie Mellon University (CMU) Panoptic dataset and the FreiHAND dataset. The results show that FAR-HandNet has advantages in parameter count and computational efficiency and, compared with existing methods, achieves a higher percentage of correct keypoints (PCK) under varying thresholds, with an inference time of only 32 ms. Ablation studies further validate the effectiveness of each module.
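
The abstract reports accuracy as the percentage of correct keypoints under varying thresholds. As an illustration only, the Python sketch below shows one common way this metric is computed: a predicted keypoint counts as correct when its distance to the ground truth, divided by a per-image reference length, is within the threshold. The function name `pck`, the choice of reference length (for example, the hand bounding-box size), and the sample values are assumptions made for this sketch and are not taken from the paper.

```python
import numpy as np


def pck(pred, gt, thresholds, norm):
    """Percentage of correct keypoints (PCK) at each threshold.

    pred, gt : arrays of shape (N, K, 2) with 2D keypoint coordinates.
    norm     : array of shape (N,), a per-image reference length
               (assumed here; e.g. the hand bounding-box size).
    Returns a dict mapping each threshold to the fraction of keypoints
    whose normalized error is at most that threshold.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    norm = np.asarray(norm, dtype=float)

    # Euclidean error per keypoint, shape (N, K).
    dist = np.linalg.norm(pred - gt, axis=-1)
    # Normalize each image's errors by its reference length.
    ndist = dist / norm[:, None]

    return {float(t): float(np.mean(ndist <= t)) for t in thresholds}


# Hypothetical data: one image, four keypoints, pixel errors 0, 5, 10, 50.
gt = np.zeros((1, 4, 2))
pred = np.array([[[0.0, 0.0], [3.0, 4.0], [6.0, 8.0], [30.0, 40.0]]])

print(pck(pred, gt, thresholds=[0.02, 0.08, 0.2], norm=[100.0]))
# {0.02: 0.25, 0.08: 0.5, 0.2: 0.75}
```

Sweeping the threshold in this way yields the PCK curve against which methods are typically compared; the paper's exact normalization may differ.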

Citation:

DOU Mingyang, GENG Yanjuan, YANG Jiabin. Alignment Regression Hand Pose Estimation Network Based on Focused Attention Mechanism[J]. Journal of Integration Technology, 2025, 14(3): 64-77.

History

Received: October 30, 2024
Revised: March 11, 2025
Online: May 09, 2025
