Lightweight Human Keypoint Detection Algorithm Combined with Uncertainty Estimation

Application of Electronic Technique (AET)

Wang Yadong, Qin Huibin

(Institute of New Electron Device and Application, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, China)

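Abstract: Human keypoint detection has important applications in intelligent video surveillance, human-computer interaction, and related fields. Heatmap-based human keypoint detection algorithms depend on high-resolution heatmaps and consume substantial computational resources. To address this, a lightweight algorithm combined with uncertainty estimation is proposed. The algorithm uses low-resolution heatmaps and applies uncertainty estimation to predict the scale parameter of the prediction error distribution, which improves the reliability of the predictions. The scale parameter is also used to supervise and constrain the heatmap, which alleviates vanishing gradients and makes the network more robust. Experiments on the COCO dataset show that, compared with integral pose regression, the improved algorithm raises average precision by 3.3% while reducing resource usage.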

Keywords: human keypoint detection; uncertainty estimation; lightweight; integral pose regression

CLC number: TP391 Document code: A DOI: 10.16157/j.issn.0258-7998.233938
Chinese citation format: 王亚东，秦会斌. 结合不确定性估计的轻量级人体关键点检测算法[J]. 电子技术应用，2023，49(10)：40-45.
English citation format: Wang Yadong, Qin Huibin. Lightweight human key point detection algorithm with uncertainty[J]. Application of Electronic Technique, 2023, 49(10): 40-45.

Lightweight human key point detection algorithm with uncertainty

Wang Yadong, Qin Huibin

(Institute of New Electron Device and Application, Hangzhou Dianzi University, Hangzhou 310018, China)

Abstract: Human keypoint detection has important applications in intelligent video surveillance, human-computer interaction, and other fields. To address the problem that heatmap-based human keypoint detection algorithms depend on high-resolution heatmaps and consume large computational resources, a lightweight algorithm combined with uncertainty estimation is proposed. Using low-resolution heatmaps, uncertainty estimation predicts the scale parameter of the prediction error distribution, which improves the reliability of the prediction results. The scale parameter is then used to supervise and constrain the heatmap, alleviating vanishing gradients and enhancing the robustness of the network. Experiments on the COCO dataset show that, compared with integral pose regression, the average precision of the improved algorithm increases by 3.3% and resource usage is reduced.

Keywords: human keypoint detection; uncertainty estimation; lightweight; integral pose regression (IPR)

0 Introduction

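As society develops, surveillance video analysis is moving from manual to intelligent and from traditional to modern approaches. Human keypoint detection, also called human pose estimation [1], is an important step in human-centered video analysis. Human keypoints are joints and body parts with well-defined semantics, and they form an important basis for applications such as action recognition [2], human-computer interaction [3], and motion capture [4].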

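With the development of convolutional neural networks (CNNs), human keypoint detection has made significant progress and its accuracy has steadily improved. Human keypoint detection algorithms based on deep CNNs fall into two categories: detection methods based on heatmap representation and regression methods based on coordinate representation.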

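Since Tompson et al. [5] first proposed representing joints with heatmaps, detection methods have become the mainstream for 2D pose estimation. Sun et al. [6] proposed HRNet for the keypoint detection task. It maintains high-resolution feature maps throughout the network, connects multiple subnetworks of different resolutions in parallel, and exchanges and fuses information among them, which avoids losing or blurring information. Detection methods offer high accuracy, efficient training, and good spatial generalization. However, the heatmap has a lower resolution than the input image, which introduces quantization error, and the argmax operation used in decoding is not differentiable. As a result, detection methods depend on high-resolution heatmaps, which limits their use on embedded devices.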
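The quantization error mentioned above can be illustrated with a small NumPy sketch. It is not taken from the paper: the helper names (`make_heatmap`, `decode_argmax`, `decode_integral`), the 16x16 heatmap, the stride of 4, and the Gaussian target are all assumptions chosen for illustration. It compares argmax decoding, which can only return grid positions, with the expectation-based decoding used by integral pose regression.

```python
import numpy as np

def make_heatmap(cx, cy, size=16, sigma=1.0):
    """Gaussian heatmap on a size x size grid, peak at (cx, cy) in heatmap cells."""
    xs = np.arange(size, dtype=np.float64)
    ys = np.arange(size, dtype=np.float64)[:, None]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))

def decode_argmax(heatmap, stride):
    """Pick the highest-scoring cell; the result is restricted to grid positions."""
    row, col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return np.array([col, row], dtype=np.float64) * stride

def decode_integral(heatmap, stride):
    """Normalise the heatmap and take the expected (x, y) location."""
    prob = heatmap / heatmap.sum()
    xs = np.arange(heatmap.shape[1], dtype=np.float64)
    ys = np.arange(heatmap.shape[0], dtype=np.float64)
    x = (prob.sum(axis=0) * xs).sum()  # marginal over columns
    y = (prob.sum(axis=1) * ys).sum()  # marginal over rows
    return np.array([x, y]) * stride

stride = 4.0                       # assumed ratio of image size to heatmap size
true_xy = np.array([30.0, 26.0])   # keypoint in image pixels, between grid cells
hm = make_heatmap(true_xy[0] / stride, true_xy[1] / stride)

err_argmax = np.abs(decode_argmax(hm, stride) - true_xy).max()
err_integral = np.abs(decode_integral(hm, stride) - true_xy).max()
print(f"argmax error:   {err_argmax:.4f} px")
print(f"integral error: {err_integral:.4f} px")
```

In this setup the true keypoint falls halfway between heatmap cells, so argmax is off by half a cell (stride / 2 image pixels), while the expectation recovers the sub-cell position. The expectation is also a smooth function of the heatmap values, which is why it can be trained end to end, whereas argmax cannot.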

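Regression methods were studied early in human pose estimation, but related work is comparatively scarce. They produce keypoint coordinates directly from the image in an end-to-end manner. Toshev et al. [7] first proposed regressing coordinates with a CNN for human pose estimation. Carreira et al. [8] proposed Iterative Error Feedback (IEF), a framework that introduces top-down feedback to predict an offset from the current estimate and corrects it iteratively. Nie et al. [9] proposed the Single-stage Multi-person Pose Machine (SPM), which uses a root joint to predict each person's position and then predicts the offsets of the joints. Regression methods are simple, flexible, and efficient, but their performance still lags behind detection methods, with larger errors especially under occlusion, truncation, and motion blur.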
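The abstract describes predicting a scale parameter of the prediction error distribution alongside the coordinates. This excerpt does not state which distribution the paper uses or how the scale constrains the heatmap, so the following is only a minimal sketch under an assumed Laplace error model. The function name `laplace_nll` and the example values are hypothetical.

```python
import numpy as np

def laplace_nll(pred, target, scale, eps=1e-6):
    """Per-coordinate negative log-likelihood under an assumed Laplace error model.

    pred, target: arrays of shape (K, 2) holding K keypoints as (x, y).
    scale: positive array of the same shape, the predicted scale parameter b.
    """
    scale = np.maximum(scale, eps)  # guard against division by zero
    return np.log(2.0 * scale) + np.abs(pred - target) / scale

pred = np.array([[10.0, 12.0], [40.0, 8.0]])
target = np.array([[10.5, 12.0], [34.0, 8.0]])  # second keypoint is badly off in x

confident = laplace_nll(pred, target, np.full_like(pred, 0.5))  # small scale
hedged = laplace_nll(pred, target, np.full_like(pred, 6.0))     # large scale

print("x-loss, accurate keypoint:  ", confident[0, 0], hedged[0, 0])
print("x-loss, inaccurate keypoint:", confident[1, 0], hedged[1, 0])
```

Under this assumed loss, a small scale is rewarded when the coordinate error is small and penalized heavily when the error is large, so the predicted scale can act as a per-keypoint confidence. For a fixed error the loss is minimized when the scale equals the absolute error.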

The full text of this article can be downloaded from: https://www.chinaaet.com/resource/share/2000005711
