
Gait recognition method based on frequency domain attention spatio-temporal convolutional network

Information Technology and Network Security (《信息技术与网络安全》), 2020, Issue 6

Zhao Guoshun1,2, Fang Jianan1,2, Qu Binjie1,2, Samah A.F.Manssor1,2, Sun Shaoyuan1,2

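1. School of Information Science and Technology, Donghua University, Shanghai 201620, China; 2. Engineering Research Center of Digitized Textile & Apparel Technology, Ministry of Education, Shanghai 201620, China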

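Abstract: To address the problems that gait information is highly redundant, that feature importance is unevenly distributed, and that the spatiotemporal features of gait are difficult to learn, this paper proposes a spatiotemporal convolutional network based on frequency-domain attention for gait recognition. The method improves the three-dimensional convolutional network (C3D) for learning spatiotemporal features and introduces a frequency-domain attention convolution operation, which reduces redundant computation while the attention adjustment improves learning. The network first divides the gait information into five groups, then extracts spatiotemporal features with the improved convolution, and finally classifies with a Softmax layer. Tested on CASIA Dataset B (Institute of Automation, Chinese Academy of Sciences), the method reaches accuracies of 88.5% and 92.8% in the Bag and Coat conditions respectively, each about 3% higher than a traditional deep convolutional network (Deep CNN). Because attention is redistributed during network learning, accuracy across viewing angles also improves by about 2% on average.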

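Keywords: frequency domain; attention; 3D convolution; gait recognition; biometrics

CLC number: U491.1; Document code: A; DOI: 10.19358/j.issn.2096-5133.2020.06.003
Citation format: Zhao Guoshun, Fang Jianan, Qu Binjie, et al. Gait recognition method based on frequency domain attention spatio-temporal convolutional network[J]. Information Technology and Network Security, 2020, 39(6): 13-18.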

Gait recognition method based on frequency domain attention spatio-temporal convolutional network

Zhao Guoshun1,2, Fang Jianan1,2, Qu Binjie1,2, Samah A.F.Manssor1,2, Sun Shaoyuan1,2

1. School of Information Science and Technology, Donghua University, Shanghai 201620, China; 2. Engineering Research Center of Digitized Textile & Apparel Technology, Ministry of Education, Shanghai 201620, China

Abstract: To address redundant gait information, the uneven distribution of feature importance, and the difficulty of learning the spatiotemporal features of gait, a spatiotemporal convolutional network based on frequency-domain attention is proposed for gait recognition. The method improves the three-dimensional convolutional network (C3D) for learning spatiotemporal features. It also proposes a frequency-domain attention convolution operation, which reduces redundant computation, while the attention adjustment improves learning. The network first divides the gait information into five groups, then extracts spatiotemporal features through the improved convolution, and finally classifies them through a Softmax layer. Tested on CASIA Dataset B (Institute of Automation, Chinese Academy of Sciences), the accuracy rates in the Bag state and Coat state are 88.5% and 92.8% respectively, each about 3% higher than a traditional deep convolutional network (Deep CNN). In addition, because attention is redistributed during network learning, the accuracy at each viewing angle increases by about 2% on average.

Key words: frequency domain; attention; 3D convolution; gait recognition; biometrics; deep learning

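Gait features are, in plain terms, the posture and appearance of a person while walking, including changes in the body contour of the arms, thighs, and lower legs. Because gait can be captured without physical contact with the subject and without close proximity, it is applicable in a wide range of scenarios. Medical research indicates that each person's gait has its own form and is unique, so gait recognition offers a degree of security and does not easily lead to erroneous identification. Applied to intelligent surveillance, gait recognition can monitor people across many scenes, help prevent accidents, help identify criminal suspects, and save manpower and material resources.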

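At present there are two main approaches to gait recognition. The first is based on gait templates. It constructs gait features, such as geometric and numerical features like changes in joint positions and the rise-and-fall cycle of the center of gravity, compresses a person's walking video sequence into a template, and then recognizes a pedestrian by matching the similarity between the gait to be identified and the template. The second uses deep learning to extract gait information directly from the raw image sequence, learning high-dimensional spatiotemporal information with a deep neural network to match pedestrian gaits. It does not require extensive fine-grained feature construction and is an end-to-end recognition method.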
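The template-based approach described above can be illustrated with a minimal sketch. It is not the construction used in any particular paper: the per-frame feature vectors, the averaging step in `build_template`, and the cosine-similarity matching in `identify` are all assumptions chosen to keep the example small, and the subject names and numbers are made up.

```python
import math

def build_template(frames):
    """Compress a walking sequence into one template.

    Assumption: each frame is already a fixed-length feature vector,
    and the template is simply the element-wise mean over frames.
    """
    n = len(frames)
    dim = len(frames[0])
    return [sum(frame[i] for frame in frames) / n for i in range(dim)]

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def identify(probe_frames, gallery):
    """Return the gallery identity whose template is most similar to the probe."""
    probe = build_template(probe_frames)
    return max(gallery, key=lambda name: cosine_similarity(probe, gallery[name]))

# Hypothetical toy data: two enrolled subjects, three features per frame.
gallery = {
    "subject_a": build_template([[1.0, 0.0, 0.5], [0.8, 0.2, 0.5]]),
    "subject_b": build_template([[0.0, 1.0, 0.2], [0.1, 0.9, 0.3]]),
}
probe_frames = [[0.9, 0.1, 0.4], [1.0, 0.0, 0.6]]
best_match = identify(probe_frames, gallery)
print(best_match)
```

Averaging discards the order of the frames, which is one way to see why a template of this kind loses temporal information.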

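Although template-based methods achieve reasonable accuracy, their feature construction is complex and strongly affected by changes in viewing angle, environment, and clothing. These features also fail to capture spatiotemporal information, which limits accuracy. Deep learning methods are end-to-end, robust, and easy to apply, but because the models have a very large number of parameters, ensuring both accuracy and real-time performance becomes the key issue.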

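Building on deep learning, this paper improves the structure of the three-dimensional convolutional network (C3D) and proposes a frequency-domain attention convolution operation, which partitions the frequency-domain space and introduces frequency-domain convolution. A second contribution is the introduction of an attention mechanism, which makes the network focus more on the differences between gaits and adjusts the importance assigned across the gait distribution, improving learning. Evaluated on CASIA Dataset B, the method shows improvements in both the cross-view experiments and the method-comparison experiments.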
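The abstract outlines a pipeline of three stages: divide the gait information into five groups, extract features, and classify with a Softmax layer, with attention adjusting feature importance. The sketch below shows only that control flow under loud assumptions. The grouping is assumed to be contiguous in time, `group_feature` is a placeholder mean that stands in for the learned convolution, the attention scores are assumed to be the group features themselves, and `class_weights` is a hypothetical linear classifier. The frequency-domain convolution itself is not reproduced here, because this excerpt does not specify it.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def split_into_groups(frames, num_groups=5):
    """Split a sequence into num_groups contiguous chunks of near-equal size."""
    if len(frames) < num_groups:
        raise ValueError("need at least one frame per group")
    base, extra = divmod(len(frames), num_groups)
    groups, start = [], 0
    for g in range(num_groups):
        size = base + (1 if g < extra else 0)
        groups.append(frames[start:start + size])
        start += size
    return groups

def group_feature(group):
    """Placeholder for learned spatiotemporal convolution: the group mean."""
    return sum(group) / len(group)

def classify(frames, class_weights):
    """Group, reweight with attention, then classify with softmax."""
    features = [group_feature(g) for g in split_into_groups(frames)]
    # Assumption: attention scores are the features themselves.
    attention = softmax(features)
    attended = [a * f for a, f in zip(attention, features)]
    logits = [sum(w * x for w, x in zip(weights, attended))
              for weights in class_weights]
    return softmax(logits), attention

# Hypothetical toy input: one scalar per frame, two classes.
frames = [0.1, 0.2, 0.9, 1.0, 0.3, 0.4, 0.2, 0.1, 0.5, 0.6]
class_weights = [[1.0] * 5, [-1.0] * 5]
probs, attention = classify(frames, class_weights)
print(probs, attention)
```

In a real network both the attention scores and the classifier weights would be learned, so the numbers printed here carry no meaning beyond showing that each softmax output sums to one.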

The full text of this article can be downloaded at: http://www.chinaaet.com/resource/share/2000003146

