Detection algorithm for invoice based on improved CenterNet

Application of Electronic Technique (电子技术应用)

Wan Chengkai1, Li Jupeng2

1. Beijing Century Real Technology Co., Ltd.; 2. School of Electronic and Information Engineering, Beijing Jiaotong University

Abstract: To improve the accuracy and efficiency of invoice detection, an invoice detection algorithm based on CenterNet is proposed. First, the model adopts a backbone similar to CSPDarkNet: Triplet Attention is introduced into the CSP structure to form a TA-CSP structure, and ASPP is added at the end of the backbone to enlarge the network's receptive field so that the model can better capture the contextual information of the image. Second, in the Neck of the network, CBAM guides the fusion of high-level and low-level features: semantic information in the high-level feature maps supervises the low-level feature maps in order to suppress their background noise. Third, in the Head of the network, the algorithm adds four feature-map output channels to those of CenterNet, so that invoice orientation is predicted at the same time as invoices are detected. Finally, an orientation loss term is added to the loss function so that the orientation prediction is optimized during training. Experimental results on the test dataset show that the proposed algorithm reaches an mAP of 84.3%, outperforming both CenterNet and YOLOv5s, and effectively improves the accuracy and robustness of invoice detection.

Keywords: CenterNet; YOLO; object detection; CBAM; ASPP (atrous spatial pyramid pooling); Triplet Attention

CLC number: TP391.41; U418.6 Document code: A DOI: 10.16157/j.issn.0258-7998.245560
Citation format (Chinese): 万成凯，李居朋. 基于改进CenterNet的发票检测算法[J]. 电子技术应用，2025，51(6)：71-78.
Citation format (English): Wan Chengkai, Li Jupeng. Detection algorithm for invoice based on improved CenterNet[J]. Application of Electronic Technique, 2025, 51(6): 71-78.
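The abstract's statement that ASPP enlarges the receptive field can be made concrete with a little arithmetic. The sketch below computes how many input pixels a single dilated (atrous) convolution kernel spans. The 3x3 kernel and the dilation rates 6, 12 and 18 are assumptions taken from commonly used ASPP configurations; this page does not state the rates used in the paper, and the function names are hypothetical.

```python
from typing import Dict, Sequence

# Assumed dilation rates (common in ASPP setups); not taken from the paper.
ASSUMED_ASPP_DILATIONS = (6, 12, 18)


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Number of input pixels spanned along one axis by a dilated kernel.

    A dilation of d inserts (d - 1) gaps between adjacent kernel taps,
    so the span grows without adding any parameters.
    """
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def aspp_branch_spans(
    kernel_size: int = 3,
    dilations: Sequence[int] = ASSUMED_ASPP_DILATIONS,
) -> Dict[int, int]:
    """Map each dilation rate to the span of its branch's kernel."""
    return {d: effective_kernel_size(kernel_size, d) for d in dilations}


if __name__ == "__main__":
    for dilation, span in aspp_branch_spans().items():
        print(f"dilation={dilation}: 3x3 kernel spans {span}x{span} pixels")
```

Under these assumed rates, the parallel branches see neighbourhoods of different sizes on the same feature map, which is the sense in which ASPP gathers multi-scale context.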

Introduction

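As society continues to develop, entering and archiving large volumes of invoices has become a heavy burden for finance staff. Traditionally, finance staff have entered invoices by hand, which is not only inefficient but also prone to fatigue-induced errors that cause losses. With the rise of image processing and deep learning, a growing number of researchers have begun to study automatic invoice recognition algorithms based on digital image technology [1-2].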

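Automatic invoice recognition based on digital image technology usually comprises invoice detection, invoice information-region localization, character localization, and character recognition. The first of these steps is invoice detection, which determines whether an image contains any invoices and precisely localizes each one that is present. Because invoices in real financial archiving work may face up, down, left, or right, invoice detection must determine not only the type and position of each invoice but also its orientation.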
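As an illustration of what such a detector has to return, the following minimal sketch models one detection as a record holding type, position, and orientation. The class names, the field layout, and the convention that the orientation names the direction in which the top edge of the invoice points are all assumptions made for this example, not definitions from the paper.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Tuple


class Orientation(IntEnum):
    """Direction in which the top edge of the invoice points (assumed convention)."""
    UP = 0
    RIGHT = 1
    DOWN = 2
    LEFT = 3


@dataclass(frozen=True)
class InvoiceDetection:
    """One detected invoice: type, bounding box, orientation, confidence."""
    invoice_type: str
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels
    orientation: Orientation
    score: float


def clockwise_degrees_to_upright(orientation: Orientation) -> int:
    """Clockwise rotation that brings the invoice back to the UP orientation."""
    return (360 - 90 * int(orientation)) % 360


if __name__ == "__main__":
    detection = InvoiceDetection("vat_invoice", (40.0, 60.0, 520.0, 380.0),
                                 Orientation.LEFT, 0.93)
    print(clockwise_degrees_to_upright(detection.orientation))
```

A downstream step such as character recognition could use the returned angle to rotate the cropped invoice upright before reading it.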

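Current deep-learning-based object detection methods fall into two categories: one-stage and two-stage. Two-stage methods, such as Faster R-CNN [3], split the detection process into two stages. In the first stage, the algorithm finds regions that may contain objects; in the second stage, it performs classification and position regression on those candidate regions. These methods achieve high detection accuracy but usually run slowly, which makes it difficult to meet real-time detection requirements.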

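One-stage methods include the YOLO series [4-8] and SSD [9]. These methods perform detection end to end, directly regressing the class and position of each object. They are simple and fast and are now widely used in a variety of object detection tasks, but their accuracy still leaves room for improvement.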

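The YOLO series, SSD, and similar detectors are anchor-based methods: prior knowledge such as anchor sizes and aspect ratios must be collected in advance, and a large number of useless candidate boxes are computed during inference. Although redundant candidate boxes can be removed afterwards by methods such as non-maximum suppression (NMS), this adds computational overhead. Anchor-free detectors, represented by CenterNet [10], overcome these drawbacks of anchor-based methods by directly predicting the center point and size of each object.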
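To make the anchor-free idea concrete, here is a minimal sketch of how object centers can be read off a center heatmap: a cell is kept when its score is at least the threshold and no cell in its 3x3 neighbourhood scores higher. The nested-list heatmap, the function name, and the threshold value are assumptions for illustration; this is a simplified stand-in for the peak-picking step of a CenterNet-style detector, not the paper's implementation.

```python
from typing import List, Tuple


def find_center_peaks(
    heatmap: List[List[float]],
    threshold: float = 0.3,
) -> List[Tuple[int, int, float]]:
    """Return (row, col, score) for 3x3 local maxima scoring at least `threshold`.

    Results are sorted by descending score. No anchor boxes and no
    box-overlap suppression are involved: each peak is one candidate center.
    """
    height = len(heatmap)
    width = len(heatmap[0]) if height else 0
    peaks = []
    for r in range(height):
        for c in range(width):
            score = heatmap[r][c]
            if score < threshold:
                continue
            # Scores in the 3x3 window around (r, c), clipped at the borders.
            window = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(height, r + 2))
                for cc in range(max(0, c - 1), min(width, c + 2))
            ]
            if score >= max(window):
                peaks.append((r, c, score))
    peaks.sort(key=lambda peak: peak[2], reverse=True)
    return peaks


if __name__ == "__main__":
    toy_heatmap = [
        [0.1, 0.2, 0.1, 0.0],
        [0.2, 0.9, 0.3, 0.0],
        [0.1, 0.3, 0.2, 0.6],
        [0.0, 0.0, 0.1, 0.2],
    ]
    print(find_center_peaks(toy_heatmap))
```

On the toy heatmap, the cells with scores 0.9 and 0.6 are reported as centers, while the 0.3 cells next to the 0.9 peak are discarded because a neighbour scores higher.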

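This paper combines the respective strengths of the YOLOv5 backbone and CenterNet to propose an improved CenterNet invoice detection algorithm. The model uses a backbone modeled on CSPDarkNet, introduces attention mechanisms, and adopts a new output structure and loss function, so that it can detect the class, position, and orientation of invoices end to end.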
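The following sketch shows one way the outputs at a detected center could be decoded, and one plausible form for the added orientation loss. Several things here are assumptions rather than facts from the paper: the four orientation channels are treated as logits in the order up, right, down, left; the orientation loss is taken to be softmax cross-entropy; the output stride is 4; and offsets and sizes are expressed in heatmap cells. All function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

# Assumed meaning and order of the four orientation channels.
ORIENTATIONS = ("up", "right", "down", "left")


def decode_box(
    col: int,
    row: int,
    offset: Tuple[float, float],
    size: Tuple[float, float],
    stride: int = 4,  # assumed output stride
) -> Tuple[float, float, float, float]:
    """Turn a heatmap peak plus offset and size into (x_min, y_min, x_max, y_max)."""
    center_x = (col + offset[0]) * stride
    center_y = (row + offset[1]) * stride
    width, height = size[0] * stride, size[1] * stride
    return (center_x - width / 2, center_y - height / 2,
            center_x + width / 2, center_y + height / 2)


def softmax(logits: Sequence[float]) -> list:
    """Numerically stable softmax over a short list of logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def predict_orientation(logits: Sequence[float]) -> Tuple[str, float]:
    """Pick the most probable orientation and return it with its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ORIENTATIONS[best], probs[best]


def orientation_loss(logits: Sequence[float], target_index: int) -> float:
    """Assumed loss term: cross-entropy at one ground-truth center location."""
    return -math.log(softmax(logits)[target_index])


if __name__ == "__main__":
    print(decode_box(10, 5, (0.5, 0.25), (8.0, 4.0)))
    print(predict_orientation([0.1, 2.0, -1.0, 0.3]))
    print(orientation_loss([0.1, 2.0, -1.0, 0.3], target_index=1))
```

In a full training loss, a term like this would be weighted and summed with the heatmap, size, and offset terms; the weighting is not specified on this page.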

The full text of this paper can be downloaded from:

https://www.chinaaet.com/resource/share/2000006565

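Author information:

Wan Chengkai1, Li Jupeng2

(1. Beijing Century Real Technology Co., Ltd., Beijing 100085;

2. School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044)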

Originality statement: This content is original to the AET website; reproduction without authorization is prohibited.
