A malware behavior detection method based on the fusion of multi-scale CNN and Transformer

Cyber Security and Data Governance

Liu Shuai(1,2), Wang Xiaoying(1,2), Qi Panpan(1,2), Cui Fangfang(1,2), Gu Ruize(1,2)

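1. College of Computer Science and Engineering, University of Emergency Management; 2. Langfang Key Laboratory of Network Emergency Support and Cybersecurity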

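Abstract: Malware poses a severe threat because its behavioral traces are stealthy and its long-sequence dependencies are hard to model. To address this, a malware detection method is proposed that fuses a multi-scale convolutional neural network (CNN) with the Transformer architecture. The method first applies denoising and composite event tokenization to Speakeasy emulation logs, converting redundant logs into standardized semantic sequences. A multi-level CNN then extracts local attack behavior features, which are fed into a Transformer encoder that models global temporal dependencies through multi-head self-attention. Experimental results show that the hybrid model achieves an accuracy of 92.29% and an F1-score of 92.48% on the Speakeasy dataset. The method significantly reduces the false positive rate in sequence detection and offers a new technical approach to malware detection in complex network environments.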

Keywords: malware detection; convolutional neural network; Transformer; multi-scale feature extraction; dynamic behavior analysis

中图分类号：TP3095文献标志码：ADOI:10.19358/j.issn.2097-1788.2026.04.006
Citation format: Liu Shuai, Wang Xiaoying, Qi Panpan, et al. A malware behavior detection method based on the fusion of multi-scale CNN and Transformer [J]. Cyber Security and Data Governance, 2026, 45(4): 45-50.

A malware behavior detection method based on the fusion of multi-scale CNN and Transformer

Liu Shuai(1,2), Wang Xiaoying(1,2), Qi Panpan(1,2), Cui Fangfang(1,2), Gu Ruize(1,2)

1. College of Computer Science and Engineering, University of Emergency Management; 2. Langfang Key Laboratory of Network Emergency Support and Cybersecurity

Abstract: To address the severe threat posed by stealthy malware behavioral traces and the difficulty of modeling long-sequence dependencies, this paper proposes a detection method that fuses a multi-scale Convolutional Neural Network (CNN) with the Transformer architecture. First, the approach applies denoising and composite event tokenization to Speakeasy emulation logs, converting redundant logs into standardized semantic sequences. Next, it uses a multi-level CNN structure to extract local attack behavior features. These features are then fed into a Transformer encoder, which models global temporal dependencies through a multi-head self-attention mechanism. Experimental results show that the hybrid model achieves an accuracy of 92.29% and an F1-score of 92.48% on the Speakeasy dataset. The approach significantly reduces the false positive rate in sequence detection, providing a new technical pathway for malware detection in complex network environments.

Key words: malware detection; Convolutional Neural Network (CNN); Transformer; multi-scale feature extraction; dynamic behavior analysis

Introduction

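With the rapid development of Internet technology, malware continues to grow in number and its attack methods have become stealthier, which has made techniques based on dynamic behavior analysis [1] a key means of detecting unknown threats. Dynamic behavior logs are high-dimensional and large in scale, so extracting malicious features efficiently and identifying them accurately has become a key technical challenge that the network security field urgently needs to overcome.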

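Most current research relies on deep learning to automate this task, but a single network architecture still falls short on high-dimensional, long-sequence logs. A convolutional neural network (CNN) extracts local behavior features effectively through its sliding-window mechanism, yet the limited receptive field of its convolution kernels makes long-range semantic associations hard to model. The Transformer, based on self-attention, and the LSTM [2], based on gating, both perform well at modeling temporal dependencies, but on high-dimensional, long-sequence behavior data they still suffer from insufficient computational parallelism or weak local feature representation. In addition, the random noise that is common in emulation logs further reduces model robustness.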
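The receptive-field limitation can be made concrete with a small sketch. It is not the paper's implementation: the function names, the kernel sizes, and the toy embedding values are illustrative assumptions. It computes how many input positions a stack of 1-D convolutions can see, and contrasts that with one self-attention query, whose weights cover every position in the sequence.

```python
import math


def receptive_field(kernel_sizes, strides=None):
    """Receptive field (in input positions) of stacked 1-D convolutions."""
    strides = strides or [1] * len(kernel_sizes)
    field, jump = 1, 1
    for k, s in zip(kernel_sizes, strides):
        field += (k - 1) * jump  # each layer widens the view by (k - 1) * jump
        jump *= s                # striding spreads later layers further apart
    return field


def attention_weights(query, keys):
    """Scaled dot-product attention weights of one query over all keys."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Three stacked kernel-3 convolutions cover only a short window of events.
conv_span = receptive_field([3, 3, 3])

# A toy sequence of 50 two-dimensional event embeddings (made-up values).
sequence = [[math.sin(i), math.cos(i)] for i in range(50)]

# One attention query assigns a non-zero weight to all 50 positions.
weights = attention_weights(sequence[0], sequence)
```

Under these assumptions the convolution stack sees 7 neighbouring events, while the attention weights form a distribution over the whole 50-event sequence, which is the motivation for pairing the two components.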

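To address these problems, this paper proposes a malware detection method that combines a multi-scale CNN with a Transformer [3]. The method first applies core-behavior-based sequence denoising and a composite event tokenization strategy to filter out redundant noise. It then builds a hybrid network in which the multi-scale CNN [4] captures local attack features and the Transformer mines global long-range dependencies, ultimately achieving efficient identification of malicious code. Experiments on the Speakeasy dataset confirm the method's strong performance on long-sequence modeling: detection accuracy rises to 92.29% and the F1-score to 92.48%, maintaining high detection precision while reducing the number of missed detections, and providing an efficient and reliable technical solution for malware detection [5].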
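The preprocessing stage can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the `CORE_APIS` whitelist, the `{"api", "ok"}` event layout, the `API:status` token format, and all function names are hypothetical, and the event layout is not claimed to match Speakeasy's actual report format.

```python
from itertools import groupby

# Hypothetical whitelist of "core behavior" APIs; the paper's actual list is not given here.
CORE_APIS = {"CreateFileW", "WriteFile", "RegSetValueExW", "VirtualAlloc", "CreateProcessW"}

PAD, UNK = "<pad>", "<unk>"


def denoise(events, core_apis=CORE_APIS):
    """Keep only core-behavior events and collapse consecutive repeats."""
    kept = [e for e in events if e["api"] in core_apis]
    runs = groupby(kept, key=lambda e: (e["api"], e["ok"]))
    return [next(group) for _, group in runs]


def to_token(event):
    """Composite token: API name joined with a coarse success/failure flag."""
    return f'{event["api"]}:{"ok" if event["ok"] else "fail"}'


def build_vocab(token_sequences):
    """Assign integer ids in order of first appearance; 0 and 1 are reserved."""
    vocab = {PAD: 0, UNK: 1}
    for tokens in token_sequences:
        for token in tokens:
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(tokens, vocab, max_len):
    """Map tokens to ids, then truncate or right-pad to max_len."""
    ids = [vocab.get(t, vocab[UNK]) for t in tokens][:max_len]
    return ids + [vocab[PAD]] * (max_len - len(ids))


# Made-up emulation log for illustration only.
log = [
    {"api": "GetTickCount", "ok": True},   # noise: not a core behavior
    {"api": "VirtualAlloc", "ok": True},
    {"api": "VirtualAlloc", "ok": True},   # consecutive repeat, collapsed
    {"api": "WriteFile", "ok": False},
    {"api": "CreateProcessW", "ok": True},
]
tokens = [to_token(e) for e in denoise(log)]
vocab = build_vocab([tokens])
ids = encode(tokens, vocab, max_len=6)
```

The fixed-length id sequences produced this way are the kind of standardized input that an embedding layer followed by convolutional and Transformer layers would consume. In practice the whitelist, the token granularity, and the maximum length would all be tuned on the dataset.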

The full text of this article is available for download at:

https://www.chinaaet.com/resource/share/2000007058

Author information:

Liu Shuai(1,2), Wang Xiaoying(1,2), Qi Panpan(1,2), Cui Fangfang(1,2), Gu Ruize(1,2)

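(1. College of Computer Science and Engineering, University of Emergency Management, Langfang, Hebei 065201;

2. Langfang Key Laboratory of Network Emergency Support and Cybersecurity, Langfang, Hebei 065201)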

Originality statement: This content is original to the AET website; reproduction without authorization is prohibited.
