
Research on feature engineering algorithm based on reinforcement learning

Application of Electronic Technique, 2021, Issue 7

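Xie Bin1,2; Lin Shanling2,3; Lin Zhixian1,2; Guo Tailiang1,2

1. School of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, China; 2. China Fujian Optoelectronic Information Science and Technology Innovation Laboratory, Fuzhou 350116, China; 3. School of Advanced Manufacturing, Fuzhou University, Quanzhou 362200, China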

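Abstract: Feature engineering can automatically process and generate highly discriminative features without manual intervention. It is an unavoidable and crucial step in machine learning. This paper proposes a method based on reinforcement learning (RL) that formulates feature engineering as a Markov decision process (MDP) and, building on the Upper Confidence bounds applied to Trees (UCT) algorithm, presents an approximate method for solving the feature engineering problem on numerical data for binary classification, so that the best transformation strategy is obtained automatically. The effectiveness of the proposed method is verified on five public datasets, where the FScore improves by 9.032% on average, and the method is compared with other feature engineering methods that use finite element transformations. The method does yield highly discriminative features, improves the learning ability of the model, and achieves higher accuracy.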

Keywords: feature engineering; reinforcement learning; machine learning

CLC number: TP391
Document code: A
DOI: 10.16157/j.issn.0258-7998.201173
Chinese citation format: 谢斌，林珊玲，林志贤，等. 基于强化学习的特征工程算法研究[J]. 电子技术应用，2021，47(7)：29-32，43.
English citation format: Xie Bin, Lin Shanling, Lin Zhixian, et al. Research on feature engineering algorithm based on reinforcement learning[J]. Application of Electronic Technique, 2021, 47(7): 29-32, 43.

Research on feature engineering algorithm based on reinforcement learning

Xie Bin1,2; Lin Shanling2,3; Lin Zhixian1,2; Guo Tailiang1,2

1. School of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, China; 2. China Fujian Optoelectronic Information Science and Technology Innovation Laboratory, Fuzhou 350116, China; 3. School of Advanced Manufacturing, Fuzhou University, Quanzhou 362200, China

Abstract: Feature engineering can automatically process and generate highly discriminative features without manual intervention. It is an unavoidable and crucial part of machine learning. This article proposes a method based on reinforcement learning (RL) that treats feature engineering as a Markov decision process (MDP) and, building on the Upper Confidence bounds applied to Trees (UCT) algorithm, presents an approximate method for solving the feature engineering problem on numerical data for binary classification, so that the best transformation strategy is obtained automatically. The effectiveness of the proposed method is verified on five public datasets, where the FScore improves by an average of 9.032%. The method is also compared with other methods that use finite element transformations for feature engineering. It does obtain highly discriminative features, improves the learning ability of the model, and achieves higher accuracy.

Keywords: feature engineering; reinforcement learning; machine learning

0 Introduction

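Machine learning is widely used in daily life, and predictive analytics in particular supports decision-making in many fields, including fraud detection^[1-2], online advertising^[3-4], risk management, and marketing. A predictive model relies on a supervised learning algorithm: a classification or regression model is trained on historical data to predict unknown outcomes and thereby inform decisions. How the data are represented matters greatly for model accuracy, and the raw data space often represents the data poorly. Appropriate processing and transformation of the data before model building is therefore essential.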

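The main purpose of feature engineering is to modify the features used in predictive modeling so that they better suit the training algorithm, improving model accuracy by generating highly discriminative features. In practice, feature engineering is carried out manually by data scientists using domain knowledge, a process that is often tedious and time-consuming^[5] and prone to errors and bias.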
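
As a rough illustration of the formulation summarized in the abstract (feature engineering as an MDP explored with a UCT-style rule), the sketch below applies the UCB1 selection rule, which UCT uses at each tree node, to a single-level choice among transformations. It is not the authors' algorithm: the transformation names, the fixed reward table (a stand-in for the FScore measured after applying a transformation), and the exploration constant `c` are all assumptions made for illustration.

```python
import math

def ucb1(mean_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score of one action; unvisited actions are tried first."""
    if visits == 0:
        return math.inf
    return mean_reward + c * math.sqrt(math.log(parent_visits) / visits)

def select_action(stats, c=math.sqrt(2)):
    """Return the action with the highest UCB1 score.

    stats maps an action name to [total_reward, visits].
    """
    parent_visits = sum(visits for _, visits in stats.values())

    def score(action):
        total, visits = stats[action]
        mean = total / visits if visits else 0.0
        return ucb1(mean, visits, parent_visits, c)

    return max(stats, key=score)

def search(reward_fn, actions, n_iter=500, c=math.sqrt(2)):
    """Repeatedly select an action, observe its reward, and update statistics."""
    stats = {a: [0.0, 0] for a in actions}
    for _ in range(n_iter):
        action = select_action(stats, c)
        stats[action][0] += reward_fn(action)
        stats[action][1] += 1
    return stats

# Hypothetical rewards: a stand-in for the FScore measured after each transformation.
TOY_REWARD = {"log": 0.9, "square": 0.5, "sqrt": 0.2}

if __name__ == "__main__":
    stats = search(TOY_REWARD.get, list(TOY_REWARD))
    most_visited = max(stats, key=lambda a: stats[a][1])
    print(most_visited, {a: visits for a, (_, visits) in stats.items()})
```

With these toy rewards the search concentrates its visits on `log` while still sampling the other two transformations occasionally. In a realistic setting the reward would come from retraining and scoring a model after each transformation, which is far more expensive than a table lookup.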

The full text of this article can be downloaded from: http://www.chinaaet.com/resource/share/2000003649

