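《机电工程》 (Journal of Mechanical & Electrical Engineering), monthly
China Standard Serial Number: ISSN 1001-4551 CN 33-1088/TH
Sponsors: Zhejiang Provincial Mechanical and Electrical Group Co., Ltd.
Zhejiang University
Editor-in-Chief: CHEN Xiao
Associate Editors-in-Chief: TANG Renzhong, LUO Xiangyang (Executive Editor-in-Chief)
General Manager: LUO Xiangyang
Publisher: Zhejiang 《机电工程》 Magazine Co., Ltd.
Address: Rooms 211 and 212, 2nd Floor, Zhejiang Provincial Mechanical and Electrical Group Building, 95 Yan'an Road, Shangcheng District, Hangzhou
Tel: +86-571-87041360, 87239525
E-mail: meem_contribute@163.com
Overseas distribution: China International Book Trading Corporation
Subscription: post offices nationwide; overseas code: M3135
Domestic distribution: Zhejiang Provincial Newspaper and Periodical Distribution Bureau
Postal distribution code: 32-68
Advertising registration certificate: 杭上市管广发G-001号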
A two-step feature variable extraction method based on filter-wrapper*
Authors: CHEN Yan, LAI Haifeng, WANG Qing, WANG Weiwei  Date: 2010-06-28
(School of Automation, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, China)
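Abstract: Feature variable selection is central to the classification of high-dimensional data, and the two main approaches are filter methods and wrapper methods. Filter methods are independent of the classification algorithm, which makes it hard to optimize classification performance; wrapper methods depend on the classification algorithm and tend to overfit on high-dimensional, high-noise data. To extract feature variables effectively despite these limitations, a new feature extraction method, the filter-wrapper two-step method, is proposed. First, supervised singular value decomposition is used to reduce dimensionality and remove noise, coarsely selecting a set of candidate variables; then a Monte Carlo decision tree strategy is applied to pick the important feature variables from those candidates. The method was validated on typical high-dimensional, high-noise data, and the experimental results demonstrate its feasibility and effectiveness.
Keywords: supervised singular value decomposition; information gain; decision tree; feature extraction; classification
CLC number: TP391.4  Document code: A  Article ID: 1001-4551(2010)04-0067-05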
Two-step feature selection algorithm based on filter and wrapper
CHEN Yan, LAI Haifeng, WANG Qing, WANG Weiwei
(School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China)
Abstract: Feature selection is an essential step in classifying high-dimensional data. Filter methods are independent of the classifier and therefore do not optimize classification quality, whereas wrapper methods are supervised by the classifier and are prone to overfitting on high-dimensional, high-noise data. To address these problems, a new feature selection approach that combines filter and wrapper methods is proposed. First, supervised singular value decomposition is used to reduce the dimensionality of the variables and extract domain-relevant metafeatures; then Monte Carlo decision trees are used to select the important variables. The method is applied to three gene expression profile datasets and yields a better gene subset than other typical methods, which shows the feasibility and effectiveness of the proposed method.
Key words: supervised singular value decomposition (SSVD); information gain; decision tree; feature extraction; classification
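
The filter-then-wrapper idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the details of the supervised singular value decomposition or of the Monte Carlo decision tree strategy, so the sketch swaps in simpler stand-ins. Step 1 ranks features with a plain two-class separation score instead of SSVD, and step 2 repeatedly fits depth-one decision trees (stumps) by information gain on bootstrap samples and random candidate subsets, counting how often each feature is chosen. All function names, parameters, and the synthetic data are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_info_gain(values, labels):
    """Largest information gain over all threshold splits of one feature."""
    base, n, best = entropy(labels), len(labels), 0.0
    for t in sorted(set(values))[:-1]:  # skip the max so both sides are non-empty
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        gain = base - len(left) / n * entropy(left) - len(right) / n * entropy(right)
        best = max(best, gain)
    return best


def filter_score(values, labels):
    """Stand-in filter score (NOT SSVD): |mean difference| / pooled spread."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    spread = math.sqrt(sum((v - mean_a) ** 2 for v in a) / len(a)
                       + sum((v - mean_b) ** 2 for v in b) / len(b))
    return abs(mean_a - mean_b) / (spread + 1e-12)


def two_step_select(X, y, n_candidates, n_final, n_trials=300, subset_size=3, seed=0):
    """Hypothetical two-step selection: coarse filter, then Monte Carlo stumps."""
    rng = random.Random(seed)
    n_features = len(X[0])
    columns = [[row[j] for row in X] for j in range(n_features)]

    # Step 1 (filter): keep the top-scoring features as candidates.
    ranked = sorted(range(n_features),
                    key=lambda j: filter_score(columns[j], y), reverse=True)
    candidates = ranked[:n_candidates]

    # Step 2 (wrapper-like): on each trial, draw a bootstrap sample and a random
    # candidate subset, fit a stump, and record which feature it splits on.
    wins = Counter()
    for _ in range(n_trials):
        feats = rng.sample(candidates, min(subset_size, len(candidates)))
        rows = [rng.randrange(len(y)) for _ in range(len(y))]
        labels = [y[i] for i in rows]
        gains = {j: best_info_gain([columns[j][i] for i in rows], labels)
                 for j in feats}
        wins[max(gains, key=gains.get)] += 1
    return [j for j, _ in wins.most_common(n_final)]


def make_data(n_samples=60, n_features=30, seed=1):
    """Synthetic two-class data: features 0 and 1 are informative, the rest noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 4 * label
        row[1] -= 4 * label
        X.append(row)
        y.append(label)
    return X, y


X, y = make_data()
selected = two_step_select(X, y, n_candidates=6, n_final=2)
print("selected feature indices:", sorted(selected))
```

Counting stump selections across many random draws is one simple way to make the second step less sensitive to any single training sample, which is the overfitting concern the abstract raises about wrapper methods; the trial count and subset size here are arbitrary illustrative choices.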