Feature Selection Based Classification Algorithm with Imbalanced Data

Fund Project:

Natural Science Foundation of Zhejiang Province (LY13F010011); Major Science and Technology Special Project of Zhejiang Province (2014NM002)

    Abstract:

    Imbalanced data classification is currently a research hotspot in machine learning. Traditional classification algorithms generally assume a balanced dataset and therefore cannot be applied directly to imbalanced data. To address this problem, this paper proposes an improved imbalanced boosting algorithm based on feature selection. The algorithm weighs how important the different types of attributes in a dataset are to the minority class samples and selects the attributes that are more meaningful for predicting the minority class, which also reduces the data dimension. It is then combined with an imbalanced classification algorithm to bring the data to a balanced state. Finally, because the weights of misclassified samples grow too fast in the original algorithm, a new scheme is proposed that effectively restrains the growth of these weights. Experimental results show that the proposed algorithm effectively improves classification performance on imbalanced data, especially for the minority class.
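
    The abstract does not state the selection criterion, the balancing method, or the exact weight-update rule, so the following is only a minimal Python sketch of the three stages under assumed choices, not the authors' implementation. The standardized class-mean gap used as a relevance score, random oversampling as the balancing step, the `cap` parameter, and all function names are illustrative assumptions.

```python
import math
import random


def minority_relevance(X, y, minority=1):
    """Score each feature by the gap between minority and majority class means,
    divided by the feature's overall standard deviation.
    ASSUMPTION: an illustrative criterion; the paper's criterion is not given."""
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        mean = sum(col) / len(col)
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col)) or 1.0
        mino = [row[j] for row, label in zip(X, y) if label == minority]
        majo = [row[j] for row, label in zip(X, y) if label != minority]
        gap = abs(sum(mino) / len(mino) - sum(majo) / len(majo))
        scores.append(gap / std)
    return scores


def select_top_k(X, scores, k):
    """Keep the k highest-scoring features (this also reduces dimensionality)."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    return [[row[j] for j in keep] for row in X], keep


def oversample_minority(X, y, minority=1, seed=0):
    """Duplicate random minority samples until both classes have equal size.
    ASSUMPTION: random oversampling stands in for the unspecified balancing step."""
    rng = random.Random(seed)
    mino = [i for i, label in enumerate(y) if label == minority]
    n_major = len(y) - len(mino)
    extra = [rng.choice(mino) for _ in range(max(0, n_major - len(mino)))]
    idx = list(range(len(y))) + extra
    return [X[i] for i in idx], [y[i] for i in idx]


def damped_weight_update(weights, misclassified, error, cap=3.0):
    """AdaBoost-style reweighting: misclassified samples are scaled by
    (1 - error) / error, then all weights are renormalized.
    ASSUMPTION: clipping the factor at `cap` is one simple way to slow the
    growth of misclassified-sample weights; the paper's rule may differ."""
    error = min(max(error, 1e-10), 1 - 1e-10)
    factor = min((1 - error) / error, cap)
    new = [w * factor if miss else w for w, miss in zip(weights, misclassified)]
    total = sum(new)
    return [w / total for w in new]
```

    With a weak-learner error of 0.05, the uncapped factor would be 19; with `cap=3.0` a misclassified sample among four equally weighted samples ends the round with weight 0.5 rather than about 0.86, which illustrates the kind of damping the abstract describes.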

Cite this article

Citing format
XIAO Ying, WU Zhefu, ZHANG Tong, et al. Feature Selection Based Classification Algorithm with Imbalanced Data[J]. Journal of Integration Technology, 2016, 5(1): 68-74

History
  • Published online: 2016-02-16