The Research of Short Texts Classification Based on Association Rules of Lexical Items

Fund Project: Shenzhen Knowledge Innovation Program, Basic Research Project (JCYJ20130401170306838)

    Abstract:

    Short texts, the main form of content on microblogs and other social media, are brief and have sparse features, so traditional text classification methods cannot classify them with high accuracy. To address this problem, this paper proposes a short text classification method based on lexical-item associations. Strong association rules are first mined from the training set and then added to the features of each short text, which increases feature density and in turn improves classification accuracy. Comparative experiments show that the method mitigates, to some extent, the effect of feature sparseness on classification results and improves accuracy, recall, and F1 score.
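
The two steps named in the abstract (mining strong association rules from the training set, then adding them to a short text's features) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the paper's method: the abstract does not say which mining algorithm or thresholds were used, so the sketch restricts itself to single-term rules `a -> b`, uses illustrative support and confidence thresholds, and the function names `mine_strong_rules` and `expand_features` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_strong_rules(docs, min_support=0.3, min_confidence=0.7):
    """Mine single-term rules a -> b (a simplification assumed for this sketch).

    A rule is kept when the pair {a, b} occurs in at least `min_support`
    of the documents and P(b | a) is at least `min_confidence`.
    Returns a dict mapping each antecedent to its set of consequents.
    """
    n = len(docs)
    term_count = Counter()
    pair_count = Counter()
    for doc in docs:
        terms = sorted(set(doc))  # count each term once per document
        term_count.update(terms)
        pair_count.update(combinations(terms, 2))

    rules = {}
    for (a, b), count in pair_count.items():
        if count / n < min_support:
            continue  # pair is not frequent enough
        for x, y in ((a, b), (b, a)):  # test both rule directions
            if count / term_count[x] >= min_confidence:
                rules.setdefault(x, set()).add(y)
    return rules


def expand_features(tokens, rules):
    """Append rule consequents for every token that is a rule antecedent."""
    expanded = list(tokens)
    seen = set(tokens)
    for token in tokens:  # one-hop expansion over the original tokens only
        for consequent in sorted(rules.get(token, ())):
            if consequent not in seen:
                seen.add(consequent)
                expanded.append(consequent)
    return expanded


# Toy, made-up training set of tokenized short texts.
train = [
    ["stock", "market", "rise"],
    ["stock", "market", "fall"],
    ["stock", "market", "index"],
    ["football", "match", "goal"],
    ["football", "match", "win"],
]
rules = mine_strong_rules(train, min_support=0.4, min_confidence=0.8)
print(expand_features(["stock", "news"], rules))
```

In this toy run the short text `["stock", "news"]` gains the term `market`, so a downstream classifier sees a denser feature vector. A fuller implementation would likely use a standard frequent-itemset miner such as Apriori or FP-growth and tune the thresholds on held-out data.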

Cite this article

Citing format
ZHANG Fang, YAN Huaju, LIU Mingjun, et al. The Research of Short Texts Classification Based on Association Rules of Lexical Items [J]. Journal of Integration Technology, 2015, 4(3): 69-78

History
  • Published online: 2015-05-29