Dynamic Inquiry Window Guided Reply-to Relation Identification

Affiliation:

Institute of Information Engineering, Chinese Academy of Sciences, Beijing

Funding:

National Key Research and Development Program of China (2021YFB3100600)

    Abstract:

    In multi-party group conversations, identifying the reply-to relations between historical messages is an important task in the dialogue domain. Existing work has not yet addressed two issues concerning the data distribution. First, short messages occur most frequently, yet they carry sparse semantic information, which restricts what a model can learn. Second, positive examples (message pairs with a reply-to relation) are often far fewer than negative examples, which skews the training data and degrades the model's performance on positive examples. To address these two issues, this paper proposes an improved model based on a pre-trained language model. The method first mitigates the short-message issue by modeling a dynamic inquiry window that enriches the semantic context, and then tackles the scarcity of positive examples through position-driven optimization of positive example weights. Experimental results on a public dataset show that the model achieves a recall of 62.2% and an F1 score of 59.4%, which are on average 15.7% and 8.5% higher than the baseline models, respectively. In addition, the paper constructs a new dataset collected from the Telegram platform, providing data support for future related research.

Cite this article

Jingwen Zhang, Shiyao Cui, Xinghua Zhang, et al. Dynamic Inquiry Window Guided Reply-to Relation Identification[J]. Journal of Integration Technology.

History
  • Received: 2024-01-31
  • Last revised: 2024-02-29
  • Published online: 2024-07-05