
A Method for Constructing and Validating a Multimodal Metaphor Dataset

  • Abstract: Metaphor serves to inspire understanding and to persuade. Metaphors increasingly appear as multimodal combinations of text, images, and video, so identifying the metaphorical semantics carried by multimodal content has research value for Internet content security. Because multimodal metaphor datasets are scarce, it is difficult to build research models, and current work therefore concentrates on text-based metaphor detection. To address this gap, the paper first constructs a new multimodal metaphor dataset annotated from the perspectives of image-text relation, metaphor occurrence, emotion expression, and author intention; second, it computes Kappa scores to assess agreement among the dataset's annotators; finally, it builds a multimodal metaphor detection model that fuses image attribute features, image object (entity) features, and text features with the help of a pre-trained model and an attention mechanism, thereby verifying the quality and value of the multimodal dataset. Experimental results show that a metaphor dataset annotated with emotion and intention improves metaphor detection performance, and that the interrelationships among multimodal information aid the understanding of metaphor.
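The abstract describes computing Kappa scores to check consistency among the dataset's annotators. The snippet below is a minimal sketch of such an agreement check for two annotators with binary metaphor/literal labels; the label lists and the use of scikit-learn's cohen_kappa_score are illustrative assumptions, not the authors' actual annotation pipeline.

```python
# Minimal sketch of inter-annotator agreement via Cohen's kappa:
# kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
# and p_e is the agreement expected by chance.
from sklearn.metrics import cohen_kappa_score

# Hypothetical binary labels (1 = metaphorical, 0 = literal) assigned by
# two annotators to the same image-text samples.
annotator_a = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
annotator_b = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.3f}")
```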

     

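The detection model in the abstract fuses image attribute features, image object (entity) features, and text features through a pre-trained model and an attention mechanism. The PyTorch sketch below illustrates one plausible shape for such a fusion module; the dimensions, pooling, and cross-modal attention layout are assumptions for illustration rather than the paper's implementation.

```python
import torch
import torch.nn as nn

class MetaphorFusionClassifier(nn.Module):
    """Toy fusion head: text tokens attend over image attribute/object features."""

    def __init__(self, dim: int = 768):
        super().__init__()
        # Cross-modal attention: text features act as queries over visual features.
        self.attn = nn.MultiheadAttention(embed_dim=dim, num_heads=8, batch_first=True)
        self.classifier = nn.Sequential(nn.Linear(dim * 2, dim), nn.ReLU(), nn.Linear(dim, 2))

    def forward(self, text_feat, attr_feat, obj_feat):
        # text_feat: (B, Lt, D) token features from a pre-trained text encoder
        # attr_feat: (B, La, D) embedded image attribute labels
        # obj_feat:  (B, Lo, D) detected image object (entity) features
        visual = torch.cat([attr_feat, obj_feat], dim=1)                # (B, La+Lo, D)
        fused, _ = self.attn(text_feat, visual, visual)                 # (B, Lt, D)
        pooled = torch.cat([text_feat.mean(1), fused.mean(1)], dim=-1)  # (B, 2D)
        return self.classifier(pooled)                                  # metaphor vs. literal logits

# Hypothetical usage with random tensors standing in for encoder outputs.
model = MetaphorFusionClassifier()
logits = model(torch.randn(4, 32, 768), torch.randn(4, 5, 768), torch.randn(4, 10, 768))
print(logits.shape)  # torch.Size([4, 2])
```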
