Urban Waterlogging Information Recognition Method Based on MacBERT and Adversarial Training
Abstract: Methods that combine BERT with neural network models are increasingly used to extract disaster information, but they suffer from large parameter counts, inconsistency between the pre-training dataset and the fine-tuning dataset, and local instability. To address these problems, this paper proposes an information recognition model based on MacBERT and adversarial training. The model uses the MacBERT pre-trained model to obtain initial vector representations, adds small perturbations to them to generate adversarial samples, and then feeds the result through a bidirectional long short-term memory (BiLSTM) network followed by a conditional random field (CRF). This design not only reduces the number of pre-training runs and the discrepancy at the fine-tuning stage but also improves the robustness of the model. Experimental results on a Weibo (microblog) dataset and the 1998 People's Daily dataset show that the proposed model improves both precision and F1 score and outperforms the other models compared, which indicates that applying the model to urban waterlogging information recognition is feasible.
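The perturbation step can be illustrated with a small sketch. The abstract does not say which adversarial training scheme is used, so the code below assumes an FGM-style rule, in which each embedding is shifted along the loss gradient by a step of fixed L2 norm. The function name `fgm_perturb`, the parameter `epsilon`, and the toy vectors are all hypothetical and are not taken from the paper. In a real model the gradients would come from backpropagating the loss to the MacBERT embedding layer.

```python
import math


def fgm_perturb(embeddings, gradients, epsilon=1.0):
    """Return embeddings shifted by an L2-normalised gradient step (FGM-style).

    Assumed rule (not confirmed by the paper): r = epsilon * g / ||g||_2.
    """
    # Global L2 norm of the gradient over all token vectors.
    norm = math.sqrt(sum(g * g for row in gradients for g in row))
    if norm == 0.0:
        # No gradient signal: return an unchanged copy of the embeddings.
        return [list(row) for row in embeddings]
    scale = epsilon / norm
    # Add the scaled gradient to every embedding component.
    return [
        [e + scale * g for e, g in zip(e_row, g_row)]
        for e_row, g_row in zip(embeddings, gradients)
    ]


# Toy example: two tokens with 2-dimensional embeddings (made-up values).
emb = [[0.5, -0.2], [0.1, 0.3]]
grad = [[3.0, 0.0], [0.0, 4.0]]
adv = fgm_perturb(emb, grad, epsilon=0.5)
```

Under this assumption, the perturbed vectors `adv` would be passed to the BiLSTM and CRF layers together with the clean ones, so that the model is trained to give the same labels under small input shifts. The perturbation has total L2 norm `epsilon`, which keeps the adversarial sample close to the original.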