Noise Reduction Model for Relation Extraction Data Based on Distant Supervision

Abstract: To address the wrong-labeling problem in distant supervision, a new relation extraction model is proposed. The model consists of two parts: a label learner and a relation classifier. The label learner maps each reinforcement learning action to a relation label and uses a deep Q-network to explore the true label of each instance. The corrected labels and their sentences form a new dataset, which reduces the impact of noise on the model. A K-choice strategy is also proposed to alleviate the reward sparsity problem and improve relation extraction performance. In addition, during training, the contribution of each word to relation classification is computed in order to mine trigger words, which improves the accuracy of label prediction. Experimental results show that the model handles noise well and performs well on sentence-level relation classification.

     
