Construction of a Chinese Financial Event Relation Corpus

Abstract: Event relation recognition is an important task in information extraction and a key step in constructing event knowledge graphs. However, most existing Chinese event corpora for the financial domain are suitable only for event extraction, so a corpus suited to event relation recognition is of considerable value. Based on an analysis of the characteristics of financial news text, we formulate a set of annotation specifications for events and their relations, carry out the annotation with a purpose-built annotation tool, and construct a Chinese event relation corpus for the financial domain. The corpus contains 367 documents, 11 208 events, and 196 569 event relation pairs. An analysis of annotation consistency, together with event relation recognition experiments using deep learning models, indicates that the corpus has good usability.
