A BERT-Based Position-Aware Method for Extracting Tourism Knowledge Triples

  • Abstract: Extracting triples directly from text often suffers from weak semantic links between entities, long distances between them, and polysemy. To address these problems, this paper proposes a two-stage, position-aware method for extracting tourism knowledge triples, built on the pre-trained BERT model. First, a BERT-Span model recognizes tourism entities by predicting their boundaries. Second, a relation extraction model is constructed that fuses position-aware attention with the types of the head and tail entities, drawing on the character, semantic, position, and entity-type features of the tourism data. Experiments on a Shanxi tourism dataset show that the proposed method achieves a higher F1 score than the baseline models.
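    The boundary-prediction step in the first stage can be illustrated with a minimal sketch. It assumes that a span model has already produced, for every character, one predicted start label and one predicted end label, with 0 meaning "no boundary" and positive integers standing for entity types. The function name decode_spans, the max_len cap, and the label ids are hypothetical choices made for illustration; the abstract does not state how the paper decodes spans.

```python
from typing import List, Tuple


def decode_spans(
    start_labels: List[int],
    end_labels: List[int],
    max_len: int = 20,  # hypothetical cap on entity length, in characters
) -> List[Tuple[int, int, int]]:
    """Pair each predicted start with the nearest end of the same type.

    Labels are per-character integers: 0 means no boundary, any positive
    value is an (assumed) entity-type id. Returns (start, end, type)
    triples with inclusive indices.
    """
    spans = []
    for i, start_type in enumerate(start_labels):
        if start_type == 0:
            continue  # this character does not open an entity
        # Scan forward for the first end boundary carrying the same type.
        for j in range(i, min(i + max_len, len(end_labels))):
            if end_labels[j] == start_type:
                spans.append((i, j, start_type))
                break
    return spans


# Toy input: type 1 and type 2 are placeholder entity types.
start = [0, 1, 0, 0, 2, 0]
end = [0, 0, 1, 0, 2, 0]
spans = decode_spans(start, end)
```

    With this toy input the sketch yields one two-character entity of type 1 and one single-character entity of type 2. Nearest-end pairing is only one simple heuristic; other span decoders score all start and end pairs jointly.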

     
