Abstract:
To address low recognition accuracy and the difficulty of determining entity boundaries in named entity recognition for shipping cargo mails, this paper proposes a named entity recognition model that combines deep learning with rules. The deep learning component builds on BiLSTM-CRF (bidirectional long short-term memory with a conditional random field), adds character-level features of words, and uses a multi-head attention mechanism to capture long-distance dependencies in the text. The rule matching component applies rules written according to the characteristics of domain entities to complete the recognition. Based on the characteristics of shipping cargo mails, the corpus is annotated with five entity categories: cargo name, quantity, loading and discharge port, laycan, and commission. A series of comparative experiments is conducted on a self-built corpus of shipping cargo texts. The experimental results show that the model achieves an F1 value of 79.3% on entity recognition in shipping cargo mails.
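As an illustration of the rule matching idea only, the following minimal Python sketch tags three of the five categories with regular expressions. The patterns, the label names, the function name match_entities, and the sample sentence are all assumptions made for this example; the abstract does not state the actual rules used in the paper.

```python
import re

# Hypothetical patterns for illustration; these are NOT the paper's rules.
RULES = {
    # e.g. "25,000 MT", "1500 CBM"
    "QUANTITY": re.compile(
        r"\b\d[\d,]*(?:\.\d+)?\s?(?:MT|MTS|TONS|CBM)\b", re.IGNORECASE
    ),
    # e.g. "10-15 MAR" (a day range followed by a month abbreviation)
    "LAYCAN": re.compile(
        r"\b\d{1,2}\s?-\s?\d{1,2}\s"
        r"(?:JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)\b",
        re.IGNORECASE,
    ),
    # e.g. "2.5 PCT", "3.75%"
    "COMMISSION": re.compile(r"\b\d+(?:\.\d+)?\s?(?:PCT|%)", re.IGNORECASE),
}


def match_entities(text):
    """Return (label, matched_text, start, end) tuples sorted by start offset."""
    found = []
    for label, pattern in RULES.items():
        for m in pattern.finditer(text):
            found.append((label, m.group(0), m.start(), m.end()))
    return sorted(found, key=lambda entity: entity[2])


# Made-up sample line in the style of a cargo mail.
sample = "CARGO: 25,000 MT WHEAT, LAYCAN 10-15 MAR, COMM 2.5 PCT"
entities = match_entities(sample)
for label, span_text, start, end in entities:
    print(label, repr(span_text), start, end)
```

Entities with fairly regular surface forms, such as quantities, date ranges, and percentages, lend themselves to patterns of this kind, whereas cargo names and port names vary more freely and are the sort of entity a learned sequence model would be expected to handle.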