MULTI-TASK LEARNING SPEECH RECOGNITION MODEL OF CIVIL AVIATION RADIOTELEPHONY COMMUNICATION BASED ON CONFORMER MODEL
-
Abstract
General-purpose speech recognition models cannot be applied effectively to the acoustic modeling of civil aviation radiotelephony communication, because this speech has domain-specific characteristics in pronunciation, diction, and communication mode. To address this problem, this paper proposes an end-to-end multi-task learning speech recognition model for civil aviation radiotelephony communication based on the Conformer model. By introducing convolution modules into the Transformer model, the Conformer model strengthens the capture of local information while retaining the Transformer's ability to model global, long-distance contextual dependencies. The proposed model also combines connectionist temporal classification (CTC) with an attention-based encoder-decoder (AED) model for multi-task learning to further improve performance. The experimental results show that the proposed method accounts for both global and local information in acoustic modeling. On the land-air communication dataset, the character error rate (CER) and sentence error rate (SER) are reduced to 1.98% and 2.89%, respectively.
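To make the quantities named above concrete, the following is a minimal sketch, not the authors' implementation, of how CER and SER are commonly computed and how a CTC loss and an attention loss are commonly interpolated in multi-task training. The function names and the default `ctc_weight=0.3` are assumptions chosen for illustration; the abstract does not state the weighting that was used.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (characters or tokens)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(refs, hyps):
    """Character error rate: total edits divided by total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_chars = sum(len(r) for r in refs)
    return total_edits / total_chars


def ser(refs, hyps):
    """Sentence error rate: fraction of sentences with at least one error."""
    wrong = sum(1 for r, h in zip(refs, hyps) if r != h)
    return wrong / len(refs)


def joint_loss(ctc_loss, attention_loss, ctc_weight=0.3):
    """Weighted sum of the two task losses.

    ctc_weight is a hypothetical hyperparameter in [0, 1]; the value 0.3
    is an assumption for illustration, not a figure from the paper.
    """
    return ctc_weight * ctc_loss + (1.0 - ctc_weight) * attention_loss
```

Under this formulation, a single encoder feeds both the CTC branch and the attention decoder, and the weight controls how strongly the CTC alignment objective regularizes the attention objective.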
-