Abstract:
This paper addresses the main problems in text classification of electronic court files. We propose a multi-dimensional semantic representation method for court case files to obtain more accurate and comprehensive text feature information. A text classifier based on the Gaussian-kernel kernel extreme learning machine (KELM) was used to obtain the globally optimal solution while greatly improving training efficiency. The sequence optimization model KOS-ELM, based on recursive least squares (RLS), was used to iteratively update the model parameters from new samples. Together, these solutions enabled the classification model to learn online and reduced its dependence on the initial samples. Comparative experiments showed that the accuracy of the Gaussian kernel-based KELM classification model was 2.66 and 4.43 percentage points higher than that of the BP network model and LSSVM, respectively, while its training time was only 1/6 and 1/10 of theirs. When the multi-dimensional semantic representation method was used to provide input for the model, the accuracy was 8.84 and 2.33 percentage points higher than with the text vector and word vector representation methods, respectively. The RLS-based sequence optimization model KOS-ELM was used to iteratively optimize a weak classifier. After 20 iterations with four different step sizes, the classification accuracy improved significantly.
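To make the Gaussian kernel-based KELM classifier concrete, the following is a minimal sketch of the standard closed-form KELM solution, in which the output weights are obtained by solving one regularized linear system rather than by iterative training. It is not the implementation evaluated in this paper: the class name `KELM`, the helper `gaussian_kernel`, the default values of the regularization parameter `C` and kernel width `gamma`, and the toy data are all assumptions chosen for illustration, and the RLS-based online update of KOS-ELM is not covered.

```python
import numpy as np


def gaussian_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix: K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))  # clip tiny negatives from round-off


class KELM:
    """Illustrative kernel extreme learning machine (hypothetical helper class)."""

    def __init__(self, C=100.0, gamma=1.0):
        self.C = C          # regularization parameter (assumed value)
        self.gamma = gamma  # Gaussian kernel width (assumed value)

    def fit(self, X, y):
        self.X_train = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # One-hot target matrix T, one column per class.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        omega = gaussian_kernel(self.X_train, self.X_train, self.gamma)
        n = omega.shape[0]
        # Closed-form output weights: solve (Omega + I / C) beta = T.
        self.beta = np.linalg.solve(omega + np.eye(n) / self.C, T)
        return self

    def predict(self, X):
        K = gaussian_kernel(np.asarray(X, dtype=float), self.X_train, self.gamma)
        # The class with the largest output score wins.
        return self.classes_[np.argmax(K @ self.beta, axis=1)]


# Toy usage: two well-separated clusters standing in for document feature vectors.
X_toy = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
y_toy = np.array([0, 0, 0, 1, 1, 1])
model = KELM().fit(X_toy, y_toy)
pred = model.predict(np.array([[0.2, 0.2], [5.5, 5.5]]))
```

Because training reduces to a single linear solve, there is no iterative search that could stop at a local optimum, which is consistent with the training-efficiency claim above; the cost grows with the number of training samples, since the kernel matrix is N by N.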