Abstract:
The electrocardiogram (ECG) is the most common and effective tool for investigating arrhythmia, and automatic arrhythmia diagnosis can be framed as a multi-label classification problem. The vision transformer (ViT) model performs well on classification problems. However, when it is applied directly to ECG classification, segmenting the signal destroys the shape features within it and lowers model accuracy. To address this, a multi-label arrhythmia classification algorithm based on a GoogleNet-ViT model is proposed. Instead of segmenting the ECG signal directly, the algorithm uses a pre-trained GoogleNet to extract features, uses a single Transformer encoder to model the global relationships among those features, and feeds the result to a fully connected layer to perform multi-label classification. The algorithm was tested on 20,409 clinical ECG recordings. It achieved an average F1 score of 0.8623 and an average accuracy of 97.68%, and the proportion of cases whose diagnostic labels were entirely correct was 83.14%. The proposed algorithm shows clear advantages over both the ViT model and conventional CNNs.
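
The three reported metrics can be illustrated with a minimal sketch. It rests on assumptions the abstract does not confirm: the average F1 is taken as the macro average over labels, the average accuracy as per-label accuracy averaged over labels, and the "completely correct" proportion as the exact-match ratio (every label of a recording predicted correctly). The function names `per_label_f1` and `multilabel_metrics` are hypothetical, chosen for illustration only.

```python
from typing import Dict, Sequence

Labels = Sequence[Sequence[int]]  # one row per recording, one 0/1 entry per label


def per_label_f1(y_true: Labels, y_pred: Labels, label: int) -> float:
    """F1 score for a single label column (0.0 if the label never occurs)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t[label] == 1 and p[label] == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t[label] == 0 and p[label] == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t[label] == 1 and p[label] == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def multilabel_metrics(y_true: Labels, y_pred: Labels) -> Dict[str, float]:
    """Macro F1, mean per-label accuracy, and exact-match ratio (assumed definitions)."""
    n_samples = len(y_true)
    n_labels = len(y_true[0])

    # Assumption: "average F1" is the unweighted mean of per-label F1 scores.
    macro_f1 = sum(per_label_f1(y_true, y_pred, k) for k in range(n_labels)) / n_labels

    # Assumption: "average accuracy" is per-label accuracy averaged over labels.
    label_acc = sum(
        sum(1 for t, p in zip(y_true, y_pred) if t[k] == p[k]) / n_samples
        for k in range(n_labels)
    ) / n_labels

    # Assumption: "completely correct" means every label of a recording matches.
    exact_match = sum(
        1 for t, p in zip(y_true, y_pred) if list(t) == list(p)
    ) / n_samples

    return {"macro_f1": macro_f1, "label_accuracy": label_acc, "exact_match": exact_match}
```

Under these definitions the exact-match ratio is the strictest of the three, since one wrong label makes the whole recording count as incorrect. That is consistent with it being the lowest of the reported figures.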