Abstract:
Malicious code variants share many intrinsic relations and similarities, and samples from similar malware families adopt the same or similar block label nomenclature. Existing grayscale-image visualizations of malicious code cannot fully capture malicious attack information. This paper proposes a malicious code classification method based on block reorganization and dual-channel visualization. The method computes the distribution of block labels for each family of samples, identifies the target labels, and reorganizes the block data of each malicious code sample accordingly. It then visualizes the reorganized sample as a square-matrix BR color image, applies Gaussian kernel principal component analysis to reduce the image features, and feeds the reduced features into a variety of machine learning classifiers for training and classification. Experimental results on a standard data set show that the classification accuracy can reach 97.00% and remains stable. The method is more effective than other malicious code detection algorithms.
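As a rough illustration of the first two stages described in the abstract (label statistics, block reorganization, and dual-channel imaging), the following minimal sketch may help. It rests on several assumptions that the abstract does not confirm: a sample is modeled as a list of `(label, bytes)` blocks, reorganization keeps only target-label blocks in target order, "BR" is read as a blue and a red channel, and consecutive byte pairs are mapped to one `(B, R)` pixel. All function names are hypothetical; the kernel PCA and classifier stages are omitted.

```python
import math
from collections import Counter


def label_distribution(samples):
    """Fraction of a family's samples that contain each block label.

    `samples` is a list of samples; each sample is a list of (label, bytes)
    blocks (an assumed representation, not taken from the paper).
    """
    counts = Counter()
    for blocks in samples:
        counts.update({label for label, _ in blocks})
    return {label: n / len(samples) for label, n in counts.items()}


def reorganize(blocks, target_labels):
    """Concatenate the data of target-label blocks, in target-label order.

    Assumption: blocks with non-target labels are dropped.
    """
    out = bytearray()
    for target in target_labels:
        for label, data in blocks:
            if label == target:
                out += data
    return bytes(out)


def to_br_image(data):
    """Map a byte stream to a square matrix of (B, R) pixels, zero-padded.

    Assumption: each consecutive byte pair forms one two-channel pixel.
    """
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of byte pairs
    pixels = [(data[i], data[i + 1]) for i in range(0, len(data), 2)]
    if not pixels:
        return []
    side = math.isqrt(len(pixels) - 1) + 1  # ceil(sqrt(number of pixels))
    pixels += [(0, 0)] * (side * side - len(pixels))
    return [pixels[r * side:(r + 1) * side] for r in range(side)]
```

For example, with blocks labeled `.rsrc`, `.text`, and `.data` and target labels `[".text", ".data"]`, `reorganize` would return only the `.text` and `.data` bytes, and `to_br_image` would pack them into the smallest square matrix that holds all pixel pairs.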