Abstract:
Efficient detection of malicious URLs remains an open problem: existing blacklist-based detection methods suffer from poor timeliness and adaptability, while methods based on traditional machine learning suffer from low efficiency and accuracy. This paper takes into account both the semantic and the temporal characteristics of URLs and proposes a hybrid neural network model (CBI_AT). URLs are processed at the character level and the word level simultaneously, so that the semantic and temporal features of URL strings are captured effectively. A multi-group attention mechanism is introduced to extract the correlations and dependencies within URL data. Experimental results show that the hybrid neural network model detects malicious URLs efficiently, achieving an accuracy of 99.86% and an F1 score of 99.85%.
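To illustrate what processing a URL at the character level and the word level at the same time can look like, the following is a minimal sketch of the input preparation step only. It is not the CBI_AT implementation: the function names (`char_tokens`, `word_tokens`, `build_vocab`, `encode`), the delimiter rule used for word splitting, and the reserved padding and unknown-token ids are all assumptions made for illustration.

```python
import re


def char_tokens(url):
    """Character-level view: every character of the URL is one token."""
    return list(url)


def word_tokens(url):
    """Word-level view: split on runs of non-alphanumeric characters.

    The delimiter rule is an assumption; the paper may segment differently.
    """
    return [t for t in re.split(r"[^A-Za-z0-9]+", url) if t]


def build_vocab(token_lists):
    """Map each distinct token to an integer id.

    Ids 0 and 1 are reserved here for padding and unknown tokens (assumed).
    """
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab) + 2)
    return vocab


def encode(tokens, vocab, max_len, pad_id=0, unk_id=1):
    """Convert tokens to a fixed-length id sequence (truncate, then pad)."""
    ids = [vocab.get(tok, unk_id) for tok in tokens][:max_len]
    return ids + [pad_id] * (max_len - len(ids))


url = "http://example.com/login?id=1"
char_vocab = build_vocab([char_tokens(url)])
word_vocab = build_vocab([word_tokens(url)])
char_ids = encode(char_tokens(url), char_vocab, max_len=40)
word_ids = encode(word_tokens(url), word_vocab, max_len=8)
```

In a model of this kind, the two id sequences would typically be fed to separate embedding layers, so that the character branch and the word branch each contribute their own features before any attention step is applied.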