Packet Representation Learning for Traffic Classification

Xuying Meng, Yequan Wang, Runxin Ma, Haitong Luo, Xiang Li, Yujun Zhang

August 2022

The architecture of PacRep

Abstract

With the rapid development of information technology, providing high-quality network services places increasing demands on network analysis. Because all data on the Internet are encapsulated and transferred in network packets, packets are widely used for network traffic analysis tasks ranging from application identification to intrusion detection. Since the choice of features and how they are represented can greatly affect the performance of downstream tasks, it is critical to learn high-quality packet representations. However, existing packet-level works overlook packet representations and instead pursue good performance by analyzing each classification task independently. In the real world, although a packet may have different class labels for different tasks, the representation learned from one task can also help capture the packet's complex patterns in other tasks, a potential that existing works fail to exploit. Taking advantage of this potential, we propose a novel framework for packet representation learning across various traffic classification tasks. We learn packet representations that preserve both the semantic and byte patterns of each packet, and we use a contrastive loss with a sample selector to optimize the learned representations so that similar packets are closer in the latent semantic space. In addition, the representations are jointly optimized using the class labels of multiple tasks, with losses on the reconstructed representations and on the class probabilities. Evaluations demonstrate that the packet representations learned by our framework outperform state-of-the-art baseline methods by a wide margin on a broad range of popular downstream classification tasks, in both closed-world and open-world scenarios.

Type

Conference paper

Publication

In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

Packet Representation PacRep

Yequan Wang

Researcher, Team Lead

My research interests include large models, embodied intelligence, and natural language processing.