This repository contains the implementation of LiM (Less is More), a lightweight network traffic classification approach using NetMatrix representation and an XGBoost classifier. LiM is based on the research paper:
"Less is More: Simplifying Network Traffic Classification Leveraging RFCs"
Nimesha Wickramasinghe, Arash Shaghaghi, Elena Ferrari, Sanjay Jha
Published at WWW Companion '25
Read it on ACM DL (https://doi.org/10.1145/3701716.3715492) | Read it on arXiv
Encrypted traffic classification is essential for network security, monitoring, and management. However, deep-learning-based methods often add unnecessary complexity and are resource-intensive. LiM provides a lightweight, RFC-compliant tabular representation (NetMatrix) and achieves high classification accuracy at significantly lower computational cost than deep-learning models such as ET-BERT and YaTC.
📁 LiM-Network-Traffic-Classification
│── requirements.txt # Required dependencies
│── cstnet-tls1.3_5_packets.csv # Pre-processed NetMatrix representation of the CSTNET-TLS1.3 dataset (10 classes)
│── pcap_to_netmatrix.py # Script to convert custom PCAP files to NetMatrix representation
│── xgboost_classifier.py # XGBoost classifier for network traffic classification
│── README.md # Project documentation
Install the required dependencies using:
pip install -r requirements.txt
To use a custom dataset, follow these steps:
- Set the path to your PCAP file directory in pcap_to_netmatrix.py.
- Run the script: python pcap_to_netmatrix.py
- The script processes the packets and writes the NetMatrix representation to a CSV file.
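To give a feel for what a tabular, RFC-based representation can look like, here is a minimal sketch that decodes the fixed IPv4 header (RFC 791) of the first few packets of a flow into one flat row. It is not the code in pcap_to_netmatrix.py: the function names, the column naming scheme, the choice of fields, the omission of addresses, and the zero padding are all assumptions made for illustration. The default of five packets is only suggested by the name of the bundled CSV file.

```python
import struct

# Field names follow RFC 791. Which fields NetMatrix actually keeps is an
# assumption of this sketch, not something taken from the repository.
IPV4_FIELDS = (
    "version", "ihl", "dscp", "ecn", "total_length", "identification",
    "flags", "fragment_offset", "ttl", "protocol", "header_checksum",
)


def parse_ipv4_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte IPv4 header into named integer fields."""
    if len(raw) < 20:
        raise ValueError("an IPv4 header needs at least 20 bytes")
    (ver_ihl, tos, total_length, identification, flags_frag,
     ttl, protocol, checksum, _src, _dst) = struct.unpack(
        "!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,
        "ihl": ver_ihl & 0x0F,
        "dscp": tos >> 2,
        "ecn": tos & 0x03,
        "total_length": total_length,
        "identification": identification,
        "flags": flags_frag >> 13,
        "fragment_offset": flags_frag & 0x1FFF,
        "ttl": ttl,
        "protocol": protocol,
        "header_checksum": checksum,
        # Source and destination addresses are left out in this sketch.
    }


def flow_to_row(packets, n_packets=5):
    """Flatten the first n_packets headers of a flow into one table row.

    Flows shorter than n_packets are padded with zeros (an assumption)
    so that every row has the same number of columns.
    """
    row = {}
    for i in range(n_packets):
        fields = parse_ipv4_header(packets[i]) if i < len(packets) else {}
        for name in IPV4_FIELDS:
            row[f"p{i}_{name}"] = fields.get(name, 0)
    return row


# A hand-written IPv4 header: version 4, IHL 5, total length 115,
# DF flag set, TTL 64, protocol 17 (UDP).
SAMPLE_HEADER = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")

if __name__ == "__main__":
    row = flow_to_row([SAMPLE_HEADER])
    print(row["p0_ttl"], row["p0_protocol"], len(row))
```

Rows built this way can be written out with csv.DictWriter, one row per flow plus a label column, which is the general shape a tree-based classifier expects.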
To perform network traffic classification using the pre-processed NetMatrix representation, execute:
python xgboost_classifier.py
The script will train and evaluate the XGBoost model and display metrics such as accuracy, precision, recall, and F1-score.
| Model | Accuracy | Recall | Precision | F1 Score |
|---|---|---|---|---|
| LiM (Ours) | 0.942 | 0.942 | 0.943 | 0.942 |
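For readers who want to reproduce these four numbers from their own predictions, the sketch below computes accuracy and support-weighted precision, recall, and F1 in plain Python. The averaging scheme is an assumption: the README does not say which one xgboost_classifier.py uses. Weighted averaging is at least consistent with the table, because support-weighted recall is always equal to accuracy. The function name weighted_metrics is hypothetical.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1.

    Support weighting is an assumption of this sketch; the repository
    may use a different averaging scheme.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    n = len(y_true)
    support = Counter(y_true)      # true samples per class
    predicted = Counter(y_pred)    # predictions per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = true_pos[label]
        p = tp / predicted[label] if predicted[label] else 0.0
        r = tp / count
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = count / n         # share of samples in this class
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    return {
        "accuracy": sum(true_pos.values()) / n,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    y_true = ["a", "a", "b", "b", "b", "c"]
    y_pred = ["a", "b", "b", "b", "c", "c"]
    print(weighted_metrics(y_true, y_pred))
```

If scikit-learn is installed, precision_recall_fscore_support with average="weighted" computes the same three averaged scores.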
- Expand evaluation to other datasets beyond CSTNET-TLS1.3.
- Extend classification to new network traffic protocols.
- Improve feature selection and representation methods for better performance.
Feel free to fork, contribute, and open issues for improvements! For major changes, please open an issue first to discuss your ideas.
If you find this work useful, please consider citing our paper:
@inproceedings{wickramasinghe2025lim,
author = {Wickramasinghe, Nimesha and Shaghaghi, Arash and Ferrari, Elena and Jha, Sanjay},
title = {Less is More: Simplifying Network Traffic Classification Leveraging RFCs},
year = {2025},
isbn = {9798400713316},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3701716.3715492},
doi = {10.1145/3701716.3715492},
booktitle = {Companion Proceedings of the ACM on Web Conference 2025},
pages = {1398--1401},
numpages = {4},
keywords = {encrypted traffic classification, lim, netmatrix, rfc-compliance},
location = {Sydney NSW, Australia},
series = {WWW '25}
}
This project is licensed under the MIT License.
For questions or suggestions, contact:
- Nimesha Wickramasinghe - [email protected]
- Arash Shaghaghi - [email protected]
Happy coding! 🚀