Wasp provides an architecture for efficiently labeling network packets in scenarios where inspecting every packet is impractical due to performance constraints.

This research builds on Yisroel Mirsky's Kitsune paper and its corresponding repository.
The following is an overview of the project structure:
Wasp/
├── ANN # KitNET implementation
├── Attacks # Benchmark datasets and results
├── Kitsune # Slightly modified version of the original Kitsune
├── ResearchTools # Architecture implementations and supporting scripts
└── Resources # Miscellaneous resources
The datasets used in our research are available on Kaggle. Since the datasets have different structures, a pre-processing step is required to standardize the files for the simulation:
1️⃣ Run `sanitizer.py` to standardize the true-labels `.csv` file. In most cases, the labels are located at column index 1.
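For orientation, here is a minimal sketch of what this step performs, assuming pandas; the file paths, the `label` header, and the `sanitize` helper are illustrative, not the actual contents of `sanitizer.py`:

```python
# Hypothetical sketch of the sanitization step; the real sanitizer.py may differ.
import sys

import pandas as pd


def sanitize(in_path: str, out_path: str, label_col: int = 1) -> None:
    """Write the ground-truth labels into a standardized single-column .csv."""
    df = pd.read_csv(in_path)
    labels = df.iloc[:, label_col]  # labels are usually at column index 1
    labels.to_csv(out_path, index=False, header=["label"])


if __name__ == "__main__":
    sanitize(sys.argv[1], sys.argv[2])
```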
To visualize the distribution of the ground-truth labels, run `plotter.py` and provide the `.csv` file generated in the previous step. Make sure to exclude the packets used for neural-network training.
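A minimal sketch of such a plot, assuming matplotlib and the single-column file produced above; the file name and the `TRAIN_PACKETS` cutoff are placeholders, and the actual `plotter.py` may differ:

```python
# Hypothetical sketch of the label-distribution plot; the real plotter.py may differ.
import matplotlib.pyplot as plt
import pandas as pd

TRAIN_PACKETS = 50_000  # assumed number of packets reserved for training; adjust to your run

labels = pd.read_csv("labels.csv")["label"]
labels = labels.iloc[TRAIN_PACKETS:]  # exclude the packets used for training

labels.value_counts().plot(kind="bar")
plt.xlabel("label")
plt.ylabel("packet count")
plt.title("Ground-truth label distribution")
plt.show()
```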
Run `simulation.py` to launch the simulation interface: a menu with multiple options will appear. Follow the steps below in the exact order to replicate our results; each step corresponds to a menu option that must be selected.
1️⃣ KitNET
Executes the vanilla KitNET model to generate an array of predictions. You'll be prompted to provide the path to a `.pcap` file for analysis and a `.csv` file containing the ground-truth labels; the labels file ensures that the number of generated predictions matches the number of pre-computed labels.
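For reference, this is roughly how the original Kitsune repository's example drives the model over a capture; Wasp's modified copy may expose a slightly different interface, and the path and grace-period values below are illustrative:

```python
# Sketch based on the original Kitsune repo's example usage; Wasp's modified
# copy under Kitsune/ may differ in its exact interface.
import numpy as np

from Kitsune import Kitsune

pcap_path = "traffic.pcap"  # hypothetical input capture
packet_limit = np.inf       # process the whole capture
maxAE = 10                  # maximum size of each autoencoder in the ensemble
FMgrace = 5000              # packets used to learn the feature mapping
ADgrace = 50000             # packets used to train the anomaly detector

K = Kitsune(pcap_path, packet_limit, maxAE, FMgrace, ADgrace)
rmses = []
while True:
    rmse = K.proc_next_packet()  # returns -1 once the capture is exhausted
    if rmse == -1:
        break
    rmses.append(rmse)

np.savetxt("predictions.csv", rmses)  # one anomaly score per packet
```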
2️⃣ Architecture Benchmarks
Runs benchmarks for the two architectures: Naive Sampling and Wasp Detection. You'll need to provide the `.csv` file with ground-truth labels and the predictions file generated by KitNET in the previous step.

Each system is evaluated across multiple sampling rates. For every rate, 300 experiments are conducted to account for the probabilistic nature of the process, gather sufficient data, reduce sample variance, and ensure statistical reliability; the results are then averaged. After completion, multiple graphs are generated to visualize the benchmark results.
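As an illustrative sketch of the shape of that loop, the snippet below repeats a hypothetical recall measurement under naive random sampling and averages it per rate; the actual metrics, samplers, and rates live in `simulation.py` and will differ:

```python
# Illustrative benchmark loop: per sampling rate, repeat the probabilistic
# experiment 300 times and average the metric. Metric and sampler are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
N_TRIALS = 300
SAMPLING_RATES = [0.01, 0.05, 0.1, 0.25, 0.5]  # placeholder rates


def naive_sampling_recall(labels, preds, rate, threshold=1.0):
    """Recall on malicious packets when only a random fraction is inspected."""
    sampled = rng.random(len(labels)) < rate  # inspect each packet with probability `rate`
    hits = (preds >= threshold) & (labels == 1) & sampled
    positives = (labels == 1).sum()
    return hits.sum() / positives if positives else 0.0


labels = np.loadtxt("labels.csv", skiprows=1)  # from sanitizer.py (binary labels assumed)
preds = np.loadtxt("predictions.csv")          # from the KitNET step

for rate in SAMPLING_RATES:
    scores = [naive_sampling_recall(labels, preds, rate) for _ in range(N_TRIALS)]
    print(f"rate={rate:.2f}  mean recall={np.mean(scores):.4f}  std={np.std(scores):.4f}")
```

Averaging 300 independent trials shrinks the standard error of each estimate by a factor of √300 ≈ 17 relative to a single run, which is why the per-rate means are stable despite the randomness of sampling.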