nsg-ethz/ixp-traffic-dataset

IXP Traffic Dataset

Dataset accompanying:

Five Blind Men and the Internet: Towards an Understanding of Internet Traffic
Ege Cem Kirci, Ayush Mishra, Laurent Vanbever — NINeS 2026

Repository Contents

  • data/profiles/: traffic profiles in Parquet format, named <index>_traffic_profile.parquet
  • data/metadata.csv: mapping table that links each profile index to its IXP identifier(s)

Parquet Schema (data/profiles/*.parquet)

Each profile file contains the following columns:

  • src_id (int): source/profile ID. This matches the <index> in the filename <index>_traffic_profile.parquet.
  • human_time (string, format %Y-%m-%d %H:%M:%S): timestamp at 5-minute resolution (UTC-based canonical timeline).
  • data_in (int): inbound traffic rate in bits per second (bps), using decimal/SI scaling.
    • Gbps = data_in / 1e9
    • Tbps = data_in / 1e12
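A minimal sketch of reading one profile and applying the unit conversions listed above. The helper names (`bps_to_gbps`, `bps_to_tbps`, `parse_human_time`, `load_profile`) and the added `time` and `gbps` columns are illustrative and not part of the dataset. Treating `human_time` as UTC is an assumption based on the "UTC-based" note in the schema. `load_profile` assumes pandas with a Parquet engine such as pyarrow is installed.

```python
from datetime import datetime, timezone

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # format of the human_time column


def bps_to_gbps(bps):
    """Convert bits per second to Gbps (SI scaling: 1 Gbps = 1e9 bps)."""
    return bps / 1e9


def bps_to_tbps(bps):
    """Convert bits per second to Tbps (SI scaling: 1 Tbps = 1e12 bps)."""
    return bps / 1e12


def parse_human_time(value):
    """Parse a human_time string; the UTC timezone is an assumption."""
    return datetime.strptime(value, TIME_FORMAT).replace(tzinfo=timezone.utc)


def load_profile(path):
    """Read one profile file and add parsed-time and Gbps columns."""
    import pandas as pd  # imported lazily so the helpers above work without pandas

    df = pd.read_parquet(path)
    df["time"] = pd.to_datetime(df["human_time"], format=TIME_FORMAT, utc=True)
    df["gbps"] = bps_to_gbps(df["data_in"])  # also works element-wise on a Series
    return df
```

For example, `load_profile("data/profiles/170_traffic_profile.parquet")` would return the profile with the two extra columns, ready for resampling or plotting.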

How to Map Profiles to IXPs

Each Parquet profile file is keyed by <index> in its filename.

Example:

  • data/profiles/170_traffic_profile.parquet corresponds to index = 170 in data/metadata.csv
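The filename-to-index rule can be sketched as a small helper. `profile_index` is a hypothetical name, and the regular expression assumes every profile follows the `<index>_traffic_profile.parquet` pattern with a purely numeric index.

```python
import re
from pathlib import Path

# Matches "<index>_traffic_profile.parquet" and captures the numeric index.
_PROFILE_NAME = re.compile(r"^(\d+)_traffic_profile\.parquet$")


def profile_index(path):
    """Return the integer index encoded in a profile file name."""
    match = _PROFILE_NAME.match(Path(path).name)
    if match is None:
        raise ValueError(f"not a profile file name: {path!r}")
    return int(match.group(1))


idx = profile_index("data/profiles/170_traffic_profile.parquet")
```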

metadata.csv columns:

  • index: profile identifier used in file names
  • ixp_id: IXP identifier from PeeringDB
  • siblings: sibling IXP IDs (serialized list)

Interpretation Rules

  • If siblings is empty ([]), the profile maps one-to-one to a single IXP: ixp_id.
  • If siblings is non-empty, the profile is organization-level aggregated traffic across multiple IXPs.
  • In that aggregated case, the full IXP set is: union({ixp_id}, siblings)
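The rules above can be sketched as follows. This assumes that `siblings` is serialized as a Python/JSON-style list literal such as `[21, 22]`; the README only says "serialized list", so adjust the parsing if the actual format differs. The function names and the sample rows are made up for illustration and do not come from `data/metadata.csv`.

```python
import ast
import csv
import io


def ixp_set(row):
    """Return the full set of IXP IDs for one metadata row."""
    # Assumption: siblings is a list literal such as "[]" or "[21, 22]".
    siblings = ast.literal_eval(row["siblings"] or "[]")
    return {int(row["ixp_id"])} | {int(s) for s in siblings}


def load_metadata(handle):
    """Map each profile index to its set of IXP IDs."""
    return {int(row["index"]): ixp_set(row) for row in csv.DictReader(handle)}


# Made-up rows: index 1 is a single IXP, index 2 is an aggregated profile.
sample = io.StringIO('index,ixp_id,siblings\n1,10,[]\n2,20,"[21, 22]"\n')
mapping = load_metadata(sample)
```

With the real file, pass `open("data/metadata.csv", newline="")` instead of the in-memory sample. A profile is organization-level aggregated exactly when its set has more than one element.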

PeeringDB Join

This repository intentionally keeps metadata minimal.

To obtain IXP names, locations, and other attributes, join ixp_id (and any IDs in siblings) against your PeeringDB data source.
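One way to sketch this join, assuming the PeeringDB data is available locally as a list of records that carry an `id` and a `name` field. `attach_ixp_names` is a hypothetical helper, and the records below are placeholders rather than real PeeringDB entries.

```python
def attach_ixp_names(ixp_ids, peeringdb_records):
    """Look up a name for each IXP ID; IDs without a record map to None."""
    by_id = {rec["id"]: rec for rec in peeringdb_records}
    return {
        ixp_id: by_id[ixp_id]["name"] if ixp_id in by_id else None
        for ixp_id in sorted(ixp_ids)
    }


# Placeholder records standing in for your PeeringDB data source.
records = [
    {"id": 20, "name": "Example IX A"},
    {"id": 21, "name": "Example IX B"},
]
names = attach_ixp_names({20, 21, 22}, records)
```

Returning `None` for unmatched IDs keeps gaps visible, which helps when a local PeeringDB snapshot is older or newer than the dataset.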

Contact

Questions: ekirci@ethz.ch
