
iops-profiler

An IPython magic extension for measuring I/O operations per second (IOPS) in your Jupyter notebook code.

[Demo: iops-profiler in action]


Why?

While working with large astronomy datasets at LINCC-Frameworks, we kept hitting mysterious performance bottlenecks. Traditional profilers showed CPU time was fine, but our pipelines were crawling. Turns out, we were thrashing the disk with small random I/O operations.

We needed a way to see I/O patterns directly in our Jupyter notebooks without context-switching to system monitoring tools. That's why we built iops-profiler.

Installation

You can install iops-profiler directly from PyPI:

pip install iops-profiler

Or install from source:

git clone https://github.com/lincc-frameworks/iops-profiler.git
cd iops-profiler
pip install -e .

Use Cases

  • Debugging slow data pipelines: Identify whether you're reading too many small files vs. fewer large ones (see the sketch after this list)
  • Optimizing database exports: See real-time I/O patterns when writing CSVs or Parquet files
  • Benchmarking storage backends: Compare local disk vs. network filesystems
  • Understanding pandas/polars operations: Profile what's actually hitting disk during DataFrame operations
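
For instance, the small-files-vs-one-large-file question from the first use case can be answered by profiling each access pattern in its own cell and comparing the reported IOPS and throughput. A minimal sketch of the small-file side (the chunk_*.dat file names are illustrative):

%%iops
# Many small files: expect a high operation count with few bytes per operation
for i in range(100):
    with open(f'chunk_{i}.dat', 'wb') as f:
        f.write(b'x' * 1024)  # 1 KB each

Run the equivalent single-file workload in a second %%iops cell and compare the two result tables.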

Documentation

📚 Read the full documentation on Read the Docs

The documentation includes:

  • Comprehensive user guide
  • Example notebooks with hands-on tutorials
  • Platform-specific notes (Linux, macOS, Windows)
  • Troubleshooting guide
  • API reference

Quick Start

Load the extension in your Jupyter notebook:

%load_ext iops_profiler

Line Magic Mode

Use the %iops line magic to profile a single line of code:

%iops open('test.txt', 'w').write('Hello World' * 1000)

Output:

Execution Time: 0.002s
Read Ops: 0 | Write Ops: 3 | Total: 3
Bytes Read: 0 | Bytes Written: 11000
IOPS: 1500.0 | Throughput: 5.50 MB/s
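
Reads are reported the same way. For example, reading back the file written above (the exact numbers will vary from system to system):

%iops open('test.txt', 'r').read()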

Cell Magic Mode

Use the %%iops cell magic to profile I/O operations in an entire cell:

%%iops
# Your code here
with open('test.txt', 'w') as f:
    f.write('Hello World' * 1000)

The extension will display a table showing:

  • Execution time
  • Read/write operation counts
  • Bytes read/written
  • IOPS (operations per second)
  • Throughput (bytes per second)
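
As a slightly more realistic sketch tied to the pandas/polars use case above (this assumes pandas is installed; frame.csv is just an illustrative file name):

%%iops
import pandas as pd

# A CSV export typically issues many buffered write calls
df = pd.DataFrame({'value': range(100_000)})
df.to_csv('frame.csv', index=False)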

Example Notebooks

Check out our example notebooks for hands-on learning. The notebook files are in the docs/notebooks/ directory.

Histogram Visualization

Use the --histogram flag to visualize I/O operation distributions (available for strace and fs_usage measurement modes):

Example - Analyzing I/O patterns with multiple file sizes:

%%iops --histogram
import tempfile
import os
import shutil

# Create test files with different sizes
test_dir = tempfile.mkdtemp()

try:
    # Write files of various sizes to create diverse write operations
    # Small writes (few KB)
    for i in range(5):
        with open(os.path.join(test_dir, f'small_{i}.txt'), 'w') as f:
            f.write('x' * 1024)  # 1 KB
    
    # Medium writes (tens of KB)
    for i in range(3):
        with open(os.path.join(test_dir, f'medium_{i}.txt'), 'w') as f:
            f.write('y' * (10 * 1024))  # 10 KB
    
    # Large writes (hundreds of KB)
    for i in range(2):
        with open(os.path.join(test_dir, f'large_{i}.txt'), 'w') as f:
            f.write('z' * (100 * 1024))  # 100 KB
    
    # Now read back the files to create diverse read operations
    # Small reads
    for i in range(5):
        with open(os.path.join(test_dir, f'small_{i}.txt'), 'r') as f:
            _ = f.read()
    
    # Medium reads
    for i in range(3):
        with open(os.path.join(test_dir, f'medium_{i}.txt'), 'r') as f:
            _ = f.read()
    
    # Large reads
    for i in range(2):
        with open(os.path.join(test_dir, f'large_{i}.txt'), 'r') as f:
            _ = f.read()

finally:
    # Cleanup
    if os.path.exists(test_dir):
        shutil.rmtree(test_dir)

This example generates a rich distribution of I/O operations across multiple size ranges, producing histograms like:

[Histogram example output]

When enabled, two histogram charts are displayed alongside the results table:

  1. Operation Count Distribution: Shows the count of I/O operations bucketed by bytes-per-operation (log scale)
  2. Total Bytes Distribution: Shows the total bytes transferred bucketed by bytes-per-operation (log scale)

Both charts display separate lines for reads, writes, and all operations combined, making it easy to identify patterns in your code's I/O behavior.

Platform Support

  • Linux/Windows: Uses psutil for per-process I/O tracking
  • macOS: Uses fs_usage, which requires privilege elevation (you will be prompted for your password)

Requirements

  • Python 3.10+
  • IPython/Jupyter
  • psutil
  • matplotlib (for histogram visualization)
  • numpy (for histogram visualization)

Dev Guide - Getting Started

Before installing any dependencies or writing code, it's a great idea to create a virtual environment. LINCC-Frameworks engineers primarily use conda to manage virtual environments. If you have conda installed locally, you can run the following to create and activate a new environment.

conda create -n <env_name> python=3.10
conda activate <env_name>

Once you have created a new environment, you can install this project for local development using the following commands:

pip install -e '.[dev]'
pre-commit install

Notes:

  1. The pip install -e '.[dev]' command installs the package in editable mode along with all development dependencies
  2. pre-commit install initializes pre-commit for this local repository, so a set of checks runs before each local commit completes. For more information, see the Python Project Template documentation on pre-commit
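
To run the same checks manually across the whole repository (for example, before opening a pull request), you can invoke pre-commit directly:

pre-commit run --all-files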
