An IPython magic extension for measuring I/O operations per second (IOPS) in your Jupyter notebook code.
While working with large astronomy datasets at LINCC-Frameworks, we kept hitting mysterious performance bottlenecks. Traditional profilers showed CPU time was fine, but our pipelines were crawling. Turns out, we were thrashing the disk with small random I/O operations.
We needed a way to see I/O patterns directly in our Jupyter notebooks without context-switching to system monitoring tools. That's why we built iops-profiler.
You can install iops-profiler directly from PyPI:

```bash
pip install iops-profiler
```

Or install from source:

```bash
git clone https://github.com/lincc-frameworks/iops-profiler.git
cd iops-profiler
pip install -e .
```

- Debugging slow data pipelines: Identify whether you're reading too many small files vs. fewer large ones
- Optimizing database exports: See real-time I/O patterns when writing CSVs or Parquet files
- Benchmarking storage backends: Compare local disk vs. network filesystems
- Understanding pandas/polars operations: Profile what's actually hitting disk during DataFrame operations
📚 Read the full documentation on Read the Docs
The documentation includes:
- Comprehensive user guide
- Example notebooks with hands-on tutorials
- Platform-specific notes (Linux, macOS, Windows)
- Troubleshooting guide
- API reference
Load the extension in your Jupyter notebook:
```python
%load_ext iops_profiler
```

Use the %iops line magic to profile a single line of code:

```python
%iops open('test.txt', 'w').write('Hello World' * 1000)
```

Output:
```
Execution Time: 0.002s
Read Ops: 0 | Write Ops: 3 | Total: 3
Bytes Read: 0 | Bytes Written: 11000
IOPS: 1500.0 | Throughput: 5.50 MB/s
```
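The summary metrics follow directly from the raw counters. A minimal sketch of the arithmetic (the function name here is illustrative, not part of the extension's API):

```python
def summarize(read_ops, write_ops, bytes_read, bytes_written, elapsed_s):
    """Derive IOPS and throughput from raw I/O counters (illustrative only)."""
    total_ops = read_ops + write_ops
    iops = total_ops / elapsed_s
    throughput_mb_s = (bytes_read + bytes_written) / elapsed_s / 1e6
    return iops, throughput_mb_s

# Matches the sample output above: 3 write ops and 11000 bytes in 0.002 s
iops, mb_s = summarize(0, 3, 0, 11000, 0.002)
print(f"IOPS: {iops} | Throughput: {mb_s:.2f} MB/s")
# → IOPS: 1500.0 | Throughput: 5.50 MB/s
```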
Use the %%iops cell magic to profile I/O operations in an entire cell:
```python
%%iops
# Your code here
with open('test.txt', 'w') as f:
    f.write('Hello World' * 1000)
```

The extension will display a table showing:
- Execution time
- Read/write operation counts
- Bytes read/written
- IOPS (operations per second)
- Throughput (bytes per second)
Check out our example notebooks for hands-on learning:
- Basic Usage - Learn the fundamentals of line and cell magic
- Histogram Visualization - Visualize I/O operation distributions
You can also find the notebook files in the docs/notebooks/ directory.
Use the `--histogram` flag to visualize I/O operation distributions (available for the `strace` and `fs_usage` measurement modes):
Example - Analyzing I/O patterns with multiple file sizes:
```python
%%iops --histogram
import tempfile
import os
import shutil

# Create test files with different sizes
test_dir = tempfile.mkdtemp()
try:
    # Write files of various sizes to create diverse write operations
    # Small writes (few KB)
    for i in range(5):
        with open(os.path.join(test_dir, f'small_{i}.txt'), 'w') as f:
            f.write('x' * 1024)  # 1 KB
    # Medium writes (tens of KB)
    for i in range(3):
        with open(os.path.join(test_dir, f'medium_{i}.txt'), 'w') as f:
            f.write('y' * (10 * 1024))  # 10 KB
    # Large writes (hundreds of KB)
    for i in range(2):
        with open(os.path.join(test_dir, f'large_{i}.txt'), 'w') as f:
            f.write('z' * (100 * 1024))  # 100 KB

    # Now read back the files to create diverse read operations
    # Small reads
    for i in range(5):
        with open(os.path.join(test_dir, f'small_{i}.txt'), 'r') as f:
            _ = f.read()
    # Medium reads
    for i in range(3):
        with open(os.path.join(test_dir, f'medium_{i}.txt'), 'r') as f:
            _ = f.read()
    # Large reads
    for i in range(2):
        with open(os.path.join(test_dir, f'large_{i}.txt'), 'r') as f:
            _ = f.read()
finally:
    # Cleanup
    if os.path.exists(test_dir):
        shutil.rmtree(test_dir)
```

This example generates a rich distribution of I/O operations across multiple size ranges, producing well-populated histograms.
When enabled, two histogram charts are displayed alongside the results table:
- Operation Count Distribution: Shows the count of I/O operations bucketed by bytes-per-operation (log scale)
- Total Bytes Distribution: Shows the total bytes transferred bucketed by bytes-per-operation (log scale)
Both charts display separate lines for reads, writes, and all operations combined, making it easy to identify patterns in your code's I/O behavior.
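The log-scale bucketing described above can be sketched with numpy (this illustrates the general idea only; the extension's own plotting code may differ, and the operation sizes below are made up):

```python
import numpy as np

# Hypothetical per-operation sizes in bytes (e.g., collected from strace output)
op_sizes = np.array([512, 1024, 1024, 4096, 65536, 131072])

# Logarithmic bucket edges: powers of two from 256 B up to 1 MiB
edges = np.logspace(np.log2(256), np.log2(2**20), num=13, base=2)

# Operation count per bucket, and total bytes transferred per bucket
counts, _ = np.histogram(op_sizes, bins=edges)
total_bytes, _ = np.histogram(op_sizes, bins=edges, weights=op_sizes)

print(int(counts.sum()))       # 6 operations fall inside the bucket range
print(int(total_bytes.sum()))  # 203264 bytes transferred in total
```

Plotting `counts` per bucket gives the first chart, and `total_bytes` per bucket the second; computing the same pair separately for reads and for writes yields the per-line breakdown.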
- Linux/Windows: Uses `psutil` for per-process I/O tracking
- macOS: Uses `fs_usage` with privilege elevation (requires a password prompt)
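On Linux and Windows, counters of this kind are exposed by psutil's `Process.io_counters()`. A minimal sketch of delta-based measurement around a block of code (the extension's actual bookkeeping may differ):

```python
import os
import tempfile

import psutil

proc = psutil.Process()  # the current Python process

before = proc.io_counters()  # not available on macOS (raises AttributeError)

# Perform some I/O to measure: write 4 KB to a throwaway temp file
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, 'w') as f:
        f.write('x' * 4096)
finally:
    os.remove(path)

after = proc.io_counters()

# Profiler-style metrics are deltas over the measured region
write_ops = after.write_count - before.write_count
bytes_written = after.write_bytes - before.write_bytes
```

Note that on Linux, `write_bytes` counts bytes sent to the storage layer, so buffered writes that are still in the page cache may not appear in it immediately, while `write_count` (write syscalls) increments right away.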
- Python 3.10+
- IPython/Jupyter
- psutil
- matplotlib (for histogram visualization)
- numpy (for histogram visualization)
Before installing any dependencies or writing code, it's a great idea to create a
virtual environment. LINCC-Frameworks engineers primarily use conda to manage virtual
environments. If you have conda installed locally, you can run the following to
create and activate a new environment.
```bash
conda create -n <env_name> python=3.10
conda activate <env_name>
```

Once you have created a new environment, you can install this project for local development using the following commands:

```bash
pip install -e '.[dev]'
pre-commit install
```

Notes:
- The install command will install the package in editable mode with all development dependencies
- `pre-commit install` will initialize pre-commit for this local repository, so that a set of checks will run prior to completing a local commit. For more information, see the Python Project Template documentation on pre-commit

