Features | Bioframe | polars-bio | PyRanges | Pybedtools | PyGenomics | GenomicRanges |
---|---|---|---|---|---|---|
overlap | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
nearest | ✅ | ✅ | ✅ | | | |
cluster | ✅ | | | | | |
merge | ✅ | | | | | |
complement | ✅ | | | | | |
select/slice | ✅ | | | | | |
coverage | ✅ | | | | | |
expand | ✅ | | | | | |
sort | ✅ | | | | | |

I/O | Bioframe | polars-bio | PyRanges | Pybedtools | PyGenomics | GenomicRanges |
---|---|---|---|---|---|---|
Pandas DataFrame | ✅ | ✅ | ✅ | | | |
Polars DataFrame | ✅ | | | | | |
Polars LazyFrame | ✅ | | | | | |
Native readers | ✅ | | | | | |

I/O (file formats) | Bioframe | polars-bio | PyRanges | Pybedtools | PyGenomics | GenomicRanges |
---|---|---|---|---|---|---|
BED | ✅ | ✅ | | | | |
BAM | | | | | | |
VCF | | | | | | |

PyRanges is multithreaded, but with two caveats:

- It requires the Ray backend plus the `nb_cpu` parameter, documented as:

  > `nb_cpu`: int, default 1.
  > How many cpus to use. Can at most use 1 per chromosome or chromosome/strand tuple.
  > Will only lead to speedups on large datasets.

- For `nearest`, it returns no empty rows when there is no overlap (we follow Bioframe, where nulls are returned).
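
The two caveats above can be illustrated with a small pure-Python sketch. It does not use the PyRanges, Bioframe, or polars-bio APIs: the names `gap`, `nearest_on_chrom`, `nearest`, and `keep_unmatched` are hypothetical, a standard-library thread pool stands in for the Ray backend, and the nearest search is a brute-force scan. The sketch assumes half-open `(start, end)` coordinates and that work is split per chromosome, so the number of useful workers is capped by the number of chromosomes.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def gap(a, b):
    """Distance between two half-open (start, end) intervals; 0 if they overlap or touch."""
    return max(0, max(a[0], b[0]) - min(a[1], b[1]))


def nearest_on_chrom(queries, targets, keep_unmatched):
    """Pair each query with its nearest target on a single chromosome (brute force)."""
    rows = []
    for q in queries:
        # default=None covers the case where this chromosome has no targets at all.
        best = min(targets, key=lambda t: gap(q, t), default=None)
        if best is not None or keep_unmatched:
            rows.append((q, best))
    return rows


def nearest(queries, targets, nb_cpu=1, keep_unmatched=True):
    """Hypothetical helper: run nearest_on_chrom per chromosome in a thread pool.

    keep_unmatched=True keeps unmatched queries with a None partner (null-style result);
    keep_unmatched=False drops them from the output.
    """
    q_by_chrom, t_by_chrom = defaultdict(list), defaultdict(list)
    for chrom, start, end in queries:
        q_by_chrom[chrom].append((start, end))
    for chrom, start, end in targets:
        t_by_chrom[chrom].append((start, end))

    chroms = sorted(q_by_chrom)
    # One task per chromosome, so extra workers beyond len(chroms) would sit idle.
    workers = max(1, min(nb_cpu, len(chroms)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(
            lambda c: nearest_on_chrom(q_by_chrom[c], t_by_chrom.get(c, []), keep_unmatched),
            chroms,
        )
        return workers, dict(zip(chroms, parts))


queries = [("chr1", 10, 20), ("chr2", 5, 8)]
targets = [("chr1", 30, 40), ("chr1", 100, 120)]  # nothing on chr2

workers, with_nulls = nearest(queries, targets, nb_cpu=8, keep_unmatched=True)
_, dropped = nearest(queries, targets, nb_cpu=8, keep_unmatched=False)
print(workers)      # 2: capped by the two chromosomes, not by nb_cpu=8
print(with_nulls)   # chr2 query is kept with a None partner
print(dropped)      # chr2 query is absent from the result rows
```

With `keep_unmatched=True` the output keeps one row per input interval, which makes it easy to join results back to the input; dropping unmatched rows gives a smaller result but loses that one-to-one correspondence.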