polars_bio

Features

Genomic ranges operations

Features Bioframe polars-bio PyRanges Pybedtools PyGenomics GenomicRanges
overlap
nearest
cluster
merge
complement
select/slice
coverage
expand
sort

Input/Output

I/O Bioframe polars-bio PyRanges Pybedtools PyGenomics GenomicRanges
Pandas DataFrame
Polars DataFrame
Polars LazyFrame
Native readers

Genomic file format

I/O Bioframe polars-bio PyRanges Pybedtools PyGenomics GenomicRanges
BED
BAM
VCF

Performance

[Figures: img.png (three images)]

Remarks

PyRanges is multithreaded, but:

  • it requires the Ray backend and the nb_cpu parameter, which PyRanges documents as:
  nb_cpu: int, default 1

            How many cpus to use. Can at most use 1 per chromosome or chromosome/strand tuple.
            Will only lead to speedups on large datasets.
  • for nearest, it omits rows (rather than returning empty ones) when there is no overlap; we follow Bioframe, where nulls are returned
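
The difference in nearest semantics noted above can be illustrated with a small pure-Python sketch. This is not the implementation of PyRanges, Bioframe, or polars-bio. The `nearest` function, its `keep_unmatched` flag, and the reading of "no overlap" as "no candidate interval on the same chromosome" are all assumptions made for illustration.

```python
# Hypothetical sketch: two ways a "nearest" operation can treat query
# intervals that have no candidate on the same chromosome.
Interval = tuple[str, int, int]  # (chrom, start, end), half-open


def distance(a: Interval, b: Interval) -> int:
    """Gap between two intervals on the same chromosome; 0 if they overlap."""
    return max(0, max(a[1], b[1]) - min(a[2], b[2]))


def nearest(queries, targets, keep_unmatched: bool):
    """Pair each query with its closest target on the same chromosome.

    keep_unmatched=True  -> unmatched queries are kept with None (null-style).
    keep_unmatched=False -> unmatched queries are dropped from the result.
    """
    out = []
    for q in queries:
        candidates = [t for t in targets if t[0] == q[0]]
        if candidates:
            best = min(candidates, key=lambda t: distance(q, t))
            out.append((q, best))
        elif keep_unmatched:
            out.append((q, None))
    return out


queries = [("chr1", 100, 200), ("chr2", 50, 60)]
targets = [("chr1", 300, 400), ("chr1", 150, 180)]

with_nulls = nearest(queries, targets, keep_unmatched=True)   # 2 rows, one null
dropped = nearest(queries, targets, keep_unmatched=False)     # 1 row
```

With null-style output the result always has one row per query interval, which keeps it aligned with the input; the dropping variant returns fewer rows whenever a query has no match.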