Releases: KamilSJaron/smudgeplot
Skylight
What's Changed
- Many bug fixes in the interface and documentation (@samebdon and @KamilSJaron)
- Add JSON report and PNG default plot format to output by @jgrg in #215
- Fix bug in estimation of errors (#227 by @CormacKinsella in #231)
- New output: the .sma file is an annotated smudge file that lists the annotated smudge for each pair of coverages, which can be used for...
- New feature: extract k-mers using the annotated smudge file and the original k-mer database (by @thegenemyers)
Full Changelog: v0.4.0...v0.5.1
Arched
- local aggregation for subtracting hetmers containing sequencing errors
- fishnet algorithm for 1n coverage fit
Known problems:
- quantification of higher-ploidy smudges does not reflect their higher coverage variance well, which leads to k-mer pairs "overspilling" into other smudges;
- Sometimes, for species with a strong tetraploid signal and a weak diploid signal, the fit is better when the 4n smudge is assigned as 2n. This is a recognisable pattern and a clear limitation of the optimisation criteria we use at the moment. We are working on fixing this.
- We removed explicit ploidy predictions; users need to make a joint interpretation of smudgeplot, genomescope and prior knowledge about the species to determine the most sensible ploidy. An explicit ploidy level is our preferred solution to both problems listed above, so hopefully this functionality will return.
Oriel
The big changes are:
- the search for k-mer pairs will be within both canonical and non-canonical k-mer sets (Gene demonstrated it makes a difference)
- the tool will support the FastK k-mer counter only
- the backend by Gene is parallelized and massively faster
- the intermediate file will be a flat file with the 2d histogram with cov1, cov2, freq columns (as opposed to a list of coverages of pairs, cov1 cov2)
- at least for now WE LOSE the ability to extract sequences of the k-mers in the pair; this functionality will hopefully be restored at some point, together with functionality to assess the quality of assembly
- the smudge detection algorithm is under revision and a new version will be released on 18th of October 2024
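The change of intermediate format from a list of coverage pairs to a 2d histogram can be illustrated with a minimal Python sketch. The function names, the ordering of cov1 and cov2 (smaller first) and the tab delimiter are assumptions made for illustration only; they are not taken from the FastK-based backend.

```python
from collections import Counter


def pairs_to_histogram(pairs):
    """Collapse (cov1, cov2) pairs into sorted (cov1, cov2, freq) rows.

    Assumption: the smaller coverage of each pair is stored first.
    """
    counts = Counter((min(a, b), max(a, b)) for a, b in pairs)
    return [(c1, c2, n) for (c1, c2), n in sorted(counts.items())]


def format_rows(rows):
    """Render rows as tab-separated text (the delimiter is an assumption)."""
    return "\n".join(f"{c1}\t{c2}\t{n}" for c1, c2, n in rows)


# Four k-mer pairs given as a plain list of coverages ...
pairs = [(20, 21), (21, 20), (20, 40), (20, 21)]
# ... become two histogram rows with a frequency column.
rows = pairs_to_histogram(pairs)
text = format_rows(rows)
```

The histogram grows with the number of distinct coverage combinations rather than with the number of k-mer pairs, which is why a flat file of this kind stays small, but it also no longer records which k-mers formed each pair.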
Double-hung with curtains
- fixed issue with L and U being too close to each other. Smudgeplot simply plots the data that are fed to it, even when the resulting plot is wild and harder to interpret (this is aligned with the "honest data reporting" philosophy of smudgeplot, but might cause more confusion; perhaps we will add some more warnings in the future)
Double-Hung
Adds a new feature, smudgeplot extract, for extracting k-mer pairs from a rectangle of the smudgeplot.
Great thanks to @zhenzhenyang-psu !
Documentation: https://github.com/KamilSJaron/smudgeplot/wiki/smudgeplot-extract
For usage see smudgeplot.py extract -h
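As a rough illustration of what selecting a rectangle means, the sketch below filters k-mer pairs by their position on the two smudgeplot axes: total pair coverage (cov1 + cov2) and normalized minor coverage (min(cov1, cov2) / (cov1 + cov2)). The function names, parameter names and input layout are hypothetical and do not mirror the actual options of the extract task; consult the help output mentioned above for those.

```python
def in_rectangle(cov1, cov2, min_total, max_total, min_ratio, max_ratio):
    """Return True if a pair falls inside the rectangle (bounds inclusive)."""
    total = cov1 + cov2
    if total == 0:
        return False  # guard against division by zero on empty coverages
    ratio = min(cov1, cov2) / total  # normalized minor coverage, at most 0.5
    return min_total <= total <= max_total and min_ratio <= ratio <= max_ratio


def extract_pairs(records, min_total, max_total, min_ratio, max_ratio):
    """Keep the k-mer pairs whose coverages fall inside the rectangle.

    records: iterable of (kmer1, kmer2, cov1, cov2) tuples (hypothetical layout).
    """
    return [
        (k1, k2)
        for k1, k2, c1, c2 in records
        if in_rectangle(c1, c2, min_total, max_total, min_ratio, max_ratio)
    ]


records = [
    ("AAC", "AAT", 20, 21),  # total 41, ratio ~0.49: inside
    ("CCG", "CCT", 20, 40),  # total 60: above the rectangle
    ("GGA", "GGC", 5, 6),    # total 11: below the rectangle
]
selected = extract_pairs(records, 30, 50, 0.45, 0.5)
```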
Still Single Hung
This release is just to get it to Zenodo, otherwise identical to v0.2.2.
Single-Hung
This version:
- updates the annotation algorithm for higher ploidy levels based on simulations
- encourages the use of our KMC, which speeds up the search for k-mer pairs a lot
- adds a new warning for mismatching estimates of 1n coverage by different approaches (was silent before)
- changes the terminology: instead of "estimated ploidy" we say "proposed ploidy", as there is no model explicitly tested
Single-Hung
- fixed logging (now it's directed to err stream)
- an estimate of ploidy based on all smudges of that ploidy (instead of the ploidy of the brightest smudge)
- smudgeplot interface uses the .py suffix to meet community standards
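The difference between taking the ploidy of the brightest smudge and aggregating all smudges of the same ploidy can be shown with a small sketch. It assumes that a smudge label such as AAB encodes its ploidy as the label length and that smudge sizes are given as k-mer pair counts; the function names and the example numbers are made up for illustration and are not smudgeplot's own code.

```python
from collections import defaultdict


def ploidy_of_brightest(smudge_sizes):
    """Ploidy of the single largest smudge (the older approach)."""
    brightest = max(smudge_sizes, key=smudge_sizes.get)
    return len(brightest)


def ploidy_by_aggregation(smudge_sizes):
    """Ploidy whose smudges hold the most k-mer pairs in total."""
    totals = defaultdict(int)
    for label, size in smudge_sizes.items():
        totals[len(label)] += size  # assumption: label length equals ploidy
    return max(totals, key=totals.get)


# Made-up sizes: AB is the brightest single smudge, but the two
# tetraploid smudges together contain more k-mer pairs.
sizes = {"AB": 400, "AAAB": 350, "AABB": 250}
brightest = ploidy_of_brightest(sizes)
aggregated = ploidy_by_aggregation(sizes)
```

With these numbers the brightest smudge suggests a diploid while the aggregate suggests a tetraploid, which is the kind of case where the two approaches disagree.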
Single-Hung
This version is using the same computational backend as the previous version (0.1.3), but it's wrapped in a single interface that is expected to be kept in future:
smudgeplot <task> <arguments>
Further adjustments:
- improved algorithm for placing smudges on the plot for higher ploidy levels than 4
- alternative algorithm for extracting kmers available (--middle in hetkmers task)
I had no idea how to name the release, so I have decided to name individual versions of smudgeplot after types of windows, so let's start simple: Single-Hung it is. Hopefully, it will be a good enough name to carry all the smudges.