Thanks for developing the tool; I found it quite easy to use. I have a few conceptual questions about the output metrics of sylph profile that I hope you can clarify.
From my understanding, containment ANI is calculated as the number of k-mers of a reference genome contained in a given sample (i.e., 95% containment ANI means 95% of the k-mers of the reference genome are contained in the sample).
Is "sequence_abundance" calculated as the number of reads assigned to each genome divided by the total number of classified reads? I noticed that sequence_abundance sums to 100 for most of my samples, but I would expect some reads not to map to any reference genome.
Please correct me if I'm wrong about any of these concepts, and thank you for your help!
It isn't as simple as 95% k-mers contained -> 95% ANI. Your general idea is right but there is a formula we use; see the paper's first figure.
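To illustrate why the mapping is not one-to-one, here is a minimal sketch of the standard k-mer containment-to-ANI conversion, ANI ≈ C^(1/k). This is an assumption for illustration, not sylph's exact computation: sylph additionally adjusts containment for sequencing coverage, which this sketch ignores. The function names are hypothetical, and k = 31 is assumed as the k-mer size.

```python
def containment_to_ani(containment: float, k: int = 31) -> float:
    """Naive conversion: ANI ~= containment ** (1 / k).

    Assumes each k-mer survives only if all k of its bases match,
    with mismatches independent across positions. No coverage
    correction is applied here.
    """
    if not 0.0 <= containment <= 1.0:
        raise ValueError("containment must be in [0, 1]")
    return containment ** (1.0 / k)


def ani_to_containment(ani: float, k: int = 31) -> float:
    """Inverse of the naive conversion: containment ~= ani ** k."""
    if not 0.0 <= ani <= 1.0:
        raise ValueError("ani must be in [0, 1]")
    return ani ** k


# 95% of k-mers contained corresponds to an ANI well above 95%.
print(f"{containment_to_ani(0.95):.4f}")  # 0.9983

# Conversely, 95% ANI corresponds to only about a fifth of k-mers contained.
print(f"{ani_to_containment(0.95):.4f}")  # 0.2039
```

Under this naive model, 95% k-mer containment implies roughly 99.8% ANI, and 95% ANI implies only about 20% of the reference k-mers are found in the sample.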
That's the right idea, but sylph does not classify reads. You're right, there will be reads that cannot be classified. Try using the -u option in this case, which will scale sequence_abundance by the number of "unknown" reads. See the comment in issue #6, "[FEATURE REQUESTS] - post here for suggestions/feature requests".
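As a rough picture of what that scaling means, the sketch below rescales abundances that sum to 100 among detected genomes so that they instead sum to the detected share of the sample. It is only an illustration under stated assumptions: the function name, the genome names, and the `detected_fraction` input are hypothetical, and how sylph actually estimates the unknown portion is not shown here.

```python
def rescale_by_detected_fraction(
    abundances: dict[str, float], detected_fraction: float
) -> dict[str, float]:
    """Scale per-genome abundances (in percent) by the detected fraction.

    abundances: percent abundance per genome, summing to ~100 among
        detected genomes only.
    detected_fraction: assumed estimate (0..1) of how much of the sample
        is explained by genomes in the database; the remainder is "unknown".
    """
    if not 0.0 <= detected_fraction <= 1.0:
        raise ValueError("detected_fraction must be in [0, 1]")
    return {name: pct * detected_fraction for name, pct in abundances.items()}


# Hypothetical sample: two detected genomes, 80% of the sample explained.
unscaled = {"genome_A": 60.0, "genome_B": 40.0}
scaled = rescale_by_detected_fraction(unscaled, detected_fraction=0.8)

# The scaled values no longer sum to 100; the shortfall is the unknown part.
unknown_percent = 100.0 - sum(scaled.values())
print(scaled, unknown_percent)
```

With scaling applied, the abundances no longer sum to 100, and the shortfall corresponds to the part of the sample not represented in the reference database.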