Questions about output metrics #26

Closed
lingrongjin opened this issue Nov 5, 2024 · 2 comments

Comments

@lingrongjin

Hi Jim,

Thanks for developing the tool; I found it quite easy to use. I have a few conceptual questions about the output metrics of sylph profile that I hope you can clarify.

  1. From my understanding, containment ANI is calculated from the number of k-mers of a reference genome that are contained in a given sample (i.e., 95% containment ANI means 95% of the reference genome's k-mers are contained in the sample).
  2. Is "sequence_abundance" calculated as the number of reads assigned to each genome divided by the total number of classified reads? I noticed that sequence_abundance sums to 100 for most of my samples, but I would expect some reads not to map to any reference genome.

Please correct me if I'm wrong in any of these concepts and thank you for your help!

@bluenote-1577
Owner

Hi @lingrongjin

  1. It isn't as simple as 95% k-mers contained -> 95% ANI. Your general idea is right but there is a formula we use; see the paper's first figure.

  2. That's the right idea, but note that sylph does not classify reads. You're right that there will be reads that cannot be classified. Try using the -u option in this case, which will scale sequence_abundance to account for the "unknown" reads. See [FEATURE REQUESTS] - post here for suggestions/feature requests #6 (comment)
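
For readers following along, here is a rough sketch of both points in Python. It is illustrative only and is not sylph's code. The first function uses the widely used relation ANI ≈ C^(1/k) between a k-mer containment index C and ANI, assuming k = 31; sylph's actual formula (see the paper) may include further corrections, for example for low coverage. The second function assumes that the -u scaling amounts to multiplying each abundance by the fraction of the sample that is not "unknown"; the function names and the example numbers are made up.

```python
def containment_to_ani(containment: float, k: int = 31) -> float:
    """Approximate ANI from a k-mer containment index via ANI ~ C^(1/k).

    Assumption: no coverage correction is applied, and k = 31 by default.
    """
    if not 0.0 <= containment <= 1.0:
        raise ValueError("containment must be between 0 and 1")
    return containment ** (1.0 / k)


def rescale_by_unknown(abundances, unknown_fraction):
    """Scale percentages that sum to 100 so they reflect an unknown fraction.

    Assumption: the scaling is a plain multiplication by (1 - unknown_fraction).
    """
    if not 0.0 <= unknown_fraction <= 1.0:
        raise ValueError("unknown_fraction must be between 0 and 1")
    known = 1.0 - unknown_fraction
    return {genome: value * known for genome, value in abundances.items()}


# 95% of reference k-mers contained corresponds to an ANI well above 95%.
ani = containment_to_ani(0.95)
print(f"containment 0.95 -> approximate ANI {ani:.4f}")

# Hypothetical sample: two detected genomes, 20% of the sample unknown.
scaled = rescale_by_unknown({"genome_A": 60.0, "genome_B": 40.0}, 0.20)
print(scaled, "sum =", sum(scaled.values()))
```

Under these assumptions, a containment of 95% maps to an ANI close to 99.8% rather than 95%, and the rescaled abundances no longer sum to 100, with the remainder attributed to sequence not represented in the database.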

Thanks

@lingrongjin
Author

Thanks for the explanation!
