The manual should explain what is encoded in the headers of output reads. #24

Open
cerebis opened this issue Jun 17, 2023 · 1 comment

cerebis commented Jun 17, 2023

pub fn get_ext_id(&self, read_id: &String) -> String {
    let genomes = self
        .read_map
        .get(read_id)
        .map(|genome_set| {
            genome_set
                .iter()
                .map(AsRef::as_ref)
                .collect::<Vec<&str>>()
                .join(",")
        })
        .unwrap_or_default();
    read_id.clone() + " |" + &genomes
}

The header appears to contain references to mapped genomes, but I do not see this explained in the documentation.
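For readers who want to use these headers now, here is a minimal sketch of how they could be split back apart. It relies only on what the snippet above shows: the read ID, then `" |"`, then a comma-separated list of genome names (empty when the read is not in the map). The function names `parse_ext_id` and `read_maps_to` are hypothetical and not part of the project, and the sketch assumes that genome names contain neither `,` nor `" |"`.

```rust
/// Split a header of the form "<read_id> |<genome1>,<genome2>,..." into the
/// read ID and the list of genome names.
///
/// Hypothetical helper, not part of the project. Assumes genome names
/// contain neither ',' nor " |".
pub fn parse_ext_id(header: &str) -> (&str, Vec<&str>) {
    // The genome list is appended last, so split on the final " |".
    match header.rsplit_once(" |") {
        Some((read_id, genomes)) => {
            // An unmapped read yields an empty list after the separator.
            let genomes: Vec<&str> = genomes
                .split(',')
                .filter(|g| !g.is_empty())
                .collect();
            (read_id, genomes)
        }
        // No separator at all: treat the whole header as the read ID.
        None => (header, Vec::new()),
    }
}

/// Return true if the header lists `genome` among its mapped genomes.
/// Hypothetical helper for pulling out reads for a single genome.
pub fn read_maps_to(header: &str, genome: &str) -> bool {
    parse_ext_id(header).1.iter().any(|g| *g == genome)
}
```

With these helpers, `parse_ext_id("read1 |genomeA,genomeB")` gives the read ID `read1` and the two genome names, and `read_maps_to` can serve as a filter predicate when collecting reads for one genome.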

Is it possible to supply a ranking/score for each mapped genome? This is perhaps an ignorant question, as I do not know what is returned from searching the gSBT.

Dreycey (Owner) commented Jun 17, 2023

@cerebis - This is a good point! I will work on adding documentation that describes the sequence headers. The goal of adding these was to allow extracting the sequences that correspond to a particular species for downstream assembly, if desired. That said, a score corresponding to the degree of overlap could be assigned next to each name. I think this would be a great feature to add, so I'm going to add it as an agenda item.
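To make the scoring idea concrete, here is one way a scored header might look. This is only a sketch under stated assumptions, not the project's design: the `name:score` layout, the two-decimal formatting, the descending sort, and the function name `ext_id_with_scores` are all hypothetical, and the scores are assumed to be overlap fractions supplied by the caller.

```rust
/// Build a header of the form "<read_id> |<genome>:<score>,..." with the
/// best-scoring genome first.
///
/// Hypothetical format and helper, not the project's API. `hits` pairs each
/// genome name with an assumed overlap score.
pub fn ext_id_with_scores(read_id: &str, hits: &[(&str, f64)]) -> String {
    // Copy the hits so the caller's slice is left untouched, then rank them
    // from highest to lowest score.
    let mut ranked: Vec<(&str, f64)> = hits.to_vec();
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1));

    let genomes = ranked
        .iter()
        .map(|(name, score)| format!("{name}:{score:.2}"))
        .collect::<Vec<String>>()
        .join(",");

    // Keep the same " |" separator that the existing headers use.
    format!("{read_id} |{genomes}")
}
```

Keeping the existing `" |"` separator and comma-separated list would let tools that only read the genome names keep working by stripping the `:score` suffix from each entry.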
