Create a visualization of mutations relative to the protein sequence, its domains, mutation hotspots, etc. #66

Open
malachig opened this issue Mar 4, 2015 · 8 comments

Comments

@malachig
Member

malachig commented Mar 4, 2015

We have mutations that are point sites, and others that fall more generally within certain domains or other large features of the protein, so it would be useful to show them relative to the protein sequence. For example, MD Anderson creates these figures for each gene, and they are customized when you click on each mutation:
[Screenshot: MD Anderson example figure, 2015-03-04]

@jmcmichael
Contributor

At some point I can definitely create a d3 visualization that can generate these in the client; I just need to know what data I will be getting to generate the figure and how it corresponds to the MD Anderson example.

@malachig
Member Author

malachig commented Mar 5, 2015

Yeah, I think we should keep this on the wishlist for now. There is actually a somewhat complex mix of data being represented here. From left to right:

  • A chromosome ideogram image. The position of each cytoband (giemsa staining data) is indicated.
  • The chromosome coordinate of the current gene is indicated on this image.
  • Next the mRNA molecule is depicted. Somehow a single reference transcript has been chosen for each gene. Each exon is enumerated and the relative size of each is depicted. The start and end position of the protein open reading frame are indicated.
  • Next the protein sequence is depicted with the relative position of protein domains (these appear to have been manually curated/simplified). These are labelled with amino acid coordinates.
  • Finally the position of the current variant is indicated on the protein sequence.
  • One could also imagine a display that showed the landscape of all mutations on this figure in a lolliplot style.
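
As a rough illustration of the last three bullets, here is a minimal Python sketch of the data a lolliplot would need: domains with amino acid coordinates, and a count of variants per protein position. The `Domain` class, the function names, and the toy coordinates are all hypothetical and only meant to make the data shape concrete; they do not come from any existing tool.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """A protein domain labelled with amino acid coordinates (hypothetical model)."""
    name: str
    start: int  # first amino acid, 1-based, inclusive
    end: int    # last amino acid, 1-based, inclusive


def lollipop_heights(positions):
    """Count variants per amino acid position, sorted by position."""
    return sorted(Counter(positions).items())


def domain_at(position, domains):
    """Return the name of the first domain containing the position, or None."""
    for domain in domains:
        if domain.start <= position <= domain.end:
            return domain.name
    return None


# Toy data, made up for illustration only.
domains = [Domain("domain_A", 10, 50), Domain("domain_B", 80, 120)]
variant_positions = [12, 12, 45, 90, 90, 90, 130]

for position, count in lollipop_heights(variant_positions):
    print(position, count, domain_at(position, domains))
```

Each (position, count) pair would become one lollipop, and the domain lookup gives the region to highlight when a variant is selected.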

The raw material for most of the data needed could be obtained from the Ensembl API (or similar). The variant landscape could come from COSMIC, the ICGC data portal, TCGA, etc.
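
A hedged sketch of what pulling the gene and transcript structure might look like, assuming the public Ensembl REST service at `rest.ensembl.org` and its `lookup/symbol` endpoint with `expand=1`. The helper names are hypothetical, the response is returned unparsed because its exact fields should be checked against the Ensembl documentation, and no request is made unless `fetch_gene` is called.

```python
import json
import urllib.parse
import urllib.request

ENSEMBL_REST = "https://rest.ensembl.org"  # assumed public REST server


def build_lookup_url(symbol, species="homo_sapiens"):
    """Build a lookup-by-symbol URL; expand=1 asks for transcripts and exons."""
    path = "/lookup/symbol/{}/{}".format(
        urllib.parse.quote(species), urllib.parse.quote(symbol)
    )
    query = urllib.parse.urlencode({"content-type": "application/json", "expand": 1})
    return ENSEMBL_REST + path + "?" + query


def fetch_gene(symbol, species="homo_sapiens", timeout=30):
    """Fetch the gene record as a dict (requires network access)."""
    with urllib.request.urlopen(build_lookup_url(symbol, species), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the URL is safe offline; fetch_gene("BRAF") would perform the request.
print(build_lookup_url("BRAF"))
```

Protein domain annotations and the variant landscape would need separate sources, as noted above.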

Zach has been working on a similar visualization concept in R using R modules (Biomart?) to get the raw data and ggplot2 to create the image. If we decide to tackle this at some point we should probably get together a group and brainstorm on some of the specifics.

@jmcmichael jmcmichael added this to the Someday Maybe milestone Jun 1, 2015
@kkrysiak
Contributor

The GenVisR (Zach's R modules) protein visualization continues to evolve. https://github.com/griffithlab/GenVisR

Alternatively, in many cases we have all the fields to generate a lolliplot via cBioPortal (http://www.cbioportal.org/mutation_mapper.jsp). The source code is also available on GitHub (https://github.com/cBioPortal/cbioportal) under the GNU Affero General Public License v3.

@jmcmichael jmcmichael removed this from the Someday Maybe milestone Aug 17, 2016
@jmcmichael jmcmichael added the v2 label Mar 15, 2018
@jmcmichael jmcmichael removed the v2 label Nov 8, 2018
@kkrysiak
Contributor

From discussions with members of the Melbourne Genomics Health Alliance, this is a priority, partly because we don't have a nice way to dig through the variants that we do have. Aliases only work on the quick search, browse, and advanced search pages. On a gene page, unless you are searching for an exact variant, you can't get there.

@kkrysiak
Contributor

Everyone loves the COSMIC way of filtering: there is a histogram of events. That sort of thing would work for us as well. I love the lolliplot idea, but I don't know about implementation. Shahil's visual (a lolliplot that replaced variant occurrences with evidence count) was awesome for this sort of thing. But lolliplots don't really work that well for gene-level events.
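
To make the histogram idea concrete, here is a minimal Python sketch that bins evidence counts along the protein. It assumes each variant is a (position, evidence_count) pair and that gene-level events have no position (`None`), so they are tallied separately rather than forced onto the axis. The function name, the bin size, and the toy data are hypothetical.

```python
def evidence_histogram(variants, protein_length, bin_size=50):
    """Sum evidence counts into fixed-width bins along the protein.

    variants: iterable of (position, evidence_count); position is a 1-based
    amino acid coordinate, or None for a gene-level event.
    Returns (bins, gene_level_total).
    """
    n_bins = -(-protein_length // bin_size)  # ceiling division
    bins = [0] * n_bins
    gene_level_total = 0
    for position, evidence_count in variants:
        if position is None:
            gene_level_total += evidence_count
            continue
        if not 1 <= position <= protein_length:
            raise ValueError("position {} is outside the protein".format(position))
        bins[(position - 1) // bin_size] += evidence_count
    return bins, gene_level_total


# Toy data, made up for illustration only.
toy_variants = [(1, 2), (50, 1), (51, 4), (120, 3), (None, 5)]
bins, gene_level = evidence_histogram(toy_variants, protein_length=120, bin_size=50)
print(bins, gene_level)
```

The bins could drive a COSMIC-style filter on the gene page, with the gene-level total shown as a separate bar or badge.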

@malachig
Member Author

We have talked about the idea of collaborating with the ProteinPaint team at St. Jude to resolve this issue. It's not done yet, but it could be a way to resolve this...


4 participants