Overview
We're going to work with Sekar Kathiresan, who has pioneered one of the most important new techniques in genetic risk analysis. It applies a simple function, a weighted sum of risk-allele counts, across millions of variant calls to produce a single risk score for a disease area.
For example, his lab reported using the method to evaluate cardiac risk based on 6.6M variants from imputed data sets: http://www.kathiresanlab.org/our-publications/genome-wide-polygenic-scores-for-common-diseases-identify-individuals-with-risk-equivalent-to-monogenic-mutations/
The same technique has been successfully applied to 4 other major diseases: atrial fibrillation, type 2 diabetes, inflammatory bowel disease, and breast cancer.
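To make the "simple function" concrete, here is a minimal sketch of a polygenic score as a weighted sum of effect-allele dosages. The variant IDs, weights, and dosages are made up for illustration, and the function name `polygenic_score` is hypothetical. Real weights would come from a published score file, and variants missing from a person's calls are simply skipped here, which is a simplification.

```python
from typing import Mapping


def polygenic_score(dosages: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Sum weight * dosage over the variants present in both mappings."""
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)


# Hypothetical per-allele weights (illustrative values, not from any real score file).
weights = {"rs0001": 0.5, "rs0002": -0.25, "rs0003": 1.0}

# Hypothetical imputed dosages for one person: expected effect-allele count, 0.0 to 2.0.
dosages = {"rs0001": 2.0, "rs0002": 1.0, "rs0004": 0.5}

# rs0003 has no call and rs0004 has no weight, so only rs0001 and rs0002 contribute.
score = polygenic_score(dosages, weights)
print(score)
```

A raw score like this only becomes meaningful once it is compared against the distribution of scores in a reference population, which is a separate step not shown here.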
Literature
Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations
https://www.nature.com/articles/s41588-018-0183-z
Supplementary material: https://static-content.springer.com/esm/art%3A10.1038%2Fs41588-018-0183-z/MediaObjects/41588_2018_183_MOESM1_ESM.pdf
Modeling Linkage Disequilibrium Increases Accuracy of Polygenic Risk Scores
Describes the LDpred algorithm
https://www.cell.com/ajhg/fulltext/S0002-9297(15)00365-1
The Materials and Methods section shows where the data is:
https://www.cell.com/ajhg/fulltext/S0002-9297(15)00365-1#secsectitle0160
Projecting the performance of risk score from GWAS studies
Model building algorithm described
https://www.nature.com/articles/ng.2579
Common polygenic variation contributes to risk of schizophrenia and bipolar disorder
The original polygenic risk score paper
https://www.researchgate.net/publication/232772602_Common_polygenic_variation_contributes_to_risk_of_schizophrenia_and_bipolar_disorder
A comprehensive 1000 Genomes–based genome-wide association meta-analysis of coronary artery disease
The coronary artery disease GWAS that provides the summary statistics
https://www.researchgate.net/publication/281643470_A_comprehensive_1000_Genomes-based_genome-wide_association_meta-analysis_of_coronary_artery_disease
Extra data:
http://www.cardiogramplusc4d.org/data-downloads/
A worldwide survey of haplotype variation and linkage disequilibrium in the human genome
Widely cited paper from Jonathan Pritchard's lab on linkage disequilibrium across populations
https://web.stanford.edu/group/pritchardlab/publications/ConradEtAl06a.pdf
Criticism
Polygenic Risk Scores, a Biased Prediction
https://genomemedicine.biomedcentral.com/articles/10.1186/s13073-018-0610-x
Human Demographic History Impacts Genetic Risk Prediction across Diverse Populations
https://www.cell.com/ajhg/pdfExtended/S0002-9297(17)30107-6
PDF - paper plus supplemental materials
India
Our new focus on India means the MVP will be a somewhat simpler product. Instead of offering the two-sided capabilities, we'll offer a core set of reports. The initial thinking is a set of polygenic risk scores plus ancestry that come with the initial purchase and have no author. This spares us from building out two-sided market functionality (including all the complexities of data transfer, authoring tools, payment management, communication tools, validation, ratings, etc.) in favor of a much simpler product.
Implementation Thoughts
This technique is not a great fit for our current architecture. However, because we don't have to offer it as a report-author capability, we can greatly simplify our work by running the risk score calculations at impute time. We will include the polygenic risk tables in the bioinformatics repo, run the calculations in the Python container, and store the computed scores for each user in a new table. We can build a simple bespoke report for these scores using plain React and the GraphQL API.
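As a rough sketch of the impute-time step, the following computes one score per trait and writes the results to a per-user table. Everything named here is an assumption for illustration: the `polygenic_scores` table and its columns, the in-memory layout of the risk tables, and the use of SQLite as a stand-in for whatever database we actually use. Variants with no call are treated as dosage 0, which is a simplification.

```python
import sqlite3
from typing import Dict, Mapping


def compute_scores(
    dosages: Mapping[str, float],
    score_tables: Mapping[str, Mapping[str, float]],
) -> Dict[str, float]:
    """Return one weighted-sum score per trait; missing variants count as dosage 0."""
    return {
        trait: sum(w * dosages.get(variant, 0.0) for variant, w in weights.items())
        for trait, weights in score_tables.items()
    }


def store_scores(conn: sqlite3.Connection, user_id: str, scores: Mapping[str, float]) -> None:
    """Upsert the scores into a hypothetical polygenic_scores table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS polygenic_scores ("
        "user_id TEXT NOT NULL, trait TEXT NOT NULL, score REAL NOT NULL, "
        "PRIMARY KEY (user_id, trait))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO polygenic_scores (user_id, trait, score) VALUES (?, ?, ?)",
        [(user_id, trait, score) for trait, score in scores.items()],
    )
    conn.commit()


# Hypothetical risk tables (trait -> variant -> weight) and one user's imputed dosages.
score_tables = {
    "cad": {"rs0001": 0.5, "rs0002": 0.25},
    "t2d": {"rs0003": 1.0},
}
user_dosages = {"rs0001": 2.0, "rs0002": 0.0, "rs0003": 1.5}

conn = sqlite3.connect(":memory:")  # stand-in for the real database
store_scores(conn, "user-1", compute_scores(user_dosages, score_tables))

rows = conn.execute(
    "SELECT trait, score FROM polygenic_scores WHERE user_id = ? ORDER BY trait",
    ("user-1",),
).fetchall()
print(rows)
```

Keying the table on (user_id, trait) means re-running imputation for a user overwrites the old scores instead of duplicating them, and the report can fetch everything for a user with one query.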