Cross Methylome Omnibus (CMO) test

CMO is a gene-level association test that can identify significant, novel genes missed by many benchmark methods. Specifically, CMO integrates genetically regulated DNA methylation (DNAm) in enhancers, promoters, and the gene body to identify additional disease-associated genes. Please cite the following manuscript for CMO analysis:

Wu et al. A gene-level methylome-wide association analysis identifies novel Alzheimer's disease genes. Bioinformatics doi:https://doi.org/10.1093/bioinformatics/btab045

In this repo, we provide the following resources.

CMO: the software for running the CMO test
Codes: all source code for replicating the results presented in the manuscript

Updates

Version 1.0: the preliminary release
Version 2.0: the standalone release. It should take less than one hour to learn and configure the software.
Version 2.1: fixed some bugs and added more comments

Installation

Download and unpack the CMO package from GitHub. Download through this link or with the following commands:
```
wget https://github.com/ChongWuLab/CMO/releases/download/V2.1/CMO.zip
unzip CMO.zip
cd CMO
```
Download and unzip the constructed weights. Download through this link. Briefly, we generated weights by leveraging the following two resources: GeneHancer and Baselmans, B., et al. (2018).

Typical analysis and output

The CMO analysis takes pre-computed DNA methylation prediction models (included in the CMO package), enhancer-promoter interactions (included in the CMO package), and GWAS summary data to estimate the association between a gene and the trait of interest. We will use the IGAP Alzheimer's summary data (Lambert et al. 2013) as an example to illustrate how to use our methods. This example assumes you have set up the required environment and data, as illustrated in the previous section.

To help users make better use of our software, we are preparing a detailed pipeline for running CMO with the COVID19-hg GWAS meta-analyses round 5 data. We will provide this pipeline within two weeks.

Input: GWAS summary statistics

We wrote a wrapper that tries to support all publicly available GWAS summary data. If you run into problems, please rename the columns to the commonly used names (such as BETA, SE, Z, A1, A2, SNP). We also calculate the Z score vector automatically if the raw GWAS summary data do not include it.
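To make the Z score step concrete, here is a minimal sketch that appends a Z column computed as BETA divided by SE. It is not the wrapper's actual code: the function name `add_z` is hypothetical, and the sketch assumes a tab-delimited file with a header row that uses the column names BETA and SE.

```shell
#!/bin/sh
# Hypothetical helper (not part of CMO): append Z = BETA / SE to a
# tab-delimited summary statistics file that has a header row.
add_z() {
  awk 'BEGIN { FS = OFS = "\t" }
       NR == 1 {
         # Locate the BETA and SE columns by name.
         for (i = 1; i <= NF; i++) {
           if ($i == "BETA") b = i
           if ($i == "SE") s = i
         }
         if (!b || !s) {
           print "BETA or SE column not found" > "/dev/stderr"
           exit 1
         }
         print $0, "Z"
         next
       }
       # Keep only rows with a positive standard error.
       $s + 0 > 0 { print $0, $b / $s }' "$1"
}
```

For example, `add_z sumstats.txt > sumstats_with_z.txt` would write a copy with the extra column; both file names are placeholders.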

Note: The performance of CMO depends on the density of the summary-level data. We highly recommend running CMO with raw summary-level data. Pre-processing steps such as pruning or restricting to top SNPs may harm performance.

Performing the CMO

After preparing the data, we can run CMO with the following single command.

Rscript CMO.R \
--sumstats ./Example/IGAP_chr22.txt \
--out ./Example/ \
--weights_dir /gpfs/research/chongwu/Chong/CMO/ \
--chr_id 22

This should take one or two minutes, and some intermediate steps will be printed to the screen. If everything works, the results will be saved into ./Example/. Briefly, we save the results of CMO (res_CMO_CHR22.txt) and the individual MWAS results (res_MWAS_CHR22.txt).

Output: Gene-disease association

The results are stored in a user-defined output file. For illustration, we explain the meaning of each entry in the first two lines of the output.

| Col. num. | Column name | Value | Explanations |
|---|---|---|---|
| 2 | CHR | 22 | Chromosome ID |
| 1 | geneID | PNPLA3 | Feature/gene identifier, taken from gene_list file |
| 2 | ensembl | ENSG00000100344 | Ensembl ID |
| 3 | P0 | 44319619 | Gene start |
| 4 | P1 | 44360368 | Gene end |
| 5 | n_enhancer | 5 | Number of enhancers linked to the gene |
| 6-11 | CpG sites | | Number of CpG sites and SNPs used |
| 12 | CMO | 0.000122 | P value for the CMO test |
| 13 | Runtime | 1.3 | Running time for this gene (in seconds) |

Note: We only store the results for genes with external weights. Genes without external weights are omitted from the output file. We also save the results for standard MWAS. However, we only run MWAS for the CpG sites that are linked to genes.
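As a follow-up to the output description, the sketch below pulls out the genes whose CMO p-value falls below a chosen threshold. The function name `cmo_hits` is hypothetical, and the sketch assumes the result file is tab-delimited with a header row containing a column named CMO; adjust the separator if your output differs.

```shell
#!/bin/sh
# Hypothetical helper (not part of CMO): print the header plus every gene
# whose CMO p-value is below a threshold.
# Usage: cmo_hits <result file> <threshold>
cmo_hits() {
  awk -F '\t' -v thr="$2" '
    NR == 1 {
      # Find the CMO column by name instead of relying on its position.
      for (i = 1; i <= NF; i++) if ($i == "CMO") col = i
      if (!col) { print "CMO column not found" > "/dev/stderr"; exit 1 }
      print
      next
    }
    # Skip missing values, then compare numerically.
    $col != "" && $col != "NA" && $col + 0 < thr + 0' "$1"
}
```

For example, `cmo_hits ./Example/res_CMO_CHR22.txt 2.5e-6` would list genes passing that cutoff. The value 2.5e-6 is only an illustrative threshold; choose one that matches the number of genes you tested.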

Command-line parameters

CMO.R

| Flag | Usage | Default |
|---|---|---|
| --sumstats | Summary statistics (rds file and must have SNP and Z column headers) | Required |
| --out | Path to output directory | Required |
| --weights_dir | The weights directory | Required |
| --chr_id | The chromosome ID. We recommend parallelizing the computations by chromosome | Optional |
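To illustrate the per-chromosome recommendation, the sketch below prints one CMO.R command per chromosome without running anything, so the commands can be inspected before being handed to a job scheduler. The function name `cmo_commands` is hypothetical. The sketch also assumes that a single summary statistics file can be reused with each `--chr_id` value and that chromosomes 1 to 22 are the ones of interest; neither is confirmed by this README.

```shell
#!/bin/sh
# Hypothetical helper (not part of CMO): print one CMO.R command per
# chromosome (1-22, an assumption). Nothing is executed here.
# Usage: cmo_commands <sumstats> <output dir> <weights dir>
cmo_commands() {
  chr=1
  while [ "$chr" -le 22 ]; do
    printf 'Rscript CMO.R --sumstats %s --out %s --weights_dir %s --chr_id %s\n' \
      "$1" "$2" "$3" "$chr"
    chr=$((chr + 1))
  done
}
```

Piping the output to `sh` runs the chromosomes one after another. On a cluster, each printed line can instead be submitted as its own job.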

FAQ

If you have questions, please submit an issue. We will summarize commonly asked questions here.

Future updates

The current version (2.1) only supports parallelization across chromosomes. It runs smoothly on our own server and takes about ten minutes for a typical run. The next version will support parallelization within each chromosome to speed things up further.
Please send us an email ([email protected]) with any new feature requests.

License

Maintainer: Chong Wu ([email protected])

MIT
