This is a brief description of our back-end data cleaning process. The intermediate dataset generated by this workflow serves as the input to the next step in the pipeline: LD pruning.
CAD: CARDIoGRAMplusC4D 1000 Genomes-based GWAS (additive model)
Metabolite profiling was performed by NMR (Chenomx); Chenomx IDs were mapped to KEGG IDs.
./merge_gwas_assoc.sh
The P-value threshold is currently set to 10^-5.
./filter_data_pval.pl <association_file> <Pvalue_cutoff> <cad | serum | urine>
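The filtering idea can be sketched in Python as follows. This is only an illustration, not the code in filter_data_pval.pl: the tab-delimited layout, the column names `SNP` and `P`, and the inclusive comparison (`<=`) are all assumptions made for the example.

```python
import csv
import io

def filter_by_pvalue(lines, cutoff=1e-5, p_col="P"):
    """Yield rows (as dicts) whose P-value is at or below the cutoff.

    Assumes a tab-delimited table with a header row; the column name
    "P" is hypothetical and may differ in the real association files.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    for row in reader:
        try:
            p = float(row[p_col])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric P-value
        if p <= cutoff:
            yield row

# Tiny in-memory example standing in for an association file.
example = io.StringIO("SNP\tP\nrs1\t3e-7\nrs2\t0.04\nrs3\tNA\n")
kept = [row["SNP"] for row in filter_by_pvalue(example)]
print(kept)
```

With the default cutoff of 1e-5, only the first example row passes; the row with a non-numeric P-value is skipped rather than treated as an error.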
The output file from this step is formatted for direct use in the LD pruning process.
./get_intersect_snps.pl <trait_assoc_file> <metab_assoc_file> <serum | urine>
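A minimal Python sketch of the intersection idea is given here, assuming the step keeps the SNPs that appear in both the trait and the metabolite association files. It does not reproduce get_intersect_snps.pl: the `SNP` key and the record layout are hypothetical, and the real output format for LD pruning is not shown.

```python
def intersect_snps(trait_rows, metab_rows, snp_key="SNP"):
    """Return the trait rows whose SNP ID also occurs in the metabolite rows.

    Both inputs are lists of dicts; the key name "SNP" is an assumption.
    """
    metab_ids = {row[snp_key] for row in metab_rows}
    # Preserve the order of the trait file while keeping shared SNPs only.
    return [row for row in trait_rows if row[snp_key] in metab_ids]

# Hypothetical records standing in for rows of the two association files.
trait = [{"SNP": "rs1", "P": 3e-7}, {"SNP": "rs2", "P": 2e-6}]
metab = [{"SNP": "rs2", "P": 5e-6}, {"SNP": "rs9", "P": 1e-8}]
shared = intersect_snps(trait, metab)
print([row["SNP"] for row in shared])
```

Building a set of SNP IDs from one file first keeps the lookup fast even when the association files are large.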