Hi,
I have a question regarding SNP-based association results from STAARpipeline.
I performed GWAS using:
- The same genotype data (MAC ≥ 20)
- The same phenotype
- The same covariates
I ran:
- REGENIE (step 1 + step 2)
- STAARpipeline_Individual_Analysis.r from STAARpipeline
When applying the conventional genome-wide significance threshold (5 × 10⁻⁸), I observed a large discrepancy:
- REGENIE detected 32 significant variants
- STAARpipeline detected 25,848 significant variants
Most of the additional variants detected by STAARpipeline are rare variants. Although the significant variants from REGENIE are also mostly rare, STAARpipeline seems to identify many more rare signals.
Summary statistics of p-values from STAARpipeline results (column 6):
- min: 3.98e-05
- median: 4.93e-05
- max: 0.0657
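For reference, here is a minimal Python sketch of how such a summary and the count of genome-wide significant variants can be computed from the same column, so that the two can be compared directly (the count can only be non-zero if the minimum is below the threshold). The function name `summarize_pvalues` and the toy values are hypothetical; this is not part of STAARpipeline or REGENIE.

```python
import statistics

# Conventional genome-wide significance threshold used in this post.
GENOME_WIDE_THRESHOLD = 5e-8


def summarize_pvalues(pvalues, threshold=GENOME_WIDE_THRESHOLD):
    """Return min, median, max and the number of p-values below `threshold`.

    `pvalues` is any iterable of floats; None entries (missing) are skipped.
    """
    values = [p for p in pvalues if p is not None]
    if not values:
        raise ValueError("no p-values to summarize")
    return {
        "min": min(values),
        "median": statistics.median(values),
        "max": max(values),
        "n_significant": sum(p < threshold for p in values),
    }


# Hypothetical toy values, only to show the call; not real results.
toy_pvalues = [3.0e-9, 4.0e-5, 0.01, 0.5]
summary = summarize_pvalues(toy_pvalues)
print(summary)
```

Running the same function on the p-value column of each tool's output gives directly comparable numbers.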
Based on my understanding, STAARpipeline (in the individual SNP analysis) does not apply variant-level weighting in the same way as gene-based STAAR tests. Therefore, I am concerned about whether these additional rare variant signals are reliable.
My questions are:
- Is it expected that STAARpipeline detects substantially more rare variant signals than REGENIE under the same MAC filter (MAC ≥ 20)?
- Are these additional rare variant associations likely to be inflated or due to model differences?
- Would it be more appropriate to restrict interpretation to variants with MAF ≥ 0.01?
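To make the last question concrete, the sketch below counts significant variants separately for MAF < 0.01 and MAF ≥ 0.01. It assumes each result row has already been reduced to a `(maf, pvalue)` pair, where `maf` is the minor allele frequency; the function name `count_significant_by_maf` and the toy rows are hypothetical and not taken from either tool.

```python
def count_significant_by_maf(records, maf_cutoff=0.01, threshold=5e-8):
    """Count significant variants in the rare and common MAF groups.

    `records` is an iterable of (maf, pvalue) pairs. Variants with
    maf >= maf_cutoff are counted as "common", all others as "rare".
    """
    counts = {"rare": 0, "common": 0}
    for maf, pvalue in records:
        if pvalue < threshold:
            group = "common" if maf >= maf_cutoff else "rare"
            counts[group] += 1
    return counts


# Hypothetical toy rows: (MAF, p-value). Not real results.
toy_records = [
    (0.0005, 1e-9),   # rare, significant
    (0.0020, 3e-10),  # rare, significant
    (0.0500, 2e-8),   # common, significant
    (0.2000, 0.3),    # common, not significant
    (0.0008, 1e-4),   # rare, not significant
]
counts = count_significant_by_maf(toy_records)
print(counts)
```

Applying the same split to both the REGENIE and STAARpipeline results would show whether the discrepancy is confined to the MAF < 0.01 group.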
Any guidance would be greatly appreciated.
Thank you!