
Question about SNP-based p-values: STAARpipeline VS REGENIE #87

@Jun-Jun-Xu


Hi,

I have a question regarding SNP-based association results from STAARpipeline.

I performed GWAS using:

- The same genotype data (MAC ≥ 20)
- The same phenotype
- The same covariates

I ran:

- REGENIE (step 1 + step 2)
- `STAARpipeline_Individual_Analysis.r` from STAARpipeline

When applying the conventional genome-wide significance threshold (5 × 10⁻⁸), I observed a large discrepancy:

- REGENIE detected 32 significant variants
- STAARpipeline detected 25,848 significant variants

Most of the additional variants detected by STAARpipeline are rare variants. Although the significant variants from REGENIE are also mostly rare, STAARpipeline seems to identify many more rare signals.

Summary statistics of p-values from STAARpipeline results (column 6):

min: 3.98e-05
median: 4.93e-05
max: 0.0657
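
For reference, here is a minimal sketch of how such a summary and the count of variants below the threshold could be computed, selecting the column by header name rather than by position. The column name `pvalue`, the comma delimiter, and the toy rows are assumptions for illustration, not the actual STAARpipeline output format.

```python
import csv
import io
import statistics

GWS_THRESHOLD = 5e-8  # conventional genome-wide significance threshold


def summarize_pvalues(handle, column="pvalue", delimiter=","):
    """Summarize one named column of a delimited results table.

    `column` is an assumed header name; adjust it to the real output.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    # Skip empty or NA entries, convert the rest to floats.
    values = [float(row[column]) for row in reader if row[column] not in ("", "NA")]
    return {
        "n": len(values),
        "min": min(values),
        "median": statistics.median(values),
        "max": max(values),
        "n_significant": sum(v < GWS_THRESHOLD for v in values),
    }


# Toy table with made-up values, only to show the call.
toy = io.StringIO(
    "CHR,POS,MAF,pvalue\n"
    "1,1000,0.0004,3e-9\n"
    "1,2000,0.2100,0.04\n"
    "1,3000,0.0100,0.5\n"
)
summary = summarize_pvalues(toy)
print(summary)
```

Selecting by name makes it easier to confirm that the summarized column and the column compared against the threshold are the same one.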

As I understand it, the individual-variant analysis in STAARpipeline does not apply variant-level weighting the way the gene-based STAAR tests do, so I am concerned about whether these additional rare-variant signals are reliable.

My questions are:

  1. Is it expected that STAARpipeline detects substantially more rare variant signals than REGENIE under the same MAC filter (MAC ≥ 20)?

  2. Are these additional rare variant associations likely to be inflated or due to model differences?

  3. Would it be more appropriate to restrict interpretation to variants with MAF ≥ 0.01?
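
As a possible diagnostic for questions 2 and 3, the genomic inflation factor could be computed separately for rare and common variants in each tool's output. The sketch below is only an illustration under stated assumptions: it treats every p-value as coming from a two-sided 1-df test, the helper names `lambda_gc` and `lambda_by_maf` are hypothetical, and the toy p-values are synthetic rather than taken from my results.

```python
from statistics import NormalDist, median

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 df


def lambda_gc(pvalues):
    """Genomic inflation factor from two-sided p-values of 1-df tests."""
    norm = NormalDist()
    # Convert each p-value to a chi-square statistic: z^2 with z = Phi^{-1}(p / 2).
    chi2 = [norm.inv_cdf(p / 2.0) ** 2 for p in pvalues if 0.0 < p <= 1.0]
    return median(chi2) / CHI2_1DF_MEDIAN


def lambda_by_maf(records, maf_cutoff=0.01):
    """Split (maf, pvalue) pairs at `maf_cutoff` and compute lambda per group."""
    rare = [p for maf, p in records if maf < maf_cutoff]
    common = [p for maf, p in records if maf >= maf_cutoff]
    return {"rare": lambda_gc(rare), "common": lambda_gc(common)}


# Synthetic example: evenly spaced (null-like) p-values for common variants,
# and the same grid squared (shifted toward zero) for rare variants.
n = 1001
grid = [(i + 0.5) / n for i in range(n)]
records = [(0.20, p) for p in grid] + [(0.001, p * p) for p in grid]
result = lambda_by_maf(records)
print(result)
```

A value near 1 in the common bin together with a clearly larger value in the rare bin would point toward inflation specific to rare variants rather than a global model difference.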

Any guidance would be greatly appreciated.

Thank you!
