Question about SNP-based p-values: STAARpipeline VS REGENIE

Hi,

I have a question regarding SNP-based association results from STAARpipeline.

I performed GWAS using:

- The same genotype data (MAC ≥ 20)
- The same phenotype
- The same covariates

I ran:

- REGENIE (step 1 + step 2)
- STAARpipeline_Individual_Analysis.r from STAARpipeline

When applying the conventional genome-wide significance threshold (5 × 10⁻⁸), I observed a large discrepancy:

- REGENIE detected 32 significant variants
- STAARpipeline detected 25,848 significant variants

Most of the additional variants detected by STAARpipeline are rare. The significant variants from REGENIE are also mostly rare, but STAARpipeline identifies far more rare-variant signals.
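For reference, the comparison of the two hit lists can be sketched roughly as below. This is only an illustration: the function names, the `variant_id -> (p_value, maf)` layout, and the MAF cutoff of 0.01 for "rare" are assumptions made for the example, not the output format of either tool.

```python
# Hypothetical layout: each result set maps variant_id -> (p_value, maf).
GWS_THRESHOLD = 5e-8


def significant_ids(results, threshold=GWS_THRESHOLD):
    """Return the IDs of variants whose p-value is below the threshold."""
    return {vid for vid, (p, _maf) in results.items() if p < threshold}


def compare_hits(results_a, results_b, threshold=GWS_THRESHOLD, rare_maf=0.01):
    """Count shared and tool-specific hits, and how many B-only hits are rare."""
    hits_a = significant_ids(results_a, threshold)
    hits_b = significant_ids(results_b, threshold)
    only_b = hits_b - hits_a
    # "Rare" here is an assumed cutoff (MAF < rare_maf), not a tool default.
    rare_only_b = {v for v in only_b if results_b[v][1] < rare_maf}
    return {
        "shared": len(hits_a & hits_b),
        "only_a": len(hits_a - hits_b),
        "only_b": len(only_b),
        "rare_only_b": len(rare_only_b),
    }


if __name__ == "__main__":
    # Toy values for illustration only, not real results.
    regenie = {"v1": (1e-9, 0.2), "v2": (0.3, 0.005)}
    staar = {"v1": (2e-9, 0.2), "v2": (1e-10, 0.005), "v3": (1e-12, 0.3)}
    print(compare_hits(regenie, staar))
```

Variant IDs would need to be harmonised between the two outputs (same chromosome, position, and allele coding) before a comparison like this is meaningful.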

Summary statistics of p-values from STAARpipeline results (column 6):

min:    3.98e-05
median: 4.93e-05
max:    0.0657
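A summary of this kind, together with the number of variants below the threshold, can be computed along the following lines. This is a minimal sketch: `summarize_pvalues` is a hypothetical helper, and it assumes the p-values have already been read from the results column into a list of floats.

```python
from statistics import median

GWS_THRESHOLD = 5e-8


def summarize_pvalues(pvalues, threshold=GWS_THRESHOLD):
    """Return min/median/max and the count of p-values below the threshold."""
    # Keep only valid probabilities; missing values are expected as None.
    clean = [p for p in pvalues if p is not None and 0.0 <= p <= 1.0]
    if not clean:
        raise ValueError("no valid p-values to summarise")
    return {
        "n": len(clean),
        "min": min(clean),
        "median": median(clean),
        "max": max(clean),
        "n_significant": sum(p < threshold for p in clean),
    }


if __name__ == "__main__":
    # Toy values for illustration only.
    print(summarize_pvalues([1e-9, 3e-8, 0.2, 0.5, None]))
```

A useful sanity check is that `n_significant` can only be non-zero when `min` is below the threshold; if the two disagree, the summarised column is probably not the one used for counting.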


Based on my understanding, STAARpipeline (in the individual SNP analysis) does not apply variant-level weighting in the same way as gene-based STAAR tests. Therefore, I am concerned about whether these additional rare variant signals are reliable.

My questions are:

1. Is it expected that STAARpipeline detects substantially more rare variant signals than REGENIE under the same MAC filter (MAC ≥ 20)?

2. Are these additional rare variant associations likely to reflect inflated test statistics, or differences between the two models?

3. Would it be more appropriate to restrict interpretation to variants with MAF ≥ 0.01?
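One way to look at questions 2 and 3 is to compute the genomic inflation factor separately for rare and common variants. The sketch below assumes 1-df test statistics, a list of hypothetical `(p_value, maf)` pairs, and a rare/common split at MAF 0.01; the helper names are made up for the example and are not part of STAARpipeline or REGENIE.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def p_to_chi2(p):
    """Convert a two-sided p-value to the matching 1-df chi-square statistic."""
    # Use the lower tail (p / 2) so very small p-values keep their precision.
    z = NormalDist().inv_cdf(p / 2.0)
    return z * z


def lambda_gc(pvalues):
    """Genomic inflation factor: median observed chi-square / expected median."""
    chi2 = [p_to_chi2(p) for p in pvalues if 0.0 < p <= 1.0]
    if not chi2:
        return None
    return median(chi2) / CHI2_1DF_MEDIAN


def lambda_by_maf_bin(records, rare_maf=0.01):
    """Compute lambda separately for MAF < rare_maf and MAF >= rare_maf."""
    rare = [p for p, maf in records if maf < rare_maf]
    common = [p for p, maf in records if maf >= rare_maf]
    return {"rare": lambda_gc(rare), "common": lambda_gc(common)}


if __name__ == "__main__":
    # Uniform p-values (the null expectation) should give lambda close to 1.
    uniform = [(i + 0.5) / 1000 for i in range(1000)]
    print(lambda_gc(uniform))
```

A lambda well above 1 in the rare bin but close to 1 in the common bin would point towards inflation specific to low-count variants rather than a genome-wide modelling difference.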

Any guidance would be greatly appreciated.

Thank you!

Question about SNP-based p-values: STAARpipeline VS REGENIE #87
