## Summary
Add support for sample-level confidence weights in case/control assignment, enabling weighted logistic regression and weighted SKAT instead of hard binary phenotype classification.
## Motivation
In multi-cohort analyses, phenotype certainty varies by data source:
| Source | Confidence | Example |
|---|---|---|
| Confirmed clinical diagnosis | 1.0 | Stoneformers cohort with detailed lab data |
| Retrospective expert curation | 0.7 | GCKD with curated phenotype categories |
| Indication panel inference | 0.4 | AGDE where "ordered stone diagnostics" → inferred stones |
Currently, all cases are treated equally (`case_status = 1`). This forces users to:
- Accept phenotype misclassification noise (include uncertain cases at full weight)
- Exclude uncertain cases entirely (lose statistical power)
- Run separate tiered analyses (multiplies compute, complicates interpretation)
Weighted case status is more principled — it downweights noisy phenotypes automatically in a single run.
## Current behavior
`compute_phenotype_based_case_control_assignment()` in `helpers.py` returns `tuple[set[str], set[str]]` — pure binary membership. The phenotype vector passed to all tests is an `np.ndarray` of 0/1 integers.
## Proposed behavior
### New CLI parameters
```
--case-confidence <method>        Weight method: "equal" (default, current behavior),
                                  "count" (fraction of case HPO terms matched),
                                  "file" (read from phenotype file column)
--case-confidence-column <col>    Column name in phenotype file for pre-computed weights
                                  (used with --case-confidence file)
```
### Pre-computed weights via phenotype file (primary use case)
```csv
LIMS_ID,case_status,case_weight
LB24-1234,1,1.0
LB21-5678,1,0.7
LB24-9012,1,0.4
LB24-0000,0,1.0
```
When `--case-confidence file --case-confidence-column case_weight`:
- Cases retain their binary case/control assignment (for Fisher's and reporting)
- The weight column is loaded as `sample_weights: np.ndarray` and passed to regression-based tests
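The file-based loading could look roughly like this minimal sketch, assuming a pandas-backed phenotype reader. `load_sample_weights` is a hypothetical helper name (not an existing function in the codebase); samples absent from the file, or with a missing weight value, default to 1.0 so they behave exactly as in an unweighted run:

```python
import numpy as np
import pandas as pd


def load_sample_weights(pheno_path: str, weight_column: str,
                        sample_ids: list[str]) -> np.ndarray:
    """Load per-sample confidence weights from a phenotype CSV.

    Missing samples or missing weight values default to 1.0, preserving
    unweighted behavior for those samples.
    """
    df = pd.read_csv(pheno_path).set_index("LIMS_ID")
    # Align to the analysis sample order; absent samples become NaN.
    weights = df[weight_column].reindex(sample_ids).fillna(1.0).to_numpy(dtype=float)
    if np.any((weights < 0.0) | (weights > 1.0)):
        raise ValueError(f"{weight_column} values must lie in [0, 1]")
    return weights
```

Defaulting absent samples to 1.0 (rather than erroring) keeps the weight file optional per cohort, but a stricter variant could require coverage of all samples.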
### HPO match count weights (automatic)
When `--case-confidence count`:
- Weight = `len(sample_hpo_terms ∩ case_hpo_terms) / len(case_hpo_terms)`
- A sample matching 6/8 case HPO terms gets weight 0.75
- A sample matching 1/8 gets weight 0.125
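The count-based weight is a one-line set operation; this sketch (with the hypothetical helper name `hpo_count_weight`) shows the intended formula, including a guard for an empty case term set:

```python
def hpo_count_weight(sample_hpo_terms: set[str], case_hpo_terms: set[str]) -> float:
    """Fraction of the case-defining HPO term set matched by a sample."""
    if not case_hpo_terms:
        # Degenerate case: no case terms defined, fall back to full weight.
        return 1.0
    return len(sample_hpo_terms & case_hpo_terms) / len(case_hpo_terms)
```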
### Integration with statistical tests
| Test | Weight handling |
|---|---|
| Fisher's exact | Ignores weights (integer counts only) — emit info log |
| Logistic burden | `statsmodels` `GLM(family=Binomial(), freq_weights=sample_weights)` (the discrete `Logit` class does not accept weights) |
| Linear burden | `statsmodels.WLS(weights=sample_weights)` |
| SKAT/SKAT-O | Weights incorporated into null model fitting |
| ACAT | Works on p-values from weighted tests — no changes needed |
### Default behavior
`--case-confidence equal` (the default) produces `sample_weights = np.ones(n_samples)`, preserving exact backward compatibility.
## Implementation sketch
- Add `sample_weights: np.ndarray | None` field to `AssociationConfig`
- Extend `compute_phenotype_based_case_control_assignment()` to optionally return weights
- Add weight loading from the phenotype file column in `phenotype.py`
- Pass `freq_weights` to `LogisticBurdenTest.run()` and `LinearBurdenTest.run()`
- Pass weights to the SKAT null model
- Report effective sample size (`sum(weights)^2 / sum(weights^2)`) in diagnostics
- Fisher's test: log an info message that weights are ignored for this test type
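The diagnostics formula above is the Kish effective sample size; a minimal sketch of the proposed helper (name hypothetical):

```python
import numpy as np


def effective_sample_size(weights: np.ndarray) -> float:
    """Kish effective sample size: (sum w)^2 / sum(w^2).

    Equals n when all weights are equal, and shrinks as weights
    become more uneven — a useful power diagnostic for weighted runs.
    """
    w = np.asarray(weights, dtype=float)
    return float(w.sum() ** 2 / np.square(w).sum())
```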
## Relationship to other features
- Builds on the v0.15.0 association framework (logistic regression, SKAT, covariates)
- Complements `--restrict-regions` (#84) for multi-cohort workflows
- Orthogonal to variant-level weights (`weights.py`) — this is sample-level
## Use case
Multi-cohort rare variant meta-analysis where phenotype quality varies across cohorts. Users pre-compute confidence scores based on their domain knowledge and provide them in the phenotype file, enabling a single weighted analysis instead of multiple tiered runs.