evals: add oracle rubrics to REST/auth security-critical cases

## Summary

The eval audit identified ~6 HIGH-priority cases in the REST and auth eval suites that lack oracle (LLM-as-judge) grading; the lists below name nine candidates across three files. These are security-critical cases where regex alone can't fully validate correctness.

## Cases needing oracle

From `pygraphistry_rest_eval_ports_v1.json`:
- `rest_auth_env_or_token_no_literals` — security-critical: oracle would ensure no embedded secrets
- `rest_real_endpoints_only` — security-critical: regex can't fully validate endpoint authenticity
- `rest_privacy_and_share_url` — safety-critical: oracle needed for guidance quality

From `pygraphistry_rest_first_principles_v1.json`:
- `fp_password_to_jwt_then_list_files` — security-critical workflow
- `fp_personal_key_to_jwt_exchange` — security-critical
- `fp_single_use_token_gateway_flow` — security flow
- `fp_no_fake_rest_endpoints` — security-critical: oracle would catch hallucinated endpoints

From `pygraphistry_guardrails_v1.json`:
- `auth_env_no_literal_creds` — security-critical: oracle would strengthen
- `privacy_private_not_public` — safety-critical: oracle needed

## Why oracle matters here

These cases validate that the model produces **safe** code — no hardcoded credentials, no hallucinated endpoints, correct privacy modes. Regex checks catch obvious patterns but can miss subtle leakage (e.g., credentials in f-strings, plausible-sounding fake endpoints).
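To illustrate the gap, here is a minimal sketch of a literal-credential regex and the kinds of output it misses. The pattern, the sample snippets, and the `GRAPHISTRY_PASSWORD` variable name are hypothetical illustrations, not the checks the eval suites currently use.

```python
import re

# Hypothetical regex check: flags password/token/api_key assigned a quoted literal.
LITERAL_CRED = re.compile(
    r"""(?i)\b(password|token|api_key)\s*=\s*["'][^"']+["']"""
)


def regex_flags(code: str) -> bool:
    """Return True if the naive pattern finds a literal credential."""
    return LITERAL_CRED.search(code) is not None


# Obvious leak: the regex catches this one.
obvious = 'graphistry.register(api=3, username="alice", password="hunter2")'

# Subtle leak 1: the secret sits inside an f-string in a header value.
fstring_leak = 'headers = {"Authorization": f"Bearer {\'abc123\'}"}'

# Subtle leak 2: the secret is assembled elsewhere and passed by name.
indirect_leak = (
    'secret = "hun" + "ter2"\n'
    "graphistry.register(api=3, username=user, password=secret)"
)

# Safe: the credential is read from the environment (variable name is assumed).
safe = 'graphistry.register(api=3, username=user, password=os.environ["GRAPHISTRY_PASSWORD"])'

for name, code in [
    ("obvious", obvious),
    ("fstring_leak", fstring_leak),
    ("indirect_leak", indirect_leak),
    ("safe", safe),
]:
    print(f"{name}: flagged={regex_flags(code)}")
```

Under this pattern, the two subtle leaks look the same as the safe snippet: none of them is flagged. Telling them apart requires reading the code for intent, which is the job an oracle rubric would take on.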

## Approach

Add oracle blocks with rubrics focused on:
- No credential leakage (any form, not just string literals)
- Only real, documented endpoints used
- Correct auth flow ordering
- Privacy mode appropriate for the scenario
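As a rough sketch of what attaching such a block might look like, the snippet below adds a rubric to a case record. The `oracle`, `type`, and `rubric` field names and the `add_oracle` helper are assumptions for illustration; the actual schema of the eval JSON files is not shown in this issue and may differ.

```python
import json

# Rubric items taken from the list above.
RUBRIC = [
    "No credential leakage in any form, not just string literals",
    "Only real, documented endpoints are used",
    "Auth flow steps occur in the correct order",
    "Privacy mode is appropriate for the scenario",
]


def add_oracle(case: dict, rubric: list[str]) -> dict:
    """Return a copy of the case with a hypothetical oracle block attached."""
    updated = dict(case)  # shallow copy so the input case is left unchanged
    updated["oracle"] = {"type": "llm_judge", "rubric": list(rubric)}
    return updated


# Only the case id comes from the issue; every other field is assumed.
case = {"id": "rest_auth_env_or_token_no_literals"}
updated_case = add_oracle(case, RUBRIC)
print(json.dumps(updated_case, indent=2))
```

Keeping the rubric as a list of short, independently checkable statements would let the judge score each criterion separately, though a per-case rubric may be preferable where only some of the four items apply.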

evals: add oracle rubrics to REST/auth security-critical cases #18

Summary

Cases needing oracle

Why oracle matters here

Approach

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

evals: add oracle rubrics to REST/auth security-critical cases #18

Description

Summary

Cases needing oracle

Why oracle matters here

Approach

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions