feat(service): add DataRudder provider with safehttp-based ConnectivityProber#37
Open
dy-shimizu wants to merge 8 commits into
Open
feat(service): add DataRudder provider with safehttp-based ConnectivityProber#37dy-shimizu wants to merge 8 commits into
dy-shimizu wants to merge 8 commits into
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. 🗂️ Base branches to auto review (1)
Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
Comment |
Captures the T-001 task with ST-001-NN subtasks for the DataRudder provider implementation: scaffolding, BuildInput + auth translation, ConnectivityProber with sentinel UUID probe, and tests. References the approved plan at C:\Users\SHMZ\.claude\plans\giggly-waddling-wind.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…vityProber Adds a public DataRudder provider to the catalog that integrates with the Delorean anti-fraud API. Mirrors the Midaz architectural pattern (struct wrapping base.Provider with InputBuilder + ConnectivityProber methods) adapted for OAuth2 token-endpoint auth instead of OIDC. Components: - provider.go: Register() + NESTED providerConfigSchema with additionalProperties:false to reject typos (e.g., token_uri instead of token_url). User-facing display name "DataRudder", description references the Delorean product. - create_transaction.go: datarudder.create-transaction executor with a loose body schema (Delorean validates upstream). - input_builder.go: executorRoutes map (Midaz style) + BuildInput() that normalises trailing slash via strings.TrimRight and delegates auth construction. buildDataRudderAuth extracts the 3 nested auth fields explicitly and hardcodes credentials_location="body" (the field is intentionally not exposed in the provider schema since Delorean only accepts body credentials). - connectivity_prober.go: Probe() against the GET fraud transactions endpoint with a deterministic sentinel UUID. Maps 200/404 to Reached+AuthApplied, 401/403 to Reached without AuthApplied, 5xx to a transport error, and surfaces the upstream "detail" field in ProbeOutcome.Details["upstream_detail"]. SSRF pinning via libSSRF.ResolveAndValidate with lazy env var lookup matching the Midaz pattern. - ssrf_test_helpers.go: //go:build unit helper to reset the SSRF cache between test scenarios, mirroring midaz/ssrf_test_helpers.go. Path /api/v1/fraud/transactions/ is hardcoded as a security measure (capability limiting). Only base_url is user-configurable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wires datarudder.Register into RegisterDefaults so end users can discover the DataRudder provider via GET /v1/catalog/providers and create ProviderConfigurations against it. Bumps the catalog provider count assertion from 2 to 3 to reflect the new registration. Without this, the provider package compiles but is invisible to the public catalog API — the entire point of the feature. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Covers the datarudder package with 89.8% statement coverage. - input_builder_test.go: BuildInput happy path, trailing slash normalization (single + multiple slashes), unknown executor ID, missing/empty base_url, auth block omission, body parsing fallback for non-JSON, nodeData passthrough. - connectivity_prober_test.go: 13 test cases covering all outcome paths from the approved plan — sentinel 404 happy path, 200 transaction-exists, 401/403 auth rejection, 5xx server error, token endpoint failure, network unreachable, SSRF blocked, missing base_url, missing auth block, missing auth field, upstream_detail surfacing, non-JSON body resilience, plus AuthRequired() = true. - datarudder_test.go: catalog registration sanity, JSON Schema acceptance/rejection (extra top-level field, extra auth field, each required field missing), and an intra-package integration test that drives the full chain — BuildInput -> runner.Execute -> mock token endpoint -> mock fraud endpoint — asserting Bearer token injection, body integrity, no grant_type sent, and response surfacing. Mock servers run on httptest.Server (127.0.0.1) so tests rely on SSRF_ALLOW_PRIVATE=true + the unit-tagged ssrf_test_helpers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous assertions used strings.Contains against the raw form body, which would have produced false positives for values like "the-client-foo" matching the expected "the-client". Parse the body via url.ParseQuery and assert on exact decoded values instead, so the test cannot be fooled by substring overlap and any future regression in form encoding is caught. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tics, test gaps Addresses 1 Critical + 4 High + 5 Medium + 3 Low from the gate-8 multi-agent review, scoped strictly to the DataRudder package (issues outside this package — runner-wide token cache, masking, README drift — are left for the Flowker product team). Probe contract changes (connectivity_prober.go): - CRITICAL: the sentinel UUID 404 (documented success signal) was being misclassified as E2E=Failed by the shared applyProbeOutcome command (which was written for Midaz semantics). Probe now rewrites StatusCode=404 to 200 before returning, and stores the real status in Details["raw_status_code"] for diagnostics. Keeps the shared command mapping intact. - HIGH: removed the post-response switch that overrode AuthApplied=false on 401/403 and added a synthetic TransportErr on 5xx. Status classification is the shared command layer's job; owning that decision in the prober drifts from the Midaz pattern and double-classifies the outcome. - MEDIUM: when applyDataRudderAuth fails, Reached is now false (the probe URL was never reached). The previous "we reached far enough to attempt auth" comment contradicted the documented Reached contract on pkg/executor/connectivity_prober.go:46. - applyDataRudderAuth now returns just error (the bool was redundant — true on success, false on error). Simplifies the call site and removes one dead state variable. - LOW: response body is drained via io.Copy(io.Discard, ...) before Close so keep-alive connection reuse is not blocked by unread bytes past the 4 KiB readUpstreamDetail cap. Test improvements (connectivity_prober_test.go): - HIGH: TestProbe_NetworkError_UnreachableHost was misnamed — config used the same unreachable host for both base_url and token_url, so auth failed FIRST and the post-auth Do() error branch was never covered. Renamed to TestProbe_NetworkError_UnreachableProbeHost and the config now stands up a working token mock + a refused base_url. Now genuinely exercises the intended branch. - MEDIUM: TestProbe_MissingAuthField was renamed in-place and converted to table-driven, covering all three required auth fields (token_url, client_id, client_secret). The previous shape only exercised the client_secret branch because the extractor iterates keys in declaration order. - MEDIUM: added TestReadUpstreamDetail with direct unit coverage of the helper — valid JSON with/without detail key, empty body, empty object, malformed JSON, and a body larger than probeBodyReadLimit (verifies the LimitReader truncation does not panic). - MEDIUM: added TestProbe_Timeout exercising the client-timeout / context- deadline path against a deliberately slow fraud-endpoint mock. The 10-second probeTimeout constant is no longer fully untested. - MEDIUM: URL assertions now use Equal against the deterministic expected URL instead of Contains. - LOW: removed the no-op '_ = ssrfOptions()' re-prime line in resetSSRFCacheForTest (Probe itself primes via the same sync.Once on its next call, so the test-side prime was dead). Added a CAUTION comment warning future contributors against t.Parallel() use given the package-level SSRF cache mutation. - Cascading assertion updates across the existing 401/403/5xx/token-fail tests to honour the new probe contract (AuthApplied=true on a server rejection, Reached=false when auth fails, no synthetic TransportErr from the prober itself). Schema test coverage (datarudder_test.go): - HIGH: TestProviderConfigSchema_RejectsMissingRequiredFields renamed to RejectsInvalidConfigs and expanded to a table-driven enumeration of every required field plus minLength:1 violations. Previously only base_url, auth, and auth.token_url missing were tested; client_id, client_secret, and the empty-string minLength constraint were uncovered. Added an inline note documenting that format:"uri" is annotation-only under Draft 2020-12 (gap is Flowker-team-owned because the compiler config is shared). - HIGH: TestIntegration_CreateTransaction_EndToEnd now captures and asserts that the token endpoint received NO Authorization header. This proves credentials_location="body" is honoured and that a future regression sending both body credentials AND a Basic header would be caught. Coverage rose from 89.8% to 91.6% on the datarudder package. lint clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The SSRF-pinning path replaces the hostname with an IP in PinnedURL, which causes Go's TLS handshake to use the IP as the default ServerName and fail cert verification with "cannot validate certificate for <IP> because it doesn't contain any IP SANs". lib-commons explicitly documents SNIHostname as the field to pass to TLSClientConfig.ServerName for this case. Smoke test against https://inference.api-stg.delorean-ai.com failed 100% of the time without this fix because Delorean's public cert covers *.delorean-ai.com (DNS SANs only). With the fix the handshake validates against the DNS SAN as expected. Adds a regression test that reproduces the production scenario in-process — a self-signed cert with a DNS SAN but no IP SAN, served via httptest TLS, exercised via the IP-host URL form. The "without fix" subtest asserts the exact IP-SAN error; the "with fix" subtest asserts a successful 200. The same defect is latent in pkg/executors/midaz/connectivity_prober.go and the legacy generic path at internal/services/command/test_provider_config_connectivity.go. Out of scope for this PR; tracked as separate followup. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… + list endpoint Phase 2A migration (smoke-tested against https://inference.api-stg.delorean-ai.com). Net -161 lines via three concurrent simplifications: 1. SSRF: libSSRF.ResolveAndValidate + PinnedURL + req.Host=Authority replaced by safehttp.Validate(probeCtx, baseURL, nil). The original URL stays on the wire so TLS SNI works against DNS-only-SAN certs — the smoke-tested fix that withPinnedSNI was working around. 2. SNI workaround: withPinnedSNI helper deleted (45 lines) along with its 4 unit tests. The DNS-only-SAN regression test is preserved as TestSafehttp_HandshakeAgainstDNSOnlyCert, reframed to validate the safehttp.NewClient contract instead. 3. Probe target: sentinel UUID + 404→200 remap replaced by list endpoint /api/v1/fraud/transactions/?limit=1&offset=0 which returns 200 natural. Test setup: ssrf_test_helpers.go deleted; main_test.go added mirroring pkg/executors/midaz/main_test.go for suite-wide safehttp policy override. docs/pre-dev/datarudder/tasks.md updated to reflect the new design. Smoke test (Delorean staging): - overallStatus: passed (572 ms) - stages[2].details.url = .../api/v1/fraud/transactions/?limit=1&offset=0 - all 3 stages (connectivity, authentication, end_to_end) green
e45e7d7 to
d314c5b
Compare
8 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds the DataRudder provider (Delorean fraud-detection API) with:
datarudder.create-transactionexecutor (POST/api/v1/fraud/transactions/)ConnectivityProberimplementation usingsafehttp.Validate+ original URL (no PinnedURL substitution), enabling correct TLS SNI against DNS-only-SAN certsoauth2_token_endpointprovider from feat(service): add oauth2_token_endpoint outbound auth provider #36What's added
pkg/executors/datarudder/provider.go*base.Provider; implementsProvider,InputBuilder,ConnectivityProberpkg/executors/datarudder/create_transaction.go/api/v1/fraud/transactions/pkg/executors/datarudder/input_builder.gocredentials_location: bodyper Delorean contractpkg/executors/datarudder/connectivity_prober.go?limit=1&offset=0) withsafehttp.Validatepre-flightpkg/executors/datarudder/main_test.gosafehttp.SetAllowPrivateForTestfor the suite*_test.gopkg/executors/register.go,pkg/executors/http/http.godocs/pre-dev/datarudder/tasks.mdKey design decisions
1. Probe via list endpoint, not sentinel UUID
Probes
GET {base_url}/api/v1/fraud/transactions/?limit=1&offset=0which returns 200 natural when auth + endpoint reachable. This:?limit=1)https://inference.api-stg.delorean-ai.com: 200, ~570-980 ms wall time, all 3 stages green2. SSRF via
safehttp.Validate+ original URL (notlibSSRF + PinnedURL)Earlier iterations used
libSSRF.ResolveAndValidate + PinnedURL + req.Host = Authority(the pattern Midaz and Tracer probers still use). That pattern silently breaks HTTPS against certs with only DNS SANs because Go derives TLS SNI fromreq.URL.Host— substituting a pinned IP literal there fails handshake withcannot validate certificate for <IP> because it doesn't contain any IP SANs.This was reproduced against Delorean staging (
*.delorean-ai.comcert covers DNS SANs only). Initial workaround was awithPinnedSNI()helper cloning the Transport and settingTLSClientConfig.ServerName. That helper is gone now.Final approach uses
safehttp.Validate(ctx, baseURL, nil)for pre-flight SSRF check, then dispatches the request against the original URL (hostname intact). The command-suppliedhttpClientis already wrapped bysafehttp.NewClientat the command layer, so dial-time IP enforcement is automatic. TLS SNI works correctly without any extra code.Datarudder is the first connectivity prober in the codebase to adopt this pattern. Tracer and Midaz still use the legacy
libSSRF + PinnedURLapproach (they pass tests only becausehttptest.NewTLSServeremits IP SANs that mask the bug). See "Findings reported separately" below.3.
credentials_locationhardcoded tobody(not in schema)The Delorean
/api/v1/auth/loginendpoint only accepts form-body credentials. Surfacingcredentials_locationas a user-configurable field would be misleading. It's set internally ininput_builder.go:buildDataRudderAuth, schema omits it (validated by additionalProperties:false at both schema levels).4.
applyDataRudderAuthreturnsReached=falseon token failureIf the OAuth2 token fetch fails (token endpoint 401, network error, etc.), the probe URL was never hit, so
Reached=false. Diverges from Midaz semantics (which returnsReached=trueif the token host was reachable). Documented and tested.Test plan
make test-unit— PASSgo test -tags=unit ./pkg/executors/datarudder/...— PASS (39 tests/subtests including the 16-case prober suite)make test-integration— PASSmake test-e2e— PASSgofmt,go vet,golangci-lint— clean forpkg/executors/datarudder/https://inference.api-stg.delorean-ai.com:overallStatus=passed, ~570-980 ms wall time,stages[2].details.urlends with?limit=1&offset=0, all 3 stages green. Re-tested after upstream codereview fixes — still green.TestSafehttp_HandshakeAgainstDNSOnlyCert) — exercises the safehttp wrap against a self-signed cert with DNS SAN only, no IP SAN. Locks in that future refactors don't reintroduce the bug.Notes for reviewers
pkg/safehttp(PR feat(infrastructure): defense-in-depth SSRF protection for outbound HTTP #24). If approved, the same migration can be applied to Midaz/Tracer probers in a follow-up.withPinnedSNIhelper was developed, tested, then deleted — git history preserves the journey. Commit442a0a6adds it; commite45e7d7deletes it as part of the safehttp migration.Findings reported separately
During integration, several observations outside this PR's scope were communicated to @alexgarzao:
connectivity-prober-guide.mdcould document thesafehttp.Validateadoption pattern🤖 Generated with Claude Code