Skip to content

orca: limited s3 compatibility#172

Merged
plombardi89 merged 4 commits into
mainfrom
phlombar/orca-s3-xml-errors
May 21, 2026
Merged

orca: limited s3 compatibility#172
plombardi89 merged 4 commits into
mainfrom
phlombar/orca-s3-xml-errors

Conversation

@plombardi89
Copy link
Copy Markdown
Collaborator

@plombardi89 plombardi89 commented May 21, 2026

Summary

Replaces the EdgeHandler's plain-text error responses with the standard AWS S3 <Error> envelope so SDKs (aws-sdk-go-v2, boto3, MinIO client) surface a typed error code to callers. HEAD requests get status code + headers only, matching real S3 behavior.

Also fixes a routing doc/code mismatch in EdgeHandler.ServeHTTP: the comment claimed GET / returned 405 but the code returned 501 and labelled it ListObjectsV2 regardless of whether a bucket was present.

Error mapping

Internal Status S3 Code
origin.ErrNotFound 404 NoSuchKey
origin.ErrAuth 502 OriginUnauthorized (see below)
*UnsupportedBlobTypeError 502 OriginUnsupported
*OriginETagChangedError 502 OriginETagChanged
*MissingETagError 502 OriginMissingETag
default origin error 502 OriginUnreachable
range failures 416 InvalidRange
GET / 501 NotImplemented (ListBuckets)
GET /{bucket}/ 501 NotImplemented (ListObjectsV2)
HEAD /{bucket}/ 501 NotImplemented (HeadBucket)
unsupported method 405 MethodNotAllowed
auth stub 401 AccessDenied

Why ErrAuth -> 502 (not 403)

A 401/403 from the upstream origin means orca's own credentials were rejected by the origin, not the client's. Returning 403 to the client would falsely imply the client should rotate its own credentials, which would not fix anything. 502 BadGateway communicates that orca cannot satisfy the request through no fault of the client; the orca-specific OriginUnauthorized code lets operators with access to orca logs tell this case apart from a generic origin failure. This rationale is documented in the code.

Routing split

Previously GET / and GET /{bucket}/ both fell into the same branch that returned 501 with the message "ListObjectsV2 not implemented in MVP", which was both inaccurate for GET / and contradicted the in-code routing comment. They are now split:

  • GET / -> 501 NotImplemented ("ListBuckets is not implemented by orca.")
  • GET /{bucket}/ -> 501 NotImplemented ("ListObjectsV2 is not implemented by orca.")
  • HEAD /{bucket}/ -> 501 NotImplemented ("HeadBucket is not implemented by orca.")

The routing comment has been updated to match the actual behavior.

Scope

XML errors apply to the EdgeHandler only. The InternalHandler is the peer-to-peer surface between orca replicas and is never consumed by an S3 SDK; its current plain-text/JSON responses are unchanged.

Testing

  • New unit tests for writeS3Error (GET path, HEAD body suppression, omitted-empty optional fields, nil request).
  • TestWriteOriginError rewritten to assert XML <Code> rather than substring matching.
  • New routing tests: GET /, GET /{bucket}/, HEAD /{bucket}/, unsupported method, auth-enabled.
  • New integration tests (error_test.go): on-the-wire end-to-end checks for NoSuchKey XML, HEAD-no-body on 404, InvalidRange XML, and ListObjectsV2 501.
  • Existing chunk-error test extended to assert XML Code.
  • All unit, race, and integration tests pass locally.

Replaces the EdgeHandler's plain-text error responses with the
standard AWS S3 <Error> envelope so SDKs (aws-sdk-go-v2, boto3,
MinIO client) surface a typed error code to callers. HEAD requests
get status code + headers only, matching real S3 behavior.

Mapping:
  origin.ErrNotFound       -> 404 NoSuchKey
  origin.ErrAuth           -> 502 OriginUnauthorized (see comment)
  UnsupportedBlobTypeError -> 502 OriginUnsupported
  OriginETagChangedError   -> 502 OriginETagChanged
  MissingETagError         -> 502 OriginMissingETag
  default                  -> 502 OriginUnreachable
  range failures           -> 416 InvalidRange
  ListBuckets/ListObjectsV2/HeadBucket -> 501 NotImplemented
  unsupported method       -> 405 MethodNotAllowed
  auth stub                -> 401 AccessDenied

ErrAuth maps to 502 rather than 403: a 401/403 from the upstream
origin means orca's own credentials were rejected, not the client's,
so 403 would mislead the client into rotating credentials that are
not at fault. The OriginUnauthorized code lets operators tell the
case apart from a generic origin failure.

Also fixes a routing doc/code mismatch in EdgeHandler.ServeHTTP: the
comment claimed GET / returned 405 but the code returned 501 and
labelled it ListObjectsV2 regardless of whether a bucket was
present. Splits the GET branch so GET / -> ListBuckets and
GET /{bucket}/ -> ListObjectsV2 (both 501) and updates the comment.

The InternalHandler is intentionally left unchanged; it is the
peer-to-peer surface between orca replicas and is never consumed by
an S3 SDK.
@plombardi89 plombardi89 requested a review from a team May 21, 2026 00:41
…ster

Adds TestS3SDK in internal/orca/inttest/s3sdk_test.go: a suite of
aws-sdk-go-v2 calls against orca's edge listener that proves the SDK
unmarshals orca's S3 <Error> envelope into typed errors
(*s3types.NoSuchKey, *s3types.NotFound) and surfaces the Code via
smithy.APIError for codes without typed shapes (InvalidRange,
NotImplemented, MethodNotAllowed). This is the headline contract
test for the XML compatibility work: TestS3Errors verified the bytes
on the wire are correct, this suite verifies real S3 client code
can consume them.

Subtests covered:
  - GetObject success (positive control)
  - GetObject missing key -> *s3types.NoSuchKey
  - HeadObject missing key -> *s3types.NotFound (HEAD-no-body
    contract: SDK relies on status code alone)
  - GetObject out-of-range -> smithy.APIError Code=InvalidRange
  - GetObject byte range -> 206 + correct body slice + ContentRange
  - ListObjectsV2 -> smithy.APIError Code=NotImplemented
  - PutObject -> smithy.APIError Code=MethodNotAllowed

Also restructures TestS3Errors into a single parent with parallel
subtests sharing one cluster, saving ~18s of repeated cluster
startup. Both files use t.Cleanup rather than defer cancel because
parallel subtests are paused until the parent returns; a defer
would cancel the context before the subtests resume.
TestS3SDK previously covered HeadObject only for the missing-key
typed-error case. Adds HeadObject_Size_TypedFields verifying that a
HEAD against a present object surfaces the correct size via
aws-sdk-go-v2's typed *out.ContentLength field plus a non-empty ETag.

This closes a gap where neither integration nor SDK suites checked
that prefetchers and range planners (the dominant users of HEAD)
see a correct object size. The two existing inttest HEAD calls in
e2e_test.go only inspect the ETag header.
@plombardi89 plombardi89 changed the title orca: return S3-compatible XML error responses + fix root routing comment orca: limited s3 compatibility May 21, 2026
@plombardi89 plombardi89 enabled auto-merge (squash) May 21, 2026 20:39
@plombardi89 plombardi89 merged commit 990e2cd into main May 21, 2026
23 checks passed
@plombardi89 plombardi89 deleted the phlombar/orca-s3-xml-errors branch May 21, 2026 20:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants