Skip to content

updated publisher spec #147

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 10 commits into
base: main
Choose a base branch
from
Draft

updated publisher spec #147

wants to merge 10 commits into from

Conversation

0x73746F66
Copy link

@0x73746F66 0x73746F66 commented May 10, 2025

TLDR

The TEA Publisher OpenAPI specification has inconsistencies with the consumer API
and documentation, causing potential integration issues and confusion for implementers in the upcoming Beta 1 hackathon on May 28th.


  • Extended parameter documentation to be more complete
  • We now have schema examples
  • Version number updated to 0.0.3 to match consumer API

This PR is focussed on the Publisher API spec, that was mostly unchanged since my last contrinution. I noticed it needed significant updates to align with the latest architecture documentation and all the updated that had been given to the consumer API spec (how would consumers consume things that weren't able to be published? This had to be fixed!)

Incompatibilities with consumer spec

  1. Field Naming Inconsistencies:

Consumer: Uses url in artifact format
Publisher: Uses artifact_url in artifact format

  1. Parameter Structure:

Consumer: Uses id-type/id-value for product queries
Publisher: Uses direct filter parameters

  1. Missing Consumer Endpoints:
  • /release/{uuid}/collection for latest collection
  • /release/{uuid}/collections for all collections
  • /artifact/{uuid} for artifact metadata
  1. Schema Differences:

Consumer requires versions field in component schema
Publisher's component doesn't match this requirement

Changes

  1. Terminology Standardization

Before:

  • Used leaf endpoints and schemas in publisher API
  • Inconsistent with consumer API's component/release model

After:

  • Replaced with proper /component and /release endpoints
  • Aligned object model with documentation

Why: Consistent terminology across consumer/publisher APIs prevents integration errors.

  1. TEI URN Integration

Before:

  • Limited TEI support in product endpoints
  • No validation for TEI URN format

After:

  • Added tei_urns array to product schema
  • Implemented URI pattern validation: ^urn:tei:[a-zA-Z0-9]+:[a-zA-Z0-9\\.-]+:.+$

Why: TEI URNs are the primary discovery mechanism according to documentation.

  1. Collection Update Mechanism

Before:

  • Collection update reasons used undocumented enum values
  • No clear versioning mechanism

After:

  • Added proper collection_update_reason schema with values from docs:
    • INITIAL_RELEASE
    • VEX_UPDATED
    • ARTIFACT_UPDATED
    • ARTIFACT_ADDED
    • ARTIFACT_REMOVED
  • Added /collection/{uuid}/{version} endpoint

Why: Matches documented collection versioning requirements.

  1. Artifact Structure

Before:

  • Used inconsistent artifact structure with objects
  • Single checksum value per artifact

After:

  • Aligned with consumer API using formats array
  • Multiple checksums with algorithm type support
  • Consistent properties with consumer API

Why: Allows full interoperability between consumer/publisher systems.

  1. Component/Release Relationship

Before:

  • Flat hierarchy with limited relationships between objects

After:

  • Clear product → component → release → collection hierarchy
  • Proper reference arrays between object types

Why: Matches documented model and enables proper artifact organization.

Signed-off-by: Chris Langton <[email protected]>
- Add authentication schemes
- Create discovery endpoints (/.well-known/tea/{id})
- Standardise property naming between consumer/publisher APIs
- Implement advanced filtering for components and artifacts
- Add lifecycle status endpoints

Signed-off-by: Chris Langton <[email protected]>
@0x73746F66 0x73746F66 marked this pull request as ready for review May 10, 2025 13:16
@0x73746F66 0x73746F66 requested review from oej and madpah as code owners May 10, 2025 13:16
@ppkarwasz
Copy link
Contributor

@0x73746F66,

Thank you for updating the Publisher spec! 💯 It was certainly way of out of date!

However, I notice that the proposed spec has also a lot of subtle differences (schema names, parameter names, schema contents) from the current Consume API. Should we mark this PR as draft and work on it after the hackaton?

@0x73746F66 0x73746F66 marked this pull request as draft May 21, 2025 10:43
Copy link
Contributor

@taleodor taleodor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the Pull Request! Did first read through for the document, there may be more things to discuss later, but raised issues in the comments that require further work and / or discussion.

description: Local development
paths:
/product/{teiUrn}:
patch:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this operation should be done by UUID instead of TEI URN, since UUID is the primary key within the TEA server and this would avoid any uncertainty. UUID can be resolved by the client using identifiers from Consumer API.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For an API that will create a database record, user provided ID are an alias to the Database Primary key.

This design is correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which is UUID and not TEA URN.

application/json:
schema:
type: object
properties:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the properties listed here are not currently present in the Consumer API Spec. I suggest we should resolve them in the Consumer API Spec if we need / want to introduce those (i.e., barcode, sku, vendorUuid, and others)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are, because we support PURL, right?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifically, the consumer API returns the constructed PURL as a dynamic response property - how would a PURL be described in the publisher API call any other way than it's componenets?
An API spec that implements the intent of spoken word spec documents is an accurate API spec in terms of tenchical correctness

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I specifically mentioned barcode, sku, vendorUuid, why is this relevant to PURL?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I misread your comment, because those are in the markdown I expected you knew they're needed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe those things should be added under existing TeaIdentifier type (which currently makes a list of identifier key and value tuples in the Consumer API, it's just not expanded to all these additional types - that needs to be discussed separately).

post:
description: Create TEA Product entry for the supplied product identifier
operationId: createTeaProduct
requestBody:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to above, most of those properties are not currently part of the Consumer API spec. Additionally, it would be helpful to create a shared Schema object as we did throughout the Consumer API.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, refer to PURL in the consumer API

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, don't see how this is relevant to PURL at all.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're going to have to be clear then, because of you're referring to the markdown spec and find these then perhaps the consumer spec should be updated after beta1 to align like this has

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my previous comment, those are listed in markdown as possible identifiers, we shouldn't have a separate field for each type and instead expand TeaIdentifier. We may also want to be a little stricter about identifier types - it's possible that markdown needs to be updated here as well.

$ref: '#/components/responses/404-object-by-id-not-found'
tags:
- TEA Component
/component/{componentIdentifier}:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest using same name for {uuid} across the spec and refer to the Schema uuid definition as in the Consumer API.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that makes sense is we merge them and the meaning has no distinctions from publisher and consumer. i.e. a publisher does not have the primary key yet (until the database record is inserted) so it cannot be the same entity as the resultant response entity in the consumer API spec.

It is common for request models to be distinct from response models in all API designs - this is why I deveated deliberately

content:
application/json:
schema:
type: object
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for Products, suggest creating Schema entry for Component, similar to Consumer API.

$ref: '#/components/responses/404-object-by-id-not-found'
tags:
- TEA Release
/release/{releaseIdentifier}:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar consideration with {releaseIdentifier} -> {uuid} as for Components.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uuid is system generated primary key, releaseIdentifier is user provided for their system needs

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is why it is important to use primary key here to make sure we're touching the right object.

tags:
- TEA Release
/release/{releaseIdentifier}:
patch:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we should allow update operations on critical release fields, such as version, date, etc. This may depend on lifecycle (future conversation). I suggest we discuss this before implementing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, I checked with Olle and CLE is out. when CLE is in again it will be refactored

$ref: '#/components/operations/standardDelete'
tags:
- TEA Release
/collection:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should discuss whether explicit CRUD on Collections should be allowed, see #152

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of the referenced issue has a mixture of obviously good ideas but I believe that dogma might be winning over critical thinking a little here. Without question why these views are better than allowing publishers the flexibilities is problematic, we automatically impose an arbitrary constraint. Perhaps constraints should be sparingly applied to a stnadard like TEA

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me this is actually purely implementation prospective. I think we should add calls related to adding artifacts, then it becomes more clear that having both CRUD on collections and on artifacts conflicts with each other logically. For now, since artifact CRUD is not there it may be less visible.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CRUD is a por description for this spec, given R for read is not apparent and therefore chaining requests that rely on prior call results with new unknown UUIDs can be a logical constraint which only a read can resolve..

Better to reduce chaining requests for creation, and offer incremental limited updater methods expected to be called at much later dates, dependant on having a timely GET request for freshness

Would you agree a single POST with everything is both concise, organised, and performant? With consideration any complications that can be addressed with a GET + PATCH chain should be back ported to the POST to maintain these desirable characteristics and avoid dependency on complicated chaining logic?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me, but right now we have collection type update enum, which has things like Vex Updated, Artifact Updated, Artifact Removed, Artifact Added. If you do batch requests where you both add and update artifacts, this concept becomes murky.

Another issue with batch requests is that it eventually would lead to conversation about collection own lifecycle - which to me is more bureaucracy without much gain. I prefer a system where I upload an artifact and it becomes visible right away, instead of a system where I need to first upload an artifact (or a batch) and then approve a new collection on top of that. This is a point to discuss though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not only "approve" - it's in many cases "sing the collection" too. It's like a commit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should have a conversation on this.

My current implementation idea is that signing should happen automatically on the backend - we can add constraints that artifacts should be signed, etc.

In SaaS scenarios, we can easily have dozens of releases per day for a single component and there can be many components. Each of those releases can have multiple collection versions. I cannot imagine a human signing all of them.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keyless signing (Cosign signature) is definitely a superior choice, however we'll need to support signing with Tuff, CMK, and PGP

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cosign as a service is not a superior choice for a lot of reasons. The software is cool though. Signing is discussed elsewhere and we need to open that discussion.

Signed-off-by: Chris Langton <[email protected]>
@0x73746F66
Copy link
Author

Awesome feedback @taleodor ty. I have some cnversations resolved and others with clarifications t odiscuss

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants