docs: initial proposal for OCI artifact registry
Partially addresses kubeflow/community#682

Signed-off-by: Ramkumar Chinchani <[email protected]>
rchincha committed Mar 20, 2024
1 parent c9bdfc2 commit 9ed59f7
Showing 1 changed file with 65 additions and 0 deletions.
65 changes: 65 additions & 0 deletions docs/oci-registry.md
@@ -0,0 +1,65 @@
# OCI Registry as a Kubeflow Model Registry

## Authors

- Ramkumar Chinchani (Cisco)
- _TBD_

## Maintainers

- Ramkumar Chinchani (Cisco)
- _TBD_

## Motivation

According to the [Kubeflow 2023
survey](https://blog.kubeflow.org/kubeflow-user-survey-2023/), 44% of users
identified a model registry as one of the big gaps in their ML lifecycle
that the Kubeflow offering does not cover.

![Kubeflow survey](diagrams/model-registry-kubeflowsurvey.png "Kubeflow survey")

## Solution Overview

The [Open Container Initiative](https://opencontainers.org/) (OCI) is a sibling
organization to the CNCF under [The Linux Foundation](https://www.linuxfoundation.org/).
Its purview covers the container
[runtime](https://github.com/opencontainers/runtime-spec),
[image](https://github.com/opencontainers/image-spec) and
[distribution](https://github.com/opencontainers/distribution-spec)
specifications. These are vendor-neutral contracts that the Kubernetes
ecosystem relies on for running containers, laying out image filesystems, and
pushing and pulling container images.

Recent developments in the OCI, specifically the
[_image_](https://github.com/opencontainers/image-spec/releases/tag/v1.1.0) and
[_distribution_](https://github.com/opencontainers/distribution-spec/releases/tag/v1.1.0)
spec **v1.1.0** releases, extend these registries beyond container images: they
add support for pushing arbitrary artifacts and for expressing relationships
between artifacts.

## OCI v1.1.0 Conformant Registries

The following are the highlights of OCI v1.1.0 conformant registries.

- Container images: these represent workloads and have been the traditional use case for an OCI conformant registry.

- Artifacts: these represent arbitrary data (ML model data or additional
metadata in this context) that can also be pushed and pulled from an OCI
conformant registry.

- Content-addressable: all data is organized as a Merkle DAG of sha256-hashed
blobs, so identical content always resolves to the same digest. This bodes well
for reproducibility.

- Versioning: in addition to its sha256 digest, an image or artifact can be tagged with a human-readable version.

- Annotations: arbitrary key-value annotations can be attached to any artifact.

- References: an artifact can now be pushed along with a reference to another
artifact (via the `subject` field of its manifest), which can be leveraged to
address the data lineage use case.
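
To make the content-addressing, annotation, and reference points above concrete, here is a minimal Python sketch that builds two OCI v1.1.0 image manifests locally: one for a model artifact and one for a metrics artifact that points back at the model through `subject`. It does not talk to a registry. The `application/vnd.example.*` media types, the `org.example.*` annotation key, the payload bytes, and the helper names are all hypothetical and chosen only for illustration; the sorted-key JSON serialization is likewise an assumption of this sketch, since a registry simply hashes whatever manifest bytes are pushed.

```python
import hashlib
import json

MANIFEST_TYPE = "application/vnd.oci.image.manifest.v1+json"
EMPTY_TYPE = "application/vnd.oci.empty.v1+json"  # empty config used by artifacts


def descriptor(media_type: str, data: bytes) -> dict:
    """Describe a blob by media type, sha256 digest, and size."""
    return {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(data).hexdigest(),
        "size": len(data),
    }


def build_manifest(artifact_type, layer, annotations=None, subject=None) -> dict:
    """Assemble an image manifest that carries a single artifact blob."""
    manifest = {
        "schemaVersion": 2,
        "mediaType": MANIFEST_TYPE,
        "artifactType": artifact_type,
        "config": descriptor(EMPTY_TYPE, b"{}"),
        "layers": [layer],
    }
    if annotations:
        manifest["annotations"] = annotations
    if subject:
        manifest["subject"] = subject  # link to the artifact this one refers to
    return manifest


def serialize(manifest: dict) -> bytes:
    """Serialize deterministically so the digest is stable (sketch assumption)."""
    return json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode()


# Hypothetical model artifact.
model_bytes = b"placeholder model weights"
model_manifest = build_manifest(
    "application/vnd.example.model.v1",
    descriptor("application/vnd.example.model.weights.v1", model_bytes),
    annotations={
        "org.opencontainers.image.version": "1.0.0",
        "org.example.model.framework": "example-framework",
    },
)
model_manifest_desc = descriptor(MANIFEST_TYPE, serialize(model_manifest))

# Hypothetical metrics artifact whose subject is the model manifest.
metrics_bytes = json.dumps({"accuracy": 0.9}).encode()
metrics_manifest = build_manifest(
    "application/vnd.example.metrics.v1",
    descriptor("application/json", metrics_bytes),
    subject=model_manifest_desc,
)
```

Because the manifest embeds the digest of every blob it references, changing a single byte of the model changes the model manifest digest as well, which is the Merkle DAG property the list describes. The `subject` descriptor in the metrics manifest is the link a lineage tool could follow from the metrics back to the exact model version.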


## References

_TBD_
