KEP-4381: Define standard device attributes #5316

Open: wants to merge 4 commits into base: master

43 changes: 38 additions & 5 deletions keps/sig-node/4381-dra-structured-parameters/README.md
@@ -568,7 +568,7 @@ spec:
version: 11.1.42
powerSavingSupported:
bool: true
- dra.k8s.io/pciRoot: # a fictional standardized attribute, not actually part of this KEP
+ kubernetes.io/pcieRoot:
Member

The attribute name will require API approval.

/assign @liggitt

I believe @liggitt suggested this should be resource.kubernetes.io/pcieRoot - that is, qualify it with resource.

Contributor

Agree that there should be some prefix here. 👍 to resource.kubernetes.io/pcieRoot

string: pci-root-0
capacity:
memory: 16Gi
@@ -578,10 +578,13 @@ Compared to labels, attributes have values of exactly one type. Quantities are d
in the separate `capacity` map. As described later on, both sets can be used in CEL expressions to select a
specific resource for allocation on a node.

- To avoid any future conflicts, we reserve any attributes with the ".k8s.io/" domain prefix
- for future use and standardization by Kubernetes. This could be used to describe
- topology across resources from different vendors, for example, but this is out-
- of-scope for now.
+ We are reserving the `kubernetes.io/` domain (and subdomains) prefix for
+ attributes and capacities for standardization by the Kubernetes project. This
+ reservation allows us to define common attributes that can describe hardware
+ characteristics across resources from different vendors. Currently, we are
+ defining one such standard attribute: `kubernetes.io/pcieRoot`. Details on its
Member

Standardizing attributes seems very useful. Could we have a section that talks specifically about standardized attributes? For example, could driverVersion and runtimeVersion be standardized, too?

+ meaning and how it should be exposed by DRA drivers are available in the [API
+ design section under ResourceSlice's](#resourceslice) QualifiedName definition.
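
As an illustration of the reservation described in the added paragraph, a driver or validator could check whether an attribute name falls under the reserved domain. The Go sketch below is not part of this PR: `isReservedName` is a hypothetical helper, and it assumes the reservation covers `kubernetes.io` and every subdomain of it, as the new wording states. It would therefore also cover the `resource.kubernetes.io/pcieRoot` spelling suggested in review.

```go
package main

import (
	"fmt"
	"strings"
)

// isReservedName reports whether a qualified attribute or capacity name uses
// the kubernetes.io domain or one of its subdomains. Hypothetical helper for
// illustration only; it is not taken from the Kubernetes code base.
func isReservedName(qualifiedName string) bool {
	domain, _, hasDomain := strings.Cut(qualifiedName, "/")
	if !hasDomain {
		// Names without a domain prefix belong to the driver's own domain.
		return false
	}
	return domain == "kubernetes.io" || strings.HasSuffix(domain, ".kubernetes.io")
}

func main() {
	for _, name := range []string{
		"kubernetes.io/pcieRoot",
		"resource.kubernetes.io/pcieRoot",
		"gpu.example.com/model", // hypothetical vendor attribute
		"model",
	} {
		fmt.Printf("%-34s reserved=%v\n", name, isReservedName(name))
	}
}
```

The suffix check includes the leading dot so that an unrelated domain such as `notkubernetes.io` is not treated as reserved.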

**Note:** If a driver needs to remove a device or change its attributes,
then there is a risk that a claim gets allocated based on the old
@@ -594,6 +597,19 @@ would allow us to delete the pod and trying again with a new one, but is not don
at the moment because admission checks cannot be retried if a check finds
a transient problem.

+ **Note:** The immediate motivation for standardizing attributes largely stems
+ from the current behavior of the `MatchAttribute` constraint, which relies on
+ exact value matching. While this KEP provides a solution for many cross-driver
+ alignment needs, a more flexible long-term solution is envisioned with
+ [KEP-5254: DRA support for MatchExpression
+ constraints](https://github.com/kubernetes/enhancements/issues/5254) (a
+ work-in-progress). That proposal aims to introduce a `MatchExpression`
+ constraint, allowing devices to be evaluated against CEL expressions, which will
+ enable more complex and dynamic selection criteria. However, solving this
+ critical alignment problem today is essential for latency-sensitive workloads.
+ Standardizing attributes helps achieve this without introducing any conflicts
+ with the future capabilities of KEP-5254.

Member

In any case, even with #5254 (which is now called "Constraints with CEL"), we will need common attributes to do cross-driver comparisons.
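
To make the exact-matching point concrete, here is a minimal Go sketch of how a `MatchAttribute`-style check behaves across devices from two drivers. It is an illustration under stated assumptions, not the scheduler's implementation: the `device` type and the `matchAttribute` function are hypothetical simplifications (string-valued attributes only), and the attribute name is the one proposed in this PR.

```go
package main

import "fmt"

// device is a simplified, hypothetical stand-in for a DRA device: a name plus
// string-valued attributes keyed by fully qualified attribute name.
type device struct {
	name       string
	attributes map[string]string
}

// matchAttribute reports whether every device carries the attribute and all
// values are exactly equal. An empty device list is treated as matching,
// which is an assumption of this sketch.
func matchAttribute(attr string, devices []device) bool {
	if len(devices) == 0 {
		return true
	}
	want, ok := devices[0].attributes[attr]
	if !ok {
		return false
	}
	for _, d := range devices[1:] {
		got, ok := d.attributes[attr]
		if !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	// Name as proposed in this PR; reviewers suggested a "resource." prefix.
	const pcieRoot = "kubernetes.io/pcieRoot"

	gpu := device{name: "gpu-0", attributes: map[string]string{pcieRoot: "pci0000:00"}}
	nicSameRoot := device{name: "nic-0", attributes: map[string]string{pcieRoot: "pci0000:00"}}
	nicOtherRoot := device{name: "nic-1", attributes: map[string]string{pcieRoot: "pci0000:80"}}

	fmt.Println(matchAttribute(pcieRoot, []device{gpu, nicSameRoot}))  // true
	fmt.Println(matchAttribute(pcieRoot, []device{gpu, nicOtherRoot})) // false
}
```

Because the comparison is by attribute name and exact value, two drivers that publish the same fact under different vendor-specific names, or in different value formats, can never satisfy the constraint. A shared name and format avoids that.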

### Using structured parameters

A ResourceClaim is a request to allocate one or more devices. Each request in a
@@ -1212,6 +1228,23 @@ const ResourceSliceMaxAttributesAndCapacitiesPerDevice = 32
// domain prefix are assumed to be part of the driver's domain. Attributes
// or capacities defined by 3rd parties must include the domain prefix.
//
+ // The Kubernetes project reserves the "kubernetes.io/" domain prefix for
+ // standardizing attributes and capacities. DRA drivers **SHOULD** use these
+ // standardized names if they define a characteristic of a device that matches
+ // the intent of a standard attribute (or capacity) name. This ensures
+ // consistency and interoperability across different drivers when conveying the
+ // same idea.
+ //
+ // Currently, the following standard attributes have been defined:
+ //
+ // 1. `kubernetes.io/pcieRoot`: A string value in the format `pci<domain>:<bus>`,
+ // referring to a PCIe (Peripheral Component Interconnect Express) Root
+ // Complex. This attribute can be used to identify devices that share the
+ // same PCIe Root Complex. DRA drivers MAY determine this value by
+ // inspecting the hierarchical path of the device's entry in sysfs (e.g.,
+ // `/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0`), where the
+ // `pci<domain>:<bus>` segment at the beginning of the real path identifies
+ // the Root Complex (e.g., `pci0000:00`).
//
// The maximum length for the DNS subdomain is 63 characters (same as
// for driver names) and the maximum length of the C identifier