[CEP 26] Identifying Packages and Channels in the conda Ecosystem #116
| Original file line number | Diff line number | Diff line change | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,226 @@ | ||||||||||||||||||
| # CEP XXXX - Identifying Packages and Channels in the conda Ecosystem | ||||||||||||||||||
|
|
||||||||||||||||||
| <table> | ||||||||||||||||||
| <tr><td> Title </td><td> CEP XXXX - Identifying Packages and Channels in the conda Ecosystem </td> | ||||||||||||||||||
| <tr><td> Status </td><td> Draft </td></tr> | ||||||||||||||||||
| <tr><td> Author(s) </td><td> | ||||||||||||||||||
| Jaime Rodríguez-Guerra <jaime.rogue@gmail.com> <br /> | ||||||||||||||||||
| Matthew R. Becker <becker.mr@gmail.com> <br /> | ||||||||||||||||||
| Cheng H. Lee <clee@anaconda.com> | ||||||||||||||||||
| </td></tr> | ||||||||||||||||||
| <tr><td> Created </td><td> Mar 11, 2025</td></tr> | ||||||||||||||||||
| <tr><td> Updated </td><td> Apr 1, 2025</td></tr> | ||||||||||||||||||
| <tr><td> Discussion </td><td> https://github.com/conda/ceps/pull/116 </td></tr> | ||||||||||||||||||
| <tr><td> Implementation </td><td> N/A </td></tr> | ||||||||||||||||||
| </table> | ||||||||||||||||||
|
|
||||||||||||||||||
| ## Abstract | ||||||||||||||||||
|
|
||||||||||||||||||
| This CEP aims to standardize names and other strings used to identify packages, artifacts and | ||||||||||||||||||
| channels in the conda ecosystem. | ||||||||||||||||||
|
|
||||||||||||||||||
| ## Specification | ||||||||||||||||||
|
|
||||||||||||||||||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", | ||||||||||||||||||
| "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as | ||||||||||||||||||
| described in [RFC2119][RFC2119] when, and only when, they appear in all capitals, as shown here. | ||||||||||||||||||
|
|
||||||||||||||||||
| More specifically, violations of a MUST or MUST NOT rule MUST result in an error. Violations of the | ||||||||||||||||||
| rules specified by any of the other all-capital terms MAY result in a warning, at the discretion of | ||||||||||||||||||
| the implementation. | ||||||||||||||||||
|
|
||||||||||||||||||
| ### Identifying package artifacts | ||||||||||||||||||
|
|
||||||||||||||||||
| The conda ecosystem distinguishes between two types of packages: | ||||||||||||||||||
|
|
||||||||||||||||||
| - Distributable packages: represented by a concrete, downloadable, extractable conda artifact. | ||||||||||||||||||
| - Virtual packages: not backed by any concrete artifact. They only exist on the client side. | ||||||||||||||||||
|
|
||||||||||||||||||
| #### Package names | ||||||||||||||||||
|
|
||||||||||||||||||
| A distributable package name MUST only consist of lowercase ASCII letters, numbers, hyphens, | ||||||||||||||||||
| periods and underscores. It MUST start with a letter, a number, or a single underscore. It MUST NOT | ||||||||||||||||||
| include two consecutive separators (hyphen, period, underscore). | ||||||||||||||||||
|
|
||||||||||||||||||
| Virtual package names MUST only consist of lowercase ASCII letters, numbers, hyphens, periods and | ||||||||||||||||||
| underscores. They MUST NOT use two consecutive separators, with one exception: they MUST start with | ||||||||||||||||||
| two underscores. | ||||||||||||||||||
|
|
||||||||||||||||||
| Distributable package names MUST match the following case-insensitive regex: | ||||||||||||||||||
| `^(([a-z0-9])|([a-z0-9_](?!_)))[._-]?([a-z0-9]+(\.|-|_|$))*$`. | ||||||||||||||||||
|
|
||||||||||||||||||
| Virtual package names MUST follow this regex: `^__[a-z0-9][._-]?([a-z0-9]+(\.|-|_|$))*$`. | ||||||||||||||||||
|
Comment on lines
+49
to
+
Member
I realize it's more of a "taste" thing, but the expressions might be easier to grasp without negative lookaheads, e.g.,
Suggested change
Member
Author
We want to disallow consecutive separators in the string, like
Member
Author
Maybe like this? `^([a-z0-9]+|(_[a-z0-9]+))[._-]?([a-z0-9]+(\.|-|_|$))*$`
Contributor
I've found these things are tricky and the negative lookahead was all that I could get to work when I was working on similar things. |
||||||||||||||||||
|
|
||||||||||||||||||
| In all cases, the maximum length of a package name MUST NOT exceed 64 characters. | ||||||||||||||||||
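As an illustration of the package name rules, the following is a minimal Python sketch, not a reference implementation. The two patterns are copied verbatim from the text above; the function name `classify_package_name` is hypothetical. The text calls the distributable pattern case-insensitive while the prose restricts names to lowercase, so this sketch assumes the stricter reading and compiles the pattern without an ignore-case flag.

```python
import re

# Patterns copied verbatim from the specification text.
DISTRIBUTABLE_NAME = re.compile(
    r"^(([a-z0-9])|([a-z0-9_](?!_)))[._-]?([a-z0-9]+(\.|-|_|$))*$"
)
VIRTUAL_NAME = re.compile(r"^__[a-z0-9][._-]?([a-z0-9]+(\.|-|_|$))*$")
MAX_NAME_LENGTH = 64


def classify_package_name(name: str) -> str:
    """Return "distributable", "virtual", or "invalid" for a candidate name.

    Assumption: lowercase-only matching (no re.IGNORECASE), following the prose.
    """
    if len(name) > MAX_NAME_LENGTH:
        return "invalid"
    if VIRTUAL_NAME.fullmatch(name):
        return "virtual"
    if DISTRIBUTABLE_NAME.fullmatch(name):
        return "distributable"
    return "invalid"
```

Under these assumptions, `numpy` and `_libgcc_mutex` classify as distributable, `__glibc` as virtual, and `foo--bar` as invalid. A name such as `__anaconda_core_depends` classifies as virtual, which is why it cannot pass as a distributable package name.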
|
|
||||||||||||||||||
| #### Version strings | ||||||||||||||||||
|
|
||||||||||||||||||
| Version strings MUST only consist of digits, periods, lowercase ASCII letters, underscores, plus | ||||||||||||||||||
| symbols, and exclamation marks. Additional rules apply but are out of scope in this CEP and will be | ||||||||||||||||||
| discussed separately. | ||||||||||||||||||
|
|
||||||||||||||||||
| The maximum length of a version string MUST NOT exceed 64 characters. | ||||||||||||||||||
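The character-set and length rules for version strings can be sketched as below. This is only a partial check: the additional ordering and syntax rules are out of scope here, as the text says. The function name is hypothetical, and rejecting the empty string is an assumption of this sketch rather than something the text states.

```python
import string

# Characters the text allows: digits, lowercase ASCII letters,
# periods, underscores, plus symbols, and exclamation marks.
ALLOWED_VERSION_CHARS = frozenset(string.digits + string.ascii_lowercase + "._+!")
MAX_VERSION_LENGTH = 64


def has_valid_version_charset(version: str) -> bool:
    """Check only the character set and length of a version string.

    Assumption: an empty version string is treated as invalid.
    """
    if not 0 < len(version) <= MAX_VERSION_LENGTH:
        return False
    return set(version) <= ALLOWED_VERSION_CHARS
```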
|
|
||||||||||||||||||
| #### Build strings | ||||||||||||||||||
|
|
||||||||||||||||||
| Build strings MUST only consist of ASCII letters, numbers, periods, plus symbols, and underscores. | ||||||||||||||||||
| They MUST match this regex `^[a-zA-Z0-9_\.+]+$`. | ||||||||||||||||||
|
|
||||||||||||||||||
| The maximum length of a build string MUST NOT exceed 64 characters. | ||||||||||||||||||
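A minimal sketch of the build string check, using the regex from the text verbatim together with the 64-character limit. The function name is hypothetical.

```python
import re

# Pattern copied verbatim from the specification text.
BUILD_STRING = re.compile(r"^[a-zA-Z0-9_\.+]+$")
MAX_BUILD_LENGTH = 64


def is_valid_build_string(build: str) -> bool:
    """Return True if the build string satisfies the regex and length limit."""
    if len(build) > MAX_BUILD_LENGTH:
        return False
    return BUILD_STRING.fullmatch(build) is not None
```

Note that hyphens are not in the allowed set, so a value like `py39-0` is rejected.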
|
|
||||||||||||||||||
| #### Artifact extensions | ||||||||||||||||||
|
|
||||||||||||||||||
| Artifact extensions MUST only consist of lowercase ASCII letters, numbers and periods. They MUST | ||||||||||||||||||
| start and end with a letter or a number. They MUST NOT include two consecutive periods. They MUST | ||||||||||||||||||
| match this regex `^[a-z0-9](\.?[a-z0-9])*$`. | ||||||||||||||||||
|
Comment on lines
+73
to
+
Member
It is also common to include the separating period as part of the file extension. |
||||||||||||||||||
|
|
||||||||||||||||||
| The maximum length of an artifact extension MUST NOT exceed 16 characters. | ||||||||||||||||||
|
Member
Suggested change
for consistency. |
||||||||||||||||||
|
|
||||||||||||||||||
| > The conda ecosystem currently recognizes two artifact extensions: `tar.bz2` and `conda`, | ||||||||||||||||||
| versioned `v1` and `v2` respectively. | ||||||||||||||||||
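The extension rules can be sketched as follows, assuming the extension is written without its leading period, as the regex in the text requires. The function name is hypothetical.

```python
import re

# Pattern copied verbatim from the specification text.
ARTIFACT_EXTENSION = re.compile(r"^[a-z0-9](\.?[a-z0-9])*$")
MAX_EXTENSION_LENGTH = 16


def is_valid_artifact_extension(extension: str) -> bool:
    """Return True if the extension satisfies the regex and length limit."""
    if len(extension) > MAX_EXTENSION_LENGTH:
        return False
    return ARTIFACT_EXTENSION.fullmatch(extension) is not None
```

Both currently recognized extensions, `tar.bz2` and `conda`, pass this check, while `.conda` (leading period) and `tar..bz2` (consecutive periods) do not.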
|
|
||||||||||||||||||
| #### Distribution strings | ||||||||||||||||||
|
|
||||||||||||||||||
| A "distribution string" MAY be used to identify a package artifact without specifying the extension | ||||||||||||||||||
| or the channel. It MUST match the following syntax: | ||||||||||||||||||
|
|
||||||||||||||||||
| ```text | ||||||||||||||||||
| <package name>-<version string>-<build string> | ||||||||||||||||||
|
|
||||||||||||||||||
| ``` | ||||||||||||||||||
|
|
||||||||||||||||||
| Distribution strings apply to both distributable and virtual packages. They are used as the name of | ||||||||||||||||||
| the directories where artifacts are extracted in the package cache, for example. | ||||||||||||||||||
|
|
||||||||||||||||||
| > Note: Despite the similarity, distribution strings are not `MatchSpec`-like specifiers and MUST | ||||||||||||||||||
| > NOT be used as such. | ||||||||||||||||||
|
|
||||||||||||||||||
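One way to split a distribution string into its three fields is sketched below. It relies on an observation that follows from the character sets above: package names may contain hyphens, but version strings and build strings may not, so splitting from the right on the last two hyphens is unambiguous. The `Distribution` type and the function name are hypothetical, and the sketch does not validate the individual fields.

```python
from typing import NamedTuple


class Distribution(NamedTuple):
    name: str
    version: str
    build: str


def parse_distribution_string(dist: str) -> Distribution:
    """Split '<package name>-<version string>-<build string>' from the right."""
    parts = dist.rsplit("-", 2)
    # Exactly three non-empty fields are required.
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not a distribution string: {dist!r}")
    return Distribution(*parts)
```

For example, `python-dateutil-2.9.0-pyhd8ed1ab_0` splits into the name `python-dateutil`, the version `2.9.0`, and the build `pyhd8ed1ab_0`.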
| #### Filenames | ||||||||||||||||||
|
|
||||||||||||||||||
| The filename of distributable conda artifacts is obtained by adding the artifact extension to its | ||||||||||||||||||
| distribution string. It MUST match this syntax: | ||||||||||||||||||
|
|
||||||||||||||||||
| ```text | ||||||||||||||||||
| <package name>-<version string>-<build string>.<extension> | ||||||||||||||||||
| ``` | ||||||||||||||||||
|
|
||||||||||||||||||
| The maximum length of a filename MUST NOT exceed 211 characters. | ||||||||||||||||||
|
|
||||||||||||||||||
| Virtual conda packages do not exist on disk and SHOULD NOT need filename standardization. | ||||||||||||||||||
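The 211-character filename limit coincides with the sum of the per-field limits plus the three separators (64 + 1 + 64 + 1 + 64 + 1 + 16). The text does not state this derivation, so treat it as an observation. The sketch below checks that arithmetic and assembles a filename; the function name is hypothetical and the individual fields are assumed to have been validated already.

```python
MAX_NAME = MAX_VERSION = MAX_BUILD = 64
MAX_EXTENSION = 16
MAX_FILENAME = 211

# Three fields, two hyphens, one period, and the extension.
assert MAX_NAME + 1 + MAX_VERSION + 1 + MAX_BUILD + 1 + MAX_EXTENSION == MAX_FILENAME


def artifact_filename(name: str, version: str, build: str, extension: str) -> str:
    """Build '<package name>-<version string>-<build string>.<extension>'."""
    filename = f"{name}-{version}-{build}.{extension}"
    if len(filename) > MAX_FILENAME:
        raise ValueError(f"filename exceeds {MAX_FILENAME} characters")
    return filename
```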
|
|
||||||||||||||||||
| ### Identifying channels | ||||||||||||||||||
|
|
||||||||||||||||||
| A conda channel is defined as a URL where one can find one or more `repodata.json` files arranged | ||||||||||||||||||
| in one subdirectory (_subdir_) each. `noarch/repodata.json` MUST be present to consider the parent | ||||||||||||||||||
| location a channel. | ||||||||||||||||||
|
|
||||||||||||||||||
| #### Channel base URLs | ||||||||||||||||||
|
|
||||||||||||||||||
| The URL of a repodata file at an arbitrary location has the following form: | ||||||||||||||||||
|
|
||||||||||||||||||
| ```text | ||||||||||||||||||
| <scheme>://[<authority>][/<path>/][/label/<label name>]/<subdir>/repodata.json | ||||||||||||||||||
| ``` | ||||||||||||||||||
|
|
||||||||||||||||||
| with `<scheme>`, `<authority>` and `<path>` defined by [RFC | ||||||||||||||||||
| 3986](https://datatracker.ietf.org/doc/html/rfc3986#section-3.2). | ||||||||||||||||||
|
|
||||||||||||||||||
| Given the channel definition above, the base URL without trailing slashes is thus: | ||||||||||||||||||
|
|
||||||||||||||||||
| ```text | ||||||||||||||||||
| <scheme>://[<authority>][/<path>/][/label/<label name>] | ||||||||||||||||||
| ``` | ||||||||||||||||||
|
|
||||||||||||||||||
| For example, given `https://conda.anaconda.org/conda-forge/noarch/repodata.json`, the part leading | ||||||||||||||||||
| to `noarch/repodata.json`, and thus the base URL, is `https://conda.anaconda.org/conda-forge`. For local | ||||||||||||||||||
| repodata such as `file:///home/username/channel/noarch/repodata.json`, the channel base URL is | ||||||||||||||||||
| `file:///home/username/channel`. | ||||||||||||||||||
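The two examples above can be reproduced with a short sketch that strips the trailing `/<subdir>/repodata.json` from a repodata URL. The function name is hypothetical, and the sketch assumes a well-formed URL that ends in a subdir followed by `repodata.json`; it does not validate the remaining components.

```python
def channel_base_url(repodata_url: str) -> str:
    """Return the channel base URL for a '<base>/<subdir>/repodata.json' URL."""
    parts = repodata_url.rsplit("/", 2)
    if len(parts) != 3 or parts[2] != "repodata.json" or not parts[1]:
        raise ValueError(f"not a repodata URL: {repodata_url!r}")
    # Everything before '/<subdir>/repodata.json', without a trailing slash.
    return parts[0]
```

Because the base URL definition includes the optional `/label/<label name>` part, a labeled URL keeps its label: with a hypothetical label `dev`, `https://conda.anaconda.org/conda-forge/label/dev/linux-64/repodata.json` yields `https://conda.anaconda.org/conda-forge/label/dev`.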
|
|
||||||||||||||||||
| When present, each path component MUST only contain lowercase ASCII letters, numbers, underscores, | ||||||||||||||||||
| periods, and dashes. They MUST NOT start with a period or a dash. They SHOULD start and end with a | ||||||||||||||||||
| letter or a number. If present, each path component MUST match this regex: | ||||||||||||||||||
|
|
||||||||||||||||||
| ```re | ||||||||||||||||||
| ^[a-z0-9_][a-z0-9_.-]*$ | ||||||||||||||||||
| ``` | ||||||||||||||||||
|
|
||||||||||||||||||
|
|
||||||||||||||||||
| For `file://`-based channel URLs, the path component rules MAY be understood as recommendations | ||||||||||||||||||
| only. | ||||||||||||||||||
|
|
||||||||||||||||||
| The maximum length of an individual path component in a channel base URL MUST NOT exceed 128 | ||||||||||||||||||
| characters. The maximum length of a channel base URL SHOULD NOT exceed 256 characters. | ||||||||||||||||||
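A sketch of how the MUST-level path component rules might be checked for a channel base URL. It uses `urllib.parse.urlsplit` from the Python standard library to isolate the path; the function name and the `example.com` URLs are hypothetical. The sketch only reports offending components and leaves it to the caller to decide whether `file://` URLs should be treated as recommendations only.

```python
import re
from urllib.parse import urlsplit

# Pattern copied verbatim from the specification text.
PATH_COMPONENT = re.compile(r"^[a-z0-9_][a-z0-9_.-]*$")
MAX_COMPONENT_LENGTH = 128


def invalid_path_components(base_url: str) -> list[str]:
    """Return the path components that break the regex or the length limit."""
    path = urlsplit(base_url).path
    # Empty strings come from leading, trailing, or doubled slashes.
    components = [c for c in path.split("/") if c]
    return [
        c
        for c in components
        if len(c) > MAX_COMPONENT_LENGTH or PATH_COMPONENT.fullmatch(c) is None
    ]
```

An empty result means every component passed; `https://example.com/~user/channel` would report `~user`.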
|
|
||||||||||||||||||
| #### Channel names | ||||||||||||||||||
|
|
||||||||||||||||||
| For convenience, the channel _name_ is defined as the concatenation of `scheme`, `authority` and | ||||||||||||||||||
| `path` components of a channel URL. At least one of `authority` or `path` SHOULD be present. In | ||||||||||||||||||
| their absence, the channel name MUST be considered empty, regardless of the scheme. Empty channel | ||||||||||||||||||
| names SHOULD NOT be used. | ||||||||||||||||||
|
Comment on lines
+160
to
+
Member
This deviates, e.g., from how This is not to say it does not make sense to include the "label" concept at all -- it's certainly good to have this distribution-specific part specified since it's widely used, e.g., in |
||||||||||||||||||
|
|
||||||||||||||||||
| When the scheme and authority fields are missing, the full URL can be inferred with these rules: | ||||||||||||||||||
|
|
||||||||||||||||||
| - If the channel name matches the regex `^\.{0,2}[/\\].*$`, or if it matches the regex | ||||||||||||||||||
| `^[A-Z]:([\\/].*)?$` (for Windows drives), it SHOULD be understood as the path component of a | ||||||||||||||||||
| `file://` URL. | ||||||||||||||||||
| - Otherwise, it SHOULD be understood as a `http[s]://` URL. The tool SHOULD assume a default scheme | ||||||||||||||||||
| and authority (e.g. `https://conda.anaconda.org`), and take the rest as a path component. | ||||||||||||||||||
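The two inference rules above can be sketched as follows. Both regexes are copied verbatim from the text. The function name and the `(kind, value)` return shape are hypothetical, and the default server is the example given in the text. How a local path is turned into a full `file://` URL (relative path resolution, Windows drive handling) is left out because the text does not specify it.

```python
import re

# Patterns copied verbatim from the specification text.
LOCAL_PATH = re.compile(r"^\.{0,2}[/\\].*$")
WINDOWS_DRIVE = re.compile(r"^[A-Z]:([\\/].*)?$")
DEFAULT_SERVER = "https://conda.anaconda.org"  # example default from the text


def infer_channel_location(name: str, default_server: str = DEFAULT_SERVER) -> tuple[str, str]:
    """Classify a channel name that lacks a scheme and authority.

    Returns ("file", <path>) for local paths, or ("https", <full URL>) otherwise.
    """
    if LOCAL_PATH.match(name) or WINDOWS_DRIVE.match(name):
        return ("file", name)
    return ("https", f"{default_server}/{name}")
```

With these assumptions, `conda-forge` expands to `https://conda.anaconda.org/conda-forge`, while `./channel`, `/opt/channels/local`, and `C:\channels` are all treated as local paths.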
|
|
||||||||||||||||||
|
|
||||||||||||||||||
| #### Subdir names | ||||||||||||||||||
|
|
||||||||||||||||||
| Channel subdir names MUST either be the literal `noarch` or a string following the syntax | ||||||||||||||||||
| `{os}-{arch}`, where `{os}` and `{arch}` MUST only consist of lowercase ASCII letters and numbers. | ||||||||||||||||||
| Non-`noarch` subdirs MUST match this regex: `^[a-z0-9]+-[a-z0-9]+$`. | ||||||||||||||||||
|
|
||||||||||||||||||
| The maximum length of a subdir name MUST NOT exceed 32 characters. | ||||||||||||||||||
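A minimal sketch of the subdir check, combining the literal `noarch`, the regex from the text, and the 32-character limit. The function name is hypothetical.

```python
import re

# Pattern copied verbatim from the specification text.
OS_ARCH_SUBDIR = re.compile(r"^[a-z0-9]+-[a-z0-9]+$")
MAX_SUBDIR_LENGTH = 32


def is_valid_subdir(subdir: str) -> bool:
    """Return True for 'noarch' or a well-formed '{os}-{arch}' subdir name."""
    if len(subdir) > MAX_SUBDIR_LENGTH:
        return False
    return subdir == "noarch" or OS_ARCH_SUBDIR.fullmatch(subdir) is not None
```

Since `{os}` and `{arch}` cannot themselves contain hyphens, a name with two hyphens such as `linux-x86-64` is rejected.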
|
|
||||||||||||||||||
| #### Label names | ||||||||||||||||||
|
|
||||||||||||||||||
| Channel label names MUST only consist of ASCII letters, digits, underscores, hyphens, forward | ||||||||||||||||||
| slashes, periods, and whitespace. They MUST start with a letter. They MUST match this regex: | ||||||||||||||||||
| `^[a-zA-Z][0-9a-zA-Z_\-\./]*$`. | ||||||||||||||||||
|
|
||||||||||||||||||
| The label `nolabel` is reserved and MUST only be used for conda packages which have no other | ||||||||||||||||||
| labels. In other words, in the space of labels, the empty set is represented by the label | ||||||||||||||||||
| `nolabel`. | ||||||||||||||||||
|
Comment on lines
+191
to
+
Member
I'm not aware of
Member
Author
Related to OCI. |
||||||||||||||||||
|
|
||||||||||||||||||
| A URL for a package, repodata, etc. without a label component MUST be assumed to have the default | ||||||||||||||||||
| label `main`. | ||||||||||||||||||
|
|
||||||||||||||||||
|
|
||||||||||||||||||
| The maximum length of a label name MUST NOT exceed 128 characters. | ||||||||||||||||||
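The label rules can be sketched as below: a missing label falls back to `main`, and anything else is checked against the regex and the 128-character limit. The regex is copied verbatim from the text; note that, as written, it does not admit whitespace even though the prose lists it, and this sketch follows the regex. The function name is hypothetical, and treating an empty string like a missing label is an assumption.

```python
import re
from typing import Optional

# Pattern copied verbatim from the specification text.
LABEL_NAME = re.compile(r"^[a-zA-Z][0-9a-zA-Z_\-\./]*$")
MAX_LABEL_LENGTH = 128
DEFAULT_LABEL = "main"


def normalize_label(label: Optional[str]) -> str:
    """Return the effective label, defaulting to 'main' when none is given."""
    if not label:  # None or empty string: assume no label component was present
        return DEFAULT_LABEL
    if len(label) > MAX_LABEL_LENGTH or LABEL_NAME.fullmatch(label) is None:
        raise ValueError(f"invalid label name: {label!r}")
    return label
```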
|
|
||||||||||||||||||
|
|
||||||||||||||||||
| ## Backwards compatibility | ||||||||||||||||||
|
|
||||||||||||||||||
| The conda subdir and package name regexes are backwards compatible with the current `conda` | ||||||||||||||||||
| implementation (25.3) and all existing packages on the `defaults` and `conda-forge` channels, | ||||||||||||||||||
| except for the `__anaconda_core_depends` package on the `defaults` channel. See [this | ||||||||||||||||||
| comment](https://github.com/conda/ceps/pull/116#discussion_r1992234677). | ||||||||||||||||||
|
|
||||||||||||||||||
| The regex for labels was pulled from an anaconda.org error message describing the set of valid | ||||||||||||||||||
| labels. | ||||||||||||||||||
|
|
||||||||||||||||||
| As of 2025-03-12T19:00Z, of the ~1.9M channel names on anaconda.org: | ||||||||||||||||||
|
|
||||||||||||||||||
| - 7,219 violate the regex `^[a-z0-9]+((-|_|.)[a-z0-9]+)*$`; | ||||||||||||||||||
| - 98 violate the regex `^[a-z0-9][a-z0-9_.-]*$` (allowing channel names to end with `_`, `.`, or | ||||||||||||||||||
| `-`); and | ||||||||||||||||||
| - 6 violate `^[a-z0-9_][a-z0-9_.-]*$` (allowing channel names to start with `_`). Of those six, | ||||||||||||||||||
| five start with `.`, and the other starts with `~`. | ||||||||||||||||||
|
|
||||||||||||||||||
| See [this comment](https://github.com/conda/ceps/pull/116#discussion_r1992154574) for more details. | ||||||||||||||||||
| The authors have excluded the channel names in the last case that start with `.` or `~` given | ||||||||||||||||||
| possible security implications. A low percentage, ~0.4%, of channels do not match the | ||||||||||||||||||
| recommendations for channel names above, but are allowed. | ||||||||||||||||||
|
|
||||||||||||||||||
| The maximum lengths allowed for the different fields have been chosen so the resulting path | ||||||||||||||||||
| components (directory names, filenames) comfortably fit within the 255-character limit that some | ||||||||||||||||||
| filesystems impose. As of 2025-03-01T13:00Z, there are no violations of these limits in any of the | ||||||||||||||||||
| packages published for `conda-forge`, `bioconda` and `defaults`. See [this | ||||||||||||||||||
| comment](https://github.com/conda/ceps/pull/116#issuecomment-2763392999) and [this | ||||||||||||||||||
| comment](https://github.com/conda/ceps/pull/116#issuecomment-2759130187) for more details. | ||||||||||||||||||
|
|
||||||||||||||||||
| ## Copyright | ||||||||||||||||||
|
|
||||||||||||||||||
| All CEPs are explicitly [CC0 1.0 Universal](https://creativecommons.org/publicdomain/zero/1.0/). | ||||||||||||||||||
|
|
||||||||||||||||||
| <!-- links --> | ||||||||||||||||||
|
|
||||||||||||||||||
| [RFC2119]: https://www.ietf.org/rfc/rfc2119.txt | ||||||||||||||||||