add document defining an OpenTelemetry Collector #4313

Open: wants to merge 31 commits into base: main

Commits
512a3b4
add document defining an openTelemetry Collector
codeboten Nov 27, 2024
c71e06d
adding details regarding components and distributions
codeboten Nov 27, 2024
9d61bf7
typo
codeboten Nov 27, 2024
2405824
Update specification/collector/README.md
codeboten Nov 27, 2024
94e2673
added links
codeboten Nov 27, 2024
b41fb8c
Merge branch 'main' into codeboten/add-spec-for-collector
codeboten Nov 27, 2024
27699e9
Update specification/collector/README.md
codeboten Nov 28, 2024
9844574
Update specification/collector/README.md
codeboten Dec 2, 2024
d6a319f
Update specification/collector/README.md
codeboten Dec 2, 2024
7236f08
Update specification/collector/README.md
codeboten Dec 2, 2024
46829cf
Merge branch 'main' into codeboten/add-spec-for-collector
codeboten Dec 2, 2024
010c65f
Update specification/collector/README.md
codeboten Dec 5, 2024
d6b2ab4
Update specification/collector/README.md
codeboten Dec 5, 2024
4a42bee
Merge branch 'main' into codeboten/add-spec-for-collector
codeboten Dec 6, 2024
abc13c8
add details about configuration file
codeboten Dec 6, 2024
17cc1bf
Merge branch 'main' into codeboten/add-spec-for-collector
codeboten Dec 10, 2024
b589775
Update specification/collector/README.md
codeboten Dec 10, 2024
9229492
Update specification/collector/README.md
codeboten Dec 10, 2024
0704000
Update specification/collector/README.md
codeboten Dec 10, 2024
b0dd212
fix lint
codeboten Dec 10, 2024
b20d4d8
adding details regarding components that may use the same identifier
codeboten Dec 10, 2024
d73e36a
Update specification/collector/README.md
codeboten Dec 11, 2024
61d3666
Update specification/collector/README.md
codeboten Dec 11, 2024
af4a225
Update specification/collector/README.md
codeboten Dec 16, 2024
3e7d6c5
Merge branch 'main' into codeboten/add-spec-for-collector
codeboten Dec 16, 2024
167de2f
Merge branch 'main' into codeboten/add-spec-for-collector
codeboten Dec 17, 2024
8101579
add document status
codeboten Dec 18, 2024
1f98be5
update should to must
codeboten Dec 20, 2024
d91f7ea
Merge branch 'main' into codeboten/add-spec-for-collector
codeboten Dec 23, 2024
d8c0185
Merge branch 'main' into codeboten/add-spec-for-collector
codeboten Jan 7, 2025
322b09a
Merge branch 'main' into codeboten/add-spec-for-collector
codeboten Jan 15, 2025
64 changes: 64 additions & 0 deletions specification/collector/README.md
@@ -0,0 +1,64 @@
<!--- Hugo front matter used to generate the website version of this page:
path_base_for_github_subdir:
from: tmp/otel/specification/collector/_index.md
to: collector/README.md
--->

# OpenTelemetry Collector
Member commented:

OpenTelemetry Collector is never defined. Is it a source code artifact? A binary?

Member commented:

That's one of the things I was trying to get at here. Since there's no binary plugin mechanism it seems that the source would need to be available for it to be extended in the manner contemplated, but that's not clear or explicit in the current state.

Member commented:

> Since there's no binary plugin mechanism it seems that the source would need to be available for it to be extended in the manner contemplated, but that's not clear or explicit in the current state.

Is the lack of binary plugin mechanism something that the OpenTelemetry Collector SIG wants to solve? Are there technical blockers?

Binary and dynamically loaded plugins seem to be an established pattern. For example:

Member commented:

The Fluent Bit example is not necessarily apposite as it involves a C application dynamically loading shared libraries built with the Go toolchain using CGO (which is generally prohibited in the Collector codebase).

Go does have a native plugin mechanism, though it comes with many caveats and is widely regarded as a bad idea that can't be dropped due to compatibility guarantees. Its documentation sums up its litany of restrictions in this way, which sounds a lot like a suggestion to use something like ocb:

> Together, these restrictions mean that, in practice, the application and its plugins must all be built together by a single person or component of a system. In that case, it may be simpler for that person or component to generate Go source files that blank-import the desired set of plugins and then compile a static executable in the usual way.


**Status**: [Development](../document-status.md)

The goal of this document is for users to be able to easily switch between
OpenTelemetry Collector Distros while also ensuring that components produced by
the OpenTelemetry Collector SIG are able to work with any vendor who claims
support for an OpenTelemetry Collector.
Comment on lines +11 to +14
Member commented:

I'm not sure I understand this goal. If a vendor produces a collector distribution that has a subset of available components because those are the components relevant to their service offerings and that they're willing to support, where do any other components (whether hosted in an OTel repo or not) fit into that picture? Do we mean that a distribution must offer end users the ability to modify its source and create their own build? We should be explicit about that if that is the case.

Given that the licensing of the collector's source code does not require that distribution of derivative works happen in source form I'm not sure that we have much ability here to enforce such a requirement. We can certainly try to use the "OpenTelemetry" mark as a cudgel, but I'm not sure it'll be as effective as may be desirable since the terms "collector" and "distribution" are very broad. It could perhaps be argued that "OpenTelemetry Collector" is a protectable mark and maybe even that "Collector" has acquired secondary meaning in this limited scope, but protecting such a mark against genericization is going to be a Sisyphean task.

Member commented:

I see this definition as separate from the term Distribution defined below. A Distribution is a specific compiled OpenTelemetry Collector with a specific set of OpenTelemetry Collector Components that the maintainer (the user in this case) decided to add. It is an OpenTelemetry Collector because the maintainer was able to bring their chosen OpenTelemetry Collector components to it.

Something is not an OpenTelemetry Collector if it cannot support OpenTelemetry Collector Components. Maybe the word additional below is unnecessary and could be removed?

Member commented:

> Given that the licensing of the collector's source code does not require that distribution of derivative works happen in source form I'm not sure that we have much ability here to enforce such a requirement

We potentially have leverage over:

  • Trademark usage if "OpenTelemetry Collector" becomes a trademark
  • What we list on our registry and website and what we promote
  • What wording can be used in 'official' OTel events

I think we have enough leverage here to make this worth it.

Member commented:

I'm not sure that having a registered "OpenTelemetry Collector" mark is sufficient here as nominative fair use would allow anyone preparing a distribution (in the colloquial sense, not Distribution however we seek to define it) to identify it as such. The Linux Foundation trademark usage guidelines also call out specifically this sort of usage as acceptable for indicating products are related to or based on the project that produces the product bearing their marks.

Obviously the project can control what it puts on its website and what marketing collateral is used in conjunction with events operated by LF/CNCF, but that doesn't seem like effective leverage over an actor who has no need or interest in such things.

Member commented:

I would generally say if someone isn't interested in "playing nice" then it doesn't really matter what we say or what we don't say. The solution to enforceable marks is offering certification and conformance suites that are attached to actual trademarks (e.g., "OTLP Inside" or whatever). This document is guidance for the community as much as it is guidance for external parties.

Member commented:

What @austinlparker said, along with my thinking that it's important to define this, and include this requirement, to make clear why opentelemetry.io would or wouldn't list project Y as a Collector or Distribution.


- An OpenTelemetry Collector _MUST_ accept an [OpenTelemetry Collector configuration
file](#opentelemetry-collector-configuration-file).
- An OpenTelemetry Collector _MUST_ be able to include additional compatible
[Collector components](#opentelemetry-collector-components) that
the user wishes to include.

## OpenTelemetry Collector configuration file
Member commented:

Is this section really required when it is already defined above that it must accept an OpenTelemetry Collector configuration?

Besides the redundancy, it turns this into a living document that someone would have to remember to update whenever a new top-level key is added to the collector configuration file, just to keep the "minimum structure" current.


An OpenTelemetry Collector configuration file is defined as YAML and _MUST_ support
the following [minimum structure](https://pkg.go.dev/go.opentelemetry.io/collector/otelcol#Config):

```yaml
receivers:
processors:
exporters:
connectors:
extensions:
service:
  telemetry:
  pipelines:
```

## OpenTelemetry Collector components

For a library to be considered an OpenTelemetry Collector component, it _MUST_
implement a [Component interface](https://pkg.go.dev/go.opentelemetry.io/collector/component#Component)
defined by the OpenTelemetry Collector SIG.
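
As a rough illustration of what implementing the Component interface involves, the sketch below declares local stand-ins for `Component` and `Host` that mirror the shape of the linked `component` package (a `Start` and a `Shutdown` method) so that it compiles without external modules. The `noopReceiver` type and its behavior are hypothetical; a real component would import the Collector's own package instead of redeclaring the interface.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Host is a local stand-in for the Collector's host type. It is left empty
// here because the sketch never calls into the host (assumption).
type Host interface{}

// Component is a local stand-in that mirrors the lifecycle methods of the
// Collector's Component interface, declared here only to stay self-contained.
type Component interface {
	Start(ctx context.Context, host Host) error
	Shutdown(ctx context.Context) error
}

// noopReceiver is a hypothetical component that only tracks whether it runs.
type noopReceiver struct {
	running bool
}

// Start marks the component as running and rejects a second start.
func (r *noopReceiver) Start(_ context.Context, _ Host) error {
	if r.running {
		return errors.New("already started")
	}
	r.running = true
	return nil
}

// Shutdown marks the component as stopped.
func (r *noopReceiver) Shutdown(_ context.Context) error {
	r.running = false
	return nil
}

// Compile-time check that noopReceiver satisfies the interface.
var _ Component = (*noopReceiver)(nil)

func main() {
	r := &noopReceiver{}
	if err := r.Start(context.Background(), nil); err != nil {
		fmt.Println("start failed:", err)
		return
	}
	fmt.Println("running after Start:", r.running)
	_ = r.Shutdown(context.Background())
	fmt.Println("running after Shutdown:", r.running)
}
```

The `var _ Component = (*noopReceiver)(nil)` line is a common Go idiom for making the compiler verify interface conformance, which is one way to read "source-compatible" in practice.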

Components require a unique identifier as a `type` string to be included in an OpenTelemetry
Member commented:

Similar concern as above when you could just say it must implement the Component interface. And is this paragraph referring to the type ID in Component which says:

> The component ID (combination type + name) is unique for a given component.Kind.

So multiple components can use the same identifier if they are of different Kinds?

Collector. It is possible that multiple components use the same identifier, in which
case the two components cannot be used simultaneously in a single OpenTelemetry Collector. In
order to resolve this, the clashing components must use a different identifier.
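
The identifier clash described above can be pictured with a small registry that refuses a second component using an already-claimed `type` string. This is only a sketch: `Kind`, `Registry`, `Register`, and the `example.com` module names are hypothetical, and scoping the uniqueness check per component kind is an assumption that the draft text does not state.

```go
package main

import "fmt"

// Kind is a simplified, local stand-in for a component kind (assumption).
type Kind string

const (
	KindReceiver Kind = "receiver"
	KindExporter Kind = "exporter"
)

// key identifies a component by kind and type string.
type key struct {
	kind Kind
	typ  string
}

// Registry records which module provides each (kind, type) pair.
type Registry struct {
	providers map[key]string
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{providers: make(map[key]string)}
}

// Register fails when another module has already claimed the same type
// identifier for the same kind.
func (r *Registry) Register(kind Kind, typ, module string) error {
	k := key{kind: kind, typ: typ}
	if existing, ok := r.providers[k]; ok {
		return fmt.Errorf("%s type %q already provided by %s; cannot also add %s",
			kind, typ, existing, module)
	}
	r.providers[k] = module
	return nil
}

func main() {
	reg := NewRegistry()
	// Same type string under different kinds: accepted in this sketch.
	fmt.Println(reg.Register(KindReceiver, "foo", "example.com/a/fooreceiver"))
	fmt.Println(reg.Register(KindExporter, "foo", "example.com/b/fooexporter"))
	// Same kind and type string from another module: rejected.
	fmt.Println(reg.Register(KindReceiver, "foo", "example.com/c/fooreceiver"))
}
```

In this sketch only the third registration returns an error, so one of the two clashing receivers would have to adopt a different identifier before both could be built into the same Collector.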

### Compatibility requirements

A component is defined as compatible with an OpenTelemetry Collector when its dependencies are
source- and version-compatible with the Component interfaces of that Collector.

For example, a Collector derived from version tag v0.100.0 of the [OpenTelemetry Collector](https://github.com/open-telemetry/opentelemetry-collector) _MUST_ support all components that
are version-compatible with the Go Component API defined in the `go.opentelemetry.io/collector/component` module found in that repository for that version tag.

## OpenTelemetry Collector Distribution

An OpenTelemetry Collector Distribution (Distro) is a compiled instance
of an OpenTelemetry Collector with a specific set of components and features. A
Distribution author _MUST_ provide users with tools and/or documentation for adding
their own components to the Distribution's components. Note that the resulting
Comment on lines +60 to +62
Member commented:

This seems problematic to me as a MUST. This is, in effect, a requirement that distributions be made available in source form with a license that permits modification (and presumably distribution, though that's not clear). The license under which the Collector is released does not require this and I'm suspicious of the ability to use trademark protections to prevent someone from using the phrasing "Foo Distribution for OpenTelemetry Collector" given that's literally the first "Correct" example in the Linux Foundation trademark guidelines and is a textbook case of nominative fair use.

I think if we want an identifier for compatible distributions that can be effectively controlled we will need a distinctive mark for a compatibility certification that can be granted to distributions that satisfy its requirements, similar to what @tedsuo seems to be describing here.

Beyond those concerns, this requirement also seems excessively vague. What qualifies as "tools and/or documentation"? Is a link to https://go.dev/dl/ sufficient? This probably requires a definition similar to "Corresponding Source" from AGPL-3, which again reinforces the limitations that come from not having this be part of the license under which the collector source code is made available.

Member commented:

The spirit here is to allow users to reuse their components when they move from one distribution to another: the engineering investments made should not be lost. If the distribution is open source and there's clear documentation how to add a new component to it, that's good enough for me. If the distribution is not open source but allows me to enter the Go module name on a web interface somewhere and get a binary out, that's also fine.

I'd see that binary as "tainted" (to use the kernel terminology) and the final binary might not be officially supported (with SLAs) by a service provider, but as an end-user, I'm not locked in.

Member commented:

This is a specification, so I don't think it's appropriate to leave ambiguity here and rely on interpretation of the "spirit" of the requirement. This was changed from SHOULD to MUST in response to a comment seeking clarification that we intended to require distributions to allow users to modify their source and build new, modified, binaries.

> If the distribution is not open source but allows me to enter the Go module name on a web interface somewhere and get a binary out, that's also fine.

I would not expect, and do not think it reasonable to expect, that any vendor offering a closed-source, binary-only distribution will allow users to provide arbitrary code to be built into a new "tainted" binary by that vendor. Doing so would allow for a user to cause a vendor to distribute binaries built from code licensed under terms the vendor has no opportunity to review and which may require, for instance, that any code it is compiled with be distributed under the same terms.

Member commented:

> This is a specification, so I don't think it's appropriate to leave ambiguity here and rely on interpretation of the "spirit" of the requirement

I agree, my comment was more to provide the background, hoping that it would trigger ideas for a new wording.

Member commented:

This is defining what OpenTelemetry considers a distribution. To be considered a distribution by the project, I'd think there is free rein over restrictions. That is different from a trademark, which would mean the project could actually stop someone else from saying, "This is an Otel Collector Distribution". So this wouldn't offer that protection, but would instead define for others what the project itself will consider and call a distribution.

I'd still support rephrasing this to not require docs/tooling if the distribution works with Otel docs and tooling. That may mean "requiring" documentation of all components within a distribution (otherwise, how else would a user define an equivalent ocb configuration?).

binary from updating a Distribution to include new components
is a different Distribution.