implement server spool to disk & replay #203
Comments
I probably should remove CliEvent entirely but that's a LOT of work so not now. This change makes the raw span & events available downstream without breaking things for now, to make the spool idea in #203 easier. Also use a more descriptive name for the tracepb import.
My use case would be covered by otel-cli writing to files in a directory if it can't deliver to the Collector as configured, and then the Collector reading from those files. I'd have my own format in mind (long story), so I'd want to bundle the CLI and Collector with matched exporters and receivers. To get events in rough order I'd name the files such that their lexical order roughly matches the time order. A KSUID would do, or zero-padded microseconds since epoch followed by a delimiter and the writer's PID to avoid collisions.
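A minimal sketch of that time-ordered naming scheme, assuming Go (otel-cli's language); the spoolFileName helper and the .pb extension are made up for illustration:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// spoolFileName (hypothetical helper) builds a file name whose lexical order
// roughly matches creation time: zero-padded microseconds since the epoch,
// a delimiter, then the writer's PID to avoid collisions between processes.
func spoolFileName(dir string) string {
	name := fmt.Sprintf("%020d-%d.pb", time.Now().UnixMicro(), os.Getpid())
	return filepath.Join(dir, name)
}

func main() {
	fmt.Println(spoolFileName("/tmp/otel")) // e.g. /tmp/otel/00001690760000000000-4242.pb
}
```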
Write-to-file on failure would be a later feature, I think. First would be adding the spooler & replayer, which solve some of those problems and some I've dealt with IRL. Later I might add alternate endpoints for failover, though I'm reluctant to because otel-cli explicitly relies on the collector for most fault management, and doing so removes a lot of complexity from the tool. The disk format I have in mind is files named by trace & span ID.
Why does ordering matter? As I understand things, I can send unordered spans to most observability tools and they'll use the internal timestamps to do the ordering. The collector doesn't care, and neither does Honeycomb. Ordering is really hard to get just right, so I'd rather avoid it altogether if I can. Using trace & span ID is simple and light.
I'll probably build this even if it's not exactly what you need right now, because it seems useful. I still intend to move away from using the collector's exporters because they have too many side effects (e.g. reading environment variables I can't block) and features otel-cli doesn't need.
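And for contrast, a sketch of what ID-based naming could look like; spoolPathForSpan, the hex encoding, and the .pb extension are illustrative assumptions, not a committed format:

```go
package main

import (
	"encoding/hex"
	"fmt"
	"path/filepath"
)

// spoolPathForSpan (hypothetical helper) names a spool file by trace and span ID,
// hex-encoded, so no ordering scheme is needed and collisions are effectively impossible.
func spoolPathForSpan(dir string, traceID, spanID []byte) string {
	name := fmt.Sprintf("%s-%s.pb", hex.EncodeToString(traceID), hex.EncodeToString(spanID))
	return filepath.Join(dir, name)
}

func main() {
	fmt.Println(spoolPathForSpan("/tmp/otel",
		[]byte{0xde, 0xad, 0xbe, 0xef}, []byte{0xca, 0xfe})) // /tmp/otel/deadbeef-cafe.pb
}
```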
Back when I was doing tracing work on Tinkerbell, a hard problem was networking during OS installation across root pivots, chroots, containers, and so on. I think the spool idea really shines here. cc @nshalman
So what if:
# in early boot... and as bind or volume mounts into containers & chroots...
export otel_spool_path="/tmp/otel"
export OTEL_EXPORTER_OTLP_ENDPOINT="file://${otel_spool_path}"
mkdir -p "${otel_spool_path}"
# you could use a tmpfs esp if crossing user boundaries
mount -t tmpfs -o size=100M,mode=1777 none "${otel_spool_path}"
# bind mounts work great across chroots like the example below
mount -o bind "${otel_spool_path}" "/mnt${otel_spool_path}"
# docker volumes make it so you can trace containers with no networking
docker run -ti --volume "${otel_spool_path}:${otel_spool_path}" \
--env OTEL_EXPORTER_OTLP_ENDPOINT="${OTEL_EXPORTER_OTLP_ENDPOINT}" \
image:tag \
otel-cli exec --name "look at me, I'm a container!" \
/installer.sh
# anywhere in the code, without worrying one bit about networking:
otel-cli exec --name "chroot into mountpoint" \
/bin/env OTEL_EXPORTER_OTLP_ENDPOINT="${OTEL_EXPORTER_OTLP_ENDPOINT}" \
/sbin/chroot /mnt \
/bin/env OTEL_EXPORTER_OTLP_ENDPOINT="${OTEL_EXPORTER_OTLP_ENDPOINT}" \
/bin/otel-cli exec \
--endpoint "${OTEL_EXPORTER_OTLP_ENDPOINT}" \
--name "run the installer inside the chroot" \
/installer.sh install

This creates a bunch of protobuf files. So later, when networking is on, you can do:

# read all the files on disk, send them to the upstream OTLP endpoint, delete the file after
otel-cli replay \
--delete \
--spool-path "${otel_spool_path}" \
--endpoint grpc://otel-collector.mydomain.com

note: env endpoint duplication can go away soon when I finish some other work on envvars and exporters
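For what it's worth, here's a rough sketch of what a replay loop like that might do internally, in Go, assuming the spool holds one protobuf-encoded TracesData per .pb file (that framing is an assumption, not a decided format); sendTraces is a hypothetical stand-in for whatever exporter otel-cli ends up using:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"

	tracepb "go.opentelemetry.io/proto/otlp/trace/v1"
	"google.golang.org/protobuf/proto"
)

// sendTraces is a hypothetical stand-in for the real OTLP export step.
func sendTraces(td *tracepb.TracesData) error {
	fmt.Printf("would send %d ResourceSpans\n", len(td.ResourceSpans))
	return nil
}

// replaySpool reads every spooled protobuf file, sends it upstream, and
// deletes it only after a successful send (the --delete behavior).
func replaySpool(spoolPath string) error {
	files, err := filepath.Glob(filepath.Join(spoolPath, "*.pb"))
	if err != nil {
		return err
	}
	for _, file := range files {
		raw, err := os.ReadFile(file)
		if err != nil {
			return err
		}
		td := &tracepb.TracesData{}
		if err := proto.Unmarshal(raw, td); err != nil {
			return fmt.Errorf("%s: %w", file, err)
		}
		if err := sendTraces(td); err != nil {
			return err // leave the file in place so a later replay can retry
		}
		if err := os.Remove(file); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	if err := replaySpool("/tmp/otel"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```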
While working on #205 I noticed that the specs have a section on exporting to files: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/protocol/file-exporter.md
Along with the file exporter is the file receiver: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/filereceiver. I haven't tried it myself, but my understanding is this would allow exporting to a file, then later reading it with the collector and exporting to wherever the collector supports.
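If the spool followed that spec instead of raw protobuf files, reading it back would look roughly like this sketch; it assumes the JSON Lines encoding described in the spec (one TracesData object per line), and the readOTLPJSONFile helper and file name are made up for illustration:

```go
package main

import (
	"bufio"
	"fmt"
	"os"

	tracepb "go.opentelemetry.io/proto/otlp/trace/v1"
	"google.golang.org/protobuf/encoding/protojson"
)

// readOTLPJSONFile parses a file with one OTLP JSON object per line,
// as described in the file exporter spec linked above.
func readOTLPJSONFile(path string) ([]*tracepb.TracesData, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	var out []*tracepb.TracesData
	scanner := bufio.NewScanner(f)
	// note: very long lines may need scanner.Buffer() with a larger maximum
	for scanner.Scan() {
		td := &tracepb.TracesData{}
		if err := protojson.Unmarshal(scanner.Bytes(), td); err != nil {
			return nil, err
		}
		out = append(out, td)
	}
	return out, scanner.Err()
}

func main() {
	tds, err := readOTLPJSONFile("traces.jsonl") // file name is illustrative
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("read %d TracesData records\n", len(tds))
}
```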
Catching up late… the dev file exporter and receiver could be convenient for this use case of otel-cli queueing its own records until the Collector is [back] up. We'd benefit from adding wildcard matching to the receiver so the Collector could pick up its own records. |
🤔 I suppose otel-cli needs otel collector interop tests at some point. If handing off files to the collector is useful, that's enough motivation to do the work... though we could also test this effectively by importing the collector backend, with no YAML, extra processes, or containers to manage. That sounds more maintainable. otel-cli uses less of it now but still follows upstream opentelemetry-go and bits of the collector, so it's not a new dependency.
Either way I'm sold on implementing the standard. I'll look at it more when I'm back at my computer.
Please suggest CLI syntax. What example would we add to the README.md?
Via some discussion in #183: It might be useful to have otel-cli be able to write spans to disk and then replay them. This can be handy in scenarios where the network or collector aren't available when the traces are generated, but will be later.
It also creates a path for having containers with no networking write to a bind-mounted path, with an external otel-cli then picking the files up and sending them along.
Seems like plain flock() should be sufficient for locking. I think I would only support protobuf, to avoid type erasure. I started looking at how much work it is, and it's not too bad, and not super invasive. I did the first bits to expose the raw protobuf spans and events through CliEvent here: https://github.com/equinix-labs/otel-cli/tree/spool-spans-to-disk
@garthk did I understand your use case correctly?
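To make the flock() idea concrete, a minimal Go sketch of the write side; writeSpoolFile, the file path, and the choice of TracesData as the on-disk message are assumptions for illustration only:

```go
package main

import (
	"fmt"
	"os"
	"syscall"

	tracepb "go.opentelemetry.io/proto/otlp/trace/v1"
	"google.golang.org/protobuf/proto"
)

// writeSpoolFile (hypothetical helper) writes one TracesData message to path,
// holding an exclusive flock() for the duration so a concurrent replayer
// never reads a half-written file.
func writeSpoolFile(path string, td *tracepb.TracesData) error {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()

	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX); err != nil {
		return err
	}
	defer syscall.Flock(int(f.Fd()), syscall.LOCK_UN)

	raw, err := proto.Marshal(td)
	if err != nil {
		return err
	}
	_, err = f.Write(raw)
	return err
}

func main() {
	td := &tracepb.TracesData{} // an empty message, just to exercise the helper
	if err := writeSpoolFile("/tmp/otel/example.pb", td); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```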