-
Notifications
You must be signed in to change notification settings - Fork 631
tracebox: Explicit daemon management #3726
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
LalitMaganti
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Biggest request: please ensure you treat this CL with the gravity and importance it deserves. We try very hard in Perfetto to not make big breaking changes and when we do, it's really important that we get ahead of questions and make sure people proactively are todl exactly what they need to do. For example, when I broken people with the track migration last year, I sent out https://perfetto.dev/docs/analysis/perfetto-sql-backcompat
For example I think people should be told if they are doing tracebox by default, they should really try to switch to using ctl or if they just want to unblock as fast as possible, to use --autodaemonize for old beheaviour.
I also think this might require us to change some of our docs pages: please do an audit and make necessary changes.
include/perfetto/base/build_configs/bazel/perfetto_build_flags.h
Outdated
Show resolved
Hide resolved
1ffd9b3 to
daf629a
Compare
|
Thanks for the review:
|
cdcd105 to
455bcb1
Compare
4d449d9 to
1ea1253
Compare
🎨 Perfetto UI Build✅ UI build is ready: https://storage.googleapis.com/perfetto-ci-artifacts/gh-19646592328-ui/ui/index.html |
fafffcf to
ee510ae
Compare
|
Overall lgtm but I don't quite understand the systemd changes. Can we maybe pull that out into a separate cl? The rest looks really good! |
- Introdcue `tracebox ct` for explicit daemon lifecycle management - Default behavior now required daemons to be running (linux/macos) - `--autodaemonize` flag preserves old auto-spawn behavior - `--system-sockets` deprecated with a warning printed on usage - Windows maintains backward compatibility by auto-spawning with system sockets - Systemd checks to nudge user to use them Fixes: b/458318005 Fixes: #3540
ee510ae to
9ba1b54
Compare
9ba1b54 to
d3730a8
Compare
primiano
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome work. In hindsight, no this was not a fixit item :D live and learn
BUt we are here now, so let's go for this.
I have a bunch of comments.
Also as a meta comment: do we need to support windows? can we just error out on OS_WIN for now?
I doubt anybody would ever use tracebox on windows. what for? traced_probes/perf don't exist. the SDK cross-process is "weird" (it uses TCP, nobody will want that in production) . Windows cross-process support exists really because I didn't want to #ifdef out code all over the places, and was easier to add support over a TCP socket.
but it's really a 3rd class citizen
| > tracebox -t 10s -o trace.pftrace sched | ||
| > tracebox ctl stop | ||
| MODE 2: Autodaemonize mode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd call this
MODE 2: --autodaemonize mode (legacy behavior, discouraged)
| memmove(static_cast<void*>(&argv[i]), static_cast<void*>(&argv[i + 1]), | ||
| sizeof(char*) * static_cast<size_t>(argc - i - 1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can do just:
std::move(argv + i + 1, argv + argc, argv + i);
| if (autodaemonize) { | ||
| if (use_system_sockets) { | ||
| PERFETTO_ELOG( | ||
| "Warning: --system-sockets with --autodaemonize is supported but " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'd say "supported for legacy reasons"
| } | ||
|
|
||
| if (use_system_sockets) { | ||
| PERFETTO_FATAL( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fatal seems a bit aggressive given that this is just the new default?
I would
- Move this message in the for loop above. there i sno need to set a variable, just to emit a lot here:
- SOften this into a PERFETTO_ELOG ?
| and tearing down the services for you. | ||
| cmdline client in one binary. It can be used in two modes: | ||
| MODE 1: Daemon mode (Recommended) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a (Default behavior, reccomended)
| if (pid && kill(pid, SIGTERM) == 0) { | ||
| daemons_stopped++; | ||
| printf("%s terminated (PID: %d)\n", daemon, pid); | ||
| if (remove(pid_path.c_str()) != 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't the remove be done by the process?
Otherwise you can you tell when a process has actually ended?
There is an expectation that "ctl stop" waits for the termination, so you can call start > stop > start and it should work.
I think the way this could work is: the daemon removes the pid when it quits, and here you poll for the deletion of the pid
| ServiceSockets sockets = GetRunningSockets(); | ||
| if (sockets.IsValid()) { | ||
| printf("Status: Daemons are already running with %s\n", | ||
| sockets.ToString().c_str()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd add something more helpful like ". Maybe the daemons have been started via systemd/android init or other means?"
| constexpr uint32_t kDaemonStartWaitUs = 1000 * 1000; // 1s | ||
| ServiceSockets sockets = GetRunningSockets(); | ||
| if (sockets.IsValid()) { | ||
| printf("Status: Daemons are already running with %s\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/with/on/
| printf("%s started (PID: %d)\n", daemon, pid); | ||
| } | ||
|
|
||
| base::SleepMicroseconds(kDaemonStartWaitUs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
imho this needs to be a poll of some sort. 1 s is too low on an emulator, and if you go higher, it's annoying to wait that long.
Poll every 250ms for at most 5s?
| } | ||
|
|
||
| int CtlStatus() { | ||
| ServiceSockets sockets = GetRunningSockets(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be a bit more pedantic here. I'd check both for (1) GetRunningSockets(); (2) the existance of pid files (at least just for traced)
If 1 && 2: you are happy
if 1 && not(2) you say "The daemon seems running but the pid file doesn't exist. looks like something else started the daemon (systemd? initrc?). Use lsof ... (figure out the command) for details"
tracebox ctlfor explicit daemon lifecycle management--autodaemonizeflag preserves old auto-spawn behavior--system-socketsdeprecated with a warning printed on usageFixes: b/458318005
Fixes: #3540