Skip to content

Conversation

@sashwinbalaji
Copy link
Member

@sashwinbalaji sashwinbalaji commented Nov 19, 2025

  • Introduce tracebox ctl for explicit daemon lifecycle management
  • Default behavior now required daemons to be running
  • --autodaemonize flag preserves old auto-spawn behavior
  • --system-sockets deprecated with a warning printed on usage

Fixes: b/458318005
Fixes: #3540

@sashwinbalaji sashwinbalaji requested a review from a team as a code owner November 19, 2025 00:41
Copy link
Member

@LalitMaganti LalitMaganti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Biggest request: please ensure you treat this CL with the gravity and importance it deserves. We try very hard in Perfetto to not make big breaking changes and when we do, it's really important that we get ahead of questions and make sure people proactively are todl exactly what they need to do. For example, when I broken people with the track migration last year, I sent out https://perfetto.dev/docs/analysis/perfetto-sql-backcompat

For example I think people should be told if they are doing tracebox by default, they should really try to switch to using ctl or if they just want to unblock as fast as possible, to use --autodaemonize for old beheaviour.

I also think this might require us to change some of our docs pages: please do an audit and make necessary changes.

@sashwinbalaji sashwinbalaji force-pushed the dev/sashwinbalaji/tracebox_applet branch from 1ffd9b3 to daf629a Compare November 19, 2025 01:19
@sashwinbalaji
Copy link
Member Author

sashwinbalaji commented Nov 19, 2025

Thanks for the review:

  1. for communication plan was to send out mail(+announcement) will look into https://perfetto.dev/docs/analysis/perfetto-sql-backcompat
  2. regarding doc it's in pipeline as a separate cl as there are multiple files using it: https://source.chromium.org/search?q=tracebox%20&sq=&ss=chromium%2Fchromium%2Fsrc:third_party%2Fperfetto%2Fdocs%2F
  3. Switching to draft until windows mode is aligned, tracebiox.md rereviewed and try out --background option instead of reinventing the wheel

@sashwinbalaji sashwinbalaji marked this pull request as draft November 19, 2025 01:21
@sashwinbalaji sashwinbalaji force-pushed the dev/sashwinbalaji/tracebox_applet branch 5 times, most recently from cdcd105 to 455bcb1 Compare November 20, 2025 16:56
@sashwinbalaji sashwinbalaji force-pushed the dev/sashwinbalaji/tracebox_applet branch from 4d449d9 to 1ea1253 Compare November 21, 2025 10:49
@github-actions
Copy link

github-actions bot commented Nov 21, 2025

🎨 Perfetto UI Build

✅ UI build is ready: https://storage.googleapis.com/perfetto-ci-artifacts/gh-19646592328-ui/ui/index.html

@sashwinbalaji sashwinbalaji force-pushed the dev/sashwinbalaji/tracebox_applet branch 5 times, most recently from fafffcf to ee510ae Compare November 21, 2025 19:25
@sashwinbalaji sashwinbalaji marked this pull request as ready for review November 21, 2025 19:26
@LalitMaganti
Copy link
Member

LalitMaganti commented Nov 21, 2025

Overall lgtm but I don't quite understand the systemd changes. Can we maybe pull that out into a separate cl? The rest looks really good!

- Introdcue `tracebox ct` for explicit daemon lifecycle management
- Default behavior now required daemons to be running (linux/macos)
- `--autodaemonize` flag preserves old auto-spawn behavior
- `--system-sockets` deprecated with a warning printed on usage
- Windows maintains backward compatibility by auto-spawning with system
  sockets
- Systemd checks to nudge user to use them

Fixes: b/458318005
Fixes: #3540
@sashwinbalaji sashwinbalaji force-pushed the dev/sashwinbalaji/tracebox_applet branch from ee510ae to 9ba1b54 Compare November 24, 2025 19:23
@sashwinbalaji sashwinbalaji force-pushed the dev/sashwinbalaji/tracebox_applet branch from 9ba1b54 to d3730a8 Compare November 24, 2025 19:25
Copy link
Member

@primiano primiano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work. In hindsight, no this was not a fixit item :D live and learn

BUt we are here now, so let's go for this.

I have a bunch of comments.
Also as a meta comment: do we need to support windows? can we just error out on OS_WIN for now?
I doubt anybody would ever use tracebox on windows. what for? traced_probes/perf don't exist. the SDK cross-process is "weird" (it uses TCP, nobody will want that in production) . Windows cross-process support exists really because I didn't want to #ifdef out code all over the places, and was easier to add support over a TCP socket.
but it's really a 3rd class citizen

> tracebox -t 10s -o trace.pftrace sched
> tracebox ctl stop
MODE 2: Autodaemonize mode
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd call this

MODE 2: --autodaemonize mode (legacy behavior, discouraged)

Comment on lines +254 to +255
memmove(static_cast<void*>(&argv[i]), static_cast<void*>(&argv[i + 1]),
sizeof(char*) * static_cast<size_t>(argc - i - 1));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can do just:

std::move(argv + i + 1, argv + argc, argv + i);

if (autodaemonize) {
if (use_system_sockets) {
PERFETTO_ELOG(
"Warning: --system-sockets with --autodaemonize is supported but "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd say "supported for legacy reasons"

}

if (use_system_sockets) {
PERFETTO_FATAL(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fatal seems a bit aggressive given that this is just the new default?
I would

  1. Move this message in the for loop above. there i sno need to set a variable, just to emit a lot here:
  2. SOften this into a PERFETTO_ELOG ?

and tearing down the services for you.
cmdline client in one binary. It can be used in two modes:
MODE 1: Daemon mode (Recommended)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a (Default behavior, reccomended)

if (pid && kill(pid, SIGTERM) == 0) {
daemons_stopped++;
printf("%s terminated (PID: %d)\n", daemon, pid);
if (remove(pid_path.c_str()) != 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't the remove be done by the process?
Otherwise you can you tell when a process has actually ended?
There is an expectation that "ctl stop" waits for the termination, so you can call start > stop > start and it should work.

I think the way this could work is: the daemon removes the pid when it quits, and here you poll for the deletion of the pid

ServiceSockets sockets = GetRunningSockets();
if (sockets.IsValid()) {
printf("Status: Daemons are already running with %s\n",
sockets.ToString().c_str());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add something more helpful like ". Maybe the daemons have been started via systemd/android init or other means?"

constexpr uint32_t kDaemonStartWaitUs = 1000 * 1000; // 1s
ServiceSockets sockets = GetRunningSockets();
if (sockets.IsValid()) {
printf("Status: Daemons are already running with %s\n",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/with/on/

printf("%s started (PID: %d)\n", daemon, pid);
}

base::SleepMicroseconds(kDaemonStartWaitUs);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imho this needs to be a poll of some sort. 1 s is too low on an emulator, and if you go higher, it's annoying to wait that long.
Poll every 250ms for at most 5s?

}

int CtlStatus() {
ServiceSockets sockets = GetRunningSockets();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be a bit more pedantic here. I'd check both for (1) GetRunningSockets(); (2) the existance of pid files (at least just for traced)

If 1 && 2: you are happy
if 1 && not(2) you say "The daemon seems running but the pid file doesn't exist. looks like something else started the daemon (systemd? initrc?). Use lsof ... (figure out the command) for details"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants