Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bazelisk run doesn't propagate signals to the child process #512

Open
jschaf opened this issue Oct 27, 2023 · 0 comments
Open

Bazelisk run doesn't propagate signals to the child process #512

jschaf opened this issue Oct 27, 2023 · 0 comments
Labels
P2 We'll consider working on this in future. (Assignee optional) type:bug Something isn't working

Comments

@jschaf
Copy link

jschaf commented Oct 27, 2023

Reproduction

The situation to reproduce is a bit complex:

  1. I have Go binary at //erp/dev.
  2. //erp/dev runs other processes via bazelisk run, notably a Go-wrapper around Postgres.

The resulting process tree:

bazelisk run //erp/dev 
  # Starts a Go binary
  # The Go binary in turn calls the Go-wrapper for Postgres
  bazelisk run //db/go_postgres
    # go_postgres is a Go binary that calls the Postgres binary
    postgres

The problem is that if //erp/dev sends SIGINT directly to the //db/go_postgres using the Go stdlib exec.Cmd, bazelisk run swallows the signal.

However, if I type ctrl-c at the terminal, Postgres receives the signal. I think this is because the shell wires up the process attributes differently from Go (maybe https://stackoverflow.com/a/35435038/30900?)

Context

As for why bazelisk run has this behavior:

Previously, when running bazelisk run under a shell, typing ctrl-c would send SIGINT twice to the child process. I think that's the same behavior I saw--processes started with Go propagate signals differently from processes started by the shell.

#307 noted the issue and recommended ignoring SIGINT. The resulting PR #322 fixed the issue by ignoring SIGINT but has a few related problems

The relevant bazelisk code:

// Only forward SIGTERM to our child process.

func runBazel(/*omitted*/) (int, error) {
	cmd := makeBazelCmd()
	cmd.Start()

	c := make(chan os.Signal)
	signal.Notify(c, os.Interrupt, syscall.SIGTERM)
	go func() {
		s := <-c

		// Only forward SIGTERM to our child process.
		if s != os.Interrupt {
			cmd.Process.Kill()
		}
	}()

The two related problems:

  1. The channel isn't buffered, meaning the signal delivery isn't guaranteed.
    From https://pkg.go.dev/os/signal#Notify:

    Package signal will not block sending to c: the caller must ensure that c
    has sufficient buffer space to keep up with the expected signal rate. For a
    channel used for notification of just one signal value, a buffer of size 1
    is sufficient.

    See also: https://stackoverflow.com/a/68593935/30900

  2. The code calls cmd.Process.Kill(), which sends SIGKILL; it doesn't forward
    the caught signal.

Design sketch

For my use case, I'd like to be able to forward whatever signal I want when starting processes via bazelisk run from Go. This is particularly important for Postgres, which has robust signal handling:

  • SIGTERM: smart shutdown
  • SIGINT: fast shutdown
  • SIGQUIT: immediate shutdown
  • SIGHUP: reload config

As a path forward (not yet validated) that avoids the double signaling:

  1. bazelisk run starts processes so that automatic signal propagation is disabled:
cmd.SysProcAttr = &syscall.SysProcAttr{
    Setpgid: true,
    Pgid:    0,
}
  1. bazelisk run forwards all signals to the child process. We avoid the double SIGINT via step 1.
@fweikert fweikert added the type:bug Something isn't working label Jan 19, 2024
@fweikert fweikert added the P2 We'll consider working on this in future. (Assignee optional) label Feb 6, 2024
tjgq added a commit to tjgq/bazelisk that referenced this issue Nov 28, 2024
SIGQUIT is specially recognized and causes a Java thread dump to be printed
without aborting the invocation. This behavior should be preserved when running
Bazel through Bazelisk.

Note that this also suppresses printing a Go stack dump on SIGQUIT. This is
intentional, because the Go stack dump is confusing: users tend to report it
instead of the much more valuable Java thread dump we asked them for.

Partially addresses bazelbuild#512. I'm not addressing it fully at this time because I'm
primarily concerned about signal handling by the Bazel client, not by an
arbitrary process executed by `bazel run`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P2 We'll consider working on this in future. (Assignee optional) type:bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants