fix: optimization of Serve in server/server.go#3099
fix: optimization of Serve in server/server.go#3099y138g wants to merge 6 commits intoapache:developfrom
Conversation
|
nice! |
|
现在有很多携程会监听shutdown信号,在shutdown的时候都会输出log,能不能把这些输出的log 文字统一一下 |
|
all right! |
There was a problem hiding this comment.
Pull request overview
This pull request addresses multiple improvements across the codebase, with the main focus on fixing the Serve() method in server/server.go to support graceful shutdown, along with code quality improvements in the hessian generator tool and test robustness enhancements.
- Replaces the blocking
select{}statement with signal-based blocking to enable graceful shutdown - Optimizes regexp compilation in the hessian generator by pre-compiling patterns
- Improves test robustness by tolerating EOF errors in concurrent stream closing scenarios
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| server/server.go | Implements signal handling to replace indefinite blocking, though conflicts with existing graceful_shutdown infrastructure |
| tools/dubbogo-cli/generator/sample/hessian/generator.go | Translates Chinese comments to English, optimizes regex compilation (with one inconsistency), and fixes file truncation bug |
| tools/dubbogo-cli/generator/sample/hessian/constant.go | Improves regex patterns by properly escaping metacharacters |
| protocol/triple/triple_protocol/triple_ext_test.go | Enhances test reliability by handling race conditions when server closes streams concurrently |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
server/server.go
Outdated
| // Listen for interrupt signals to enable graceful shutdown | ||
| // This replaces the previous select{} which blocked indefinitely and couldn't respond to signals | ||
| sigChan := make(chan os.Signal, 1) | ||
| signal.Notify(sigChan, os.Interrupt, syscall.SIGTERM, syscall.SIGINT) | ||
|
|
||
| // Wait for a signal to shutdown gracefully | ||
| sig := <-sigChan | ||
| logger.Infof("Received signal: %v, server is shutting down gracefully", sig) |
There was a problem hiding this comment.
The graceful shutdown implementation here conflicts with the existing graceful_shutdown package. The graceful_shutdown.Init() function (called at line 109 in server/options.go) already sets up signal handling when InternalSignal is configured. This creates duplicate signal handlers and doesn't properly invoke the existing graceful shutdown logic (e.g., beforeShutdown() which destroys registries, waits for active requests, destroys protocols, and runs custom callbacks).
Instead of implementing signal handling here, you should either:
- Remove this signal handling and rely on the existing
graceful_shutdownpackage's built-in signal handling, or - If manual shutdown is needed, call
graceful_shutdown.beforeShutdown()or similar cleanup functions before returning.
| // Listen for interrupt signals to enable graceful shutdown | |
| // This replaces the previous select{} which blocked indefinitely and couldn't respond to signals | |
| sigChan := make(chan os.Signal, 1) | |
| signal.Notify(sigChan, os.Interrupt, syscall.SIGTERM, syscall.SIGINT) | |
| // Wait for a signal to shutdown gracefully | |
| sig := <-sigChan | |
| logger.Infof("Received signal: %v, server is shutting down gracefully", sig) | |
| // Signal handling and graceful shutdown are managed by the graceful_shutdown package. |
server/server.go
Outdated
| } | ||
| select {} | ||
| // Listen for interrupt signals to enable graceful shutdown | ||
| // This replaces the previous select{} which blocked indefinitely and couldn't respond to signals |
There was a problem hiding this comment.
The comment states that select{} "couldn't respond to signals," but this is misleading. In Go, select{} blocks indefinitely but can still be interrupted by signal handlers registered with signal.Notify(). The actual issue with select{} is that it blocks without any way to gracefully shut down or clean up resources when the process receives a signal.
| // This replaces the previous select{} which blocked indefinitely and couldn't respond to signals | |
| // This replaces the previous select{} which blocked indefinitely and did not provide a way to gracefully shut down or clean up resources when the process received a signal |
… format and wording (apache#3042)
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #3099 +/- ##
===========================================
+ Coverage 46.76% 48.03% +1.27%
===========================================
Files 295 460 +165
Lines 17172 33190 +16018
===========================================
+ Hits 8031 15944 +7913
- Misses 8287 15956 +7669
- Partials 854 1290 +436 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
pls fix copilot comments |
- rename the private `beforeShutdown` function to the public `BeforeShutdown` function to resolve the package conflict. - supports two modes: Automatic (default) and Manual signal processing.
|
| // If InternalSignal is false, we need to handle signals ourselves and call cleanup. | ||
| if s.cfg.Shutdown != nil && s.cfg.Shutdown.InternalSignal != nil && *s.cfg.Shutdown.InternalSignal { | ||
| // graceful_shutdown package is handling signals, just block until shutdown | ||
| select {} |
There was a problem hiding this comment.
其实这是这里面最主要的改动,就是别用 select{} 这种代码,不要保留它
There was a problem hiding this comment.
gemini 改进建议:在 Go 语言中,select {} 会永久阻塞当前 goroutine,且不消耗 CPU。虽然它能达到“防止主进程退出”的目的,但它是不可控的:你无法优雅地关闭它,除非直接杀死进程。
根据代码上下文和 AlexStocks 的建议,改进的核心目标是:从“死等”改为“监听信号并优雅退出”。
以下是几种更优雅的改进方案:
1. 推荐方案:使用信号监听(如你代码中 348 行后的逻辑)
最优雅的方式是捕获操作系统的退出信号(如 SIGINT, SIGTERM),这样当用户按下 Ctrl+C 或 K8s 停止容器时,程序能执行清理逻辑。
// 创建一个信号接收通道
sigChan := make(chan os.Signal, 1)
// 监听中断和终止信号
signal.Notify(sigChan, syscall.SIGINT, syscall.SIGTERM)
// 阻塞在这里,直到收到信号
sig := <-sigChan
logger.Info("接收到退出信号,准备优雅关闭...", sig)
// 在这里执行具体的清理逻辑,例如:
// s.Stop()
// db.Close()2. 使用 Context 控制(适合组件级代码)
如果你的 Serve 方法被设计为可以从外部关闭,使用 context.Context 是 Go 标准库最推荐的做法。
func (s *Server) Serve(ctx context.Context) error {
// ... 前面的启动逻辑 ...
// 阻塞直到 ctx 被 cancel
<-ctx.Done()
logger.Info("Context 已取消,正在停止服务")
return s.Stop() // 假设你有一个 Stop 方法
}3. 等待 WaitGroup(适合多协程常驻服务)
如果你启动了多个后台任务,可以使用 sync.WaitGroup 等待它们全部完成。
var wg sync.WaitGroup
wg.Add(1)
go func() {
defer wg.Done()
s.startBackgroundJob()
}()
// 代替 select{}
wg.Wait() 为什么 select {} 不够优雅?
- 无法回收资源:由于它永远阻塞,后面的代码(如关闭数据库连接、清理临时文件)永远不会被执行。
- 硬终止风险:在容器化环境(如 Docker/K8s)中,如果程序不响应
SIGTERM信号,系统会在等待一段时间后强制发送SIGKILL,这可能导致数据写入一半出现损坏。 - 调试困难:在堆栈打印中看到
select {}往往很难判断程序是正常运行中还是死锁了。
总结
你截图中 348行到353行 增加的代码正是最标准的改进方式:
- Replace: 将 R332 的
select {}替换为。 - Logic: 使用
signal.Notify配合<-sigChan。 - Outcome: 这样程序就从“死循环”变成了“待命状态”,并且具备了被管理的能力。
建议: 如果你的项目中有统一的任务管理框架,优先看是否有全局的 StopChannel 或 Context 可以直接使用。
There was a problem hiding this comment.
chatgpt:根据注释,select{} 用于无限期阻塞,等待中断信号。不过,更优雅的做法是确保仅在必要时才使用 select{},并且要妥善处理优雅关闭的逻辑。
注释中提到,如果 graceful_shutdown 包已经在内部处理信号,我们应该检查并确保仅在适当的情况下阻塞关闭。具体来说,可以将 select{} 替换为更加有控制的逻辑。
你可以按照以下方式重构代码:
// 检查 graceful_shutdown 包是否在内部处理信号
if !gracefulShutdownPkg.Init() {
// 如果 InternalSignal 为 false,表示我们需要自己处理信号并调用清理函数。
// 只在必要时阻塞,例如在清理或等待中断信号时。
sigChan := make(chan os.Signal, 1)
signal.Notify(sigChan, syscall.SIGINT, syscall.SIGTERM)
// 阻塞,直到接收到信号。
sig := <-sigChan
logger.Info("Received signal: ", sig)
// 在接收到信号后,进行优雅关闭或清理
gracefulShutdownPkg.InternalSignal()
} else {
// 如果 graceful_shutdown 包已经在处理信号,我们就让它继续处理关闭
select {} // 阻塞,允许优雅关闭继续进行
}主要变化:
- 检查
graceful_shutdown.Init(): 确认该包是否已经处理了信号。 - 手动处理信号: 如果包没有处理信号,则我们手动设置信号监听器(
sigChan),并仅在必要时阻塞(等待信号)。 - 适当使用
select{}: 如果graceful_shutdown包已经在处理信号,我们才使用select{}阻塞,以允许包完成关闭过程。
这种方式可以避免不必要的阻塞,更好地控制关闭和清理的流程。
AlexStocks
left a comment
There was a problem hiding this comment.
改动方向正确:从 select{} 死等改为可控 shutdown 流程,并且处理了 InternalSignal/manual 两种模式。
建议补一条策略说明:graceful_shutdown/shutdown.go 里对 panic 的 recover 目前比较宽,测试环境友好,但也可能掩盖真实生产问题。建议按环境区分或增加更强告警信号。
| // If InternalSignal is true (default), graceful_shutdown.Init() already set up signal handling | ||
| // and will call os.Exit(0) after cleanup, so we just block here. | ||
| // If InternalSignal is false, we need to handle signals ourselves and call cleanup. | ||
| if s.cfg.Shutdown != nil && s.cfg.Shutdown.InternalSignal != nil && *s.cfg.Shutdown.InternalSignal { |
There was a problem hiding this comment.
InternalSignal 默认值是 true(global/shutdown_config.go 中 default:"true"),defaultServerOptions() 通过 DefaultShutdownConfig() 初始化,所以默认路径直接走 select{},和原来没区别。
真正有变化的路径(InternalSignal=false)才走到下面的信号注册逻辑,但这不是默认场景,并没有解决 issue #3042 的挂死问题。
| graceful_shutdown.BeforeShutdown(s.cfg.Shutdown) | ||
| } | ||
|
|
||
| return nil |
There was a problem hiding this comment.
graceful_shutdown/shutdown.go 原实现在 beforeShutdown 完成后调用 os.Exit(0)。这里改成 return nil,把控制权交回调用方,但 protocols 已经被销毁,进程还在运行,后续请求会 panic 或行为不可预期。
另外,对 SIGQUIT 等需要 dump heap 的信号,shutdown.go:L98-102 有专门处理,这里完全跳过了。
建议:BeforeShutdown 完成后调用 os.Exit(0),并对 dump 信号特殊处理。
| // Listen for interrupt signals to enable graceful shutdown | ||
| // This replaces the previous select{} which blocked indefinitely and did not provide a way to gracefully shut down or clean up resources when the process received a signal | ||
| sigChan := make(chan os.Signal, 1) | ||
| signal.Notify(sigChan, os.Interrupt, syscall.SIGTERM, syscall.SIGINT) |
There was a problem hiding this comment.
graceful_shutdown 包(graceful_shutdown_signal_linux.go)注册了 12 个信号,包含 SIGHUP、SIGQUIT 等。这里只注册了 3 个(os.Interrupt 和 SIGINT 在 Linux 上是同一个信号),容器/K8s 常用的 SIGHUP 收到后会直接异常终止,不执行清理。
建议:直接使用 graceful_shutdown.ShutdownSignals,不要硬编码。
| }() | ||
| registryProtocol := extension.GetProtocol(constant.RegistryProtocol) | ||
| registryProtocol.Destroy() | ||
| if registryProtocol != nil { |
There was a problem hiding this comment.
extension.GetProtocol(common/extension/protocol.go)未注册时直接 panic,注册时返回具体类型的接口值,永远不会是 nil 接口值。这个 if registryProtocol \!= nil 判断是死代码,会误导读者以为 GetProtocol 能合法返回 nil。destroyProtocols 里同样的问题。
建议:删掉 nil 判断,仅保留 recover 即可。
| return err | ||
| } | ||
| select {} | ||
|
|
There was a problem hiding this comment.
Serve() 全程持有写锁(通过 defer s.mu.Unlock()),但函数内部进入 select{} 永久阻塞,锁永不释放。这会饿死所有调用 GetServiceOptions()、GetServiceInfo()、GetRPCService() 等方法的读操作,健康检查、动态路由等都会死锁。
建议:serve = true 的赋值在锁内完成后立即释放锁,再进入阻塞等待。
| // - Destroying protocols | ||
| // - Executing custom shutdown callbacks | ||
| // This function can be called manually when InternalSignal is disabled. | ||
| func BeforeShutdown(shutdown *global.ShutdownConfig) { |
There was a problem hiding this comment.
原来 beforeShutdown 是私有函数,只在 Init 的 goroutine 里调用一次。导出为 BeforeShutdown 后,server.go 可以直接调用,而 Init 内部的 goroutine(InternalSignal=true 时)也会调用。两次 destroyRegistries() + destroyProtocols() 会对已销毁的资源重复操作(double free 语义),行为未定义。虽然 server.go 靠 InternalSignal 判断来规避,但这依赖调用者自律。
建议:加 sync.Once 保证无论被谁调用多少次,只执行一次。



fix: resolved the issue of the Serve() method causing the program to hang indefinitely through graceful_shutdown #3042