bugfix: prevent uthread crash by checking coroutine reference before …#398
bugfix: prevent uthread crash by checking coroutine reference before …#398zhuizhuhaomeng merged 2 commits intoopenresty:masterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR fixes a crash bug in stream-lua-nginx-module where the ctx->uthreads counter could be incorrectly decremented when a user thread (uthread) that was already killed via ngx.thread.kill terminates through the normal coroutine completion path. This happens specifically when a uthread contains a child coroutine created via coroutine.create (which doesn't get a co_ref in the Lua registry). The fix adds a co_ref != LUA_NOREF guard before calling ngx_stream_lua_del_thread and decrementing ctx->uthreads, preventing double-decrements that could lead to phantom uthreads count underflow and subsequent crashes.
Changes:
- Added
co_ref != LUA_NOREFguard in three locations inngx_stream_lua_run_thread()to prevent phantomuthreadsdecrement when a thread's coroutine reference has already been cleaned up - Replaced all 6 existing test cases in
t/127-uthread-kill.twith a single new test case that reproduces the specific bug scenario using Redis connections and child coroutines
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/ngx_stream_lua_util.c | Guards three del_thread/uthreads-- code paths with co_ref != LUA_NOREF checks to prevent double-decrement |
| t/127-uthread-kill.t | Removes TEST 1-6 (existing uthread kill tests) and adds TEST 7 to reproduce the phantom uthreads decrement bug |
Comments suppressed due to low confidence (3)
t/127-uthread-kill.t:25
- This PR removes TEST 1 through TEST 6 which covered critical
ngx.thread.killscenarios (killing pending sleep, already-waited threads, pending resolver, pending connect, already-terminated threads, and kill-self). These are important regression tests that should be preserved. The new TEST 7 only covers the specific bug being fixed (phantom uthreads decrement) and does not replace the coverage of the removed tests. Please keep the existing tests and add TEST 7 as an additional test case.
function f()
t/127-uthread-kill.t:25
- The test is numbered TEST 7 but it is the only test in the file (tests 1-6 were removed). If the intent is to keep only this test, it should be renumbered to TEST 1 for consistency. If tests 1-6 are restored (recommended), then TEST 7 is the correct number.
function f()
t/127-uthread-kill.t:29
- The variable is named
dns_threadsbut these threads are performing Redis operations, not DNS lookups. This makes the test harder to understand. Consider renaming to something more descriptive likeworker_threadsorredis_threads.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| $ENV{TEST_NGINX_RESOLVER} ||= '8.8.8.8'; | ||
| $ENV{TEST_NGINX_MEMCACHED_PORT} ||= '11211'; | ||
| $ENV{TEST_NGINX_REDIS_PORT} ||= '6379'; |
There was a problem hiding this comment.
The test plan formula blocks() * 5 + 1 was designed for the original 6 test blocks, each of which had stap_out, stream_response, no_error_log, and error_log sections. The new single test block (TEST 7) only has stream_response and no_error_log — it lacks stap_out and error_log sections. The multiplier 5 is now incorrect and will cause a test plan mismatch, leading to test failure. The plan should be updated to match the actual test sections present in the remaining test blocks. If the existing TEST 1-6 are restored (as they should be), the plan would need to account for both old and new test structures.
…deletion