fix: resolve flaky test_resource_change_notifier and Humble demo node SIGSEGV #345

Open: bburda wants to merge 2 commits into main from fix/344-flaky-tests

@bburda (Collaborator) commented Apr 3, 2026

Pull Request

Summary

Fix two flaky CI failures on main:

  1. test_resource_change_notifier timeout (Rolling): Reorder variable declarations in all 16 tests so ResourceChangeNotifier is declared after the shared state (promises, atomics) that its callbacks capture. C++ destroys locals in reverse declaration order, so the notifier's worker thread is now shut down before any captured variable is destroyed. This prevents a use-after-free when wait_for times out under CI load.

  2. test_hybrid_suppression SIGSEGV on Humble: Fix demo node destructors to reset subscriptions and timers before implicit member destruction, and restructure main() to explicitly destroy the node before rclcpp::shutdown(). This reduces the SIGINT teardown race window that caused SIGSEGV (exit -11) on Humble.
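
The declaration-order rule behind fix 1 can be sketched in plain C++. This is a minimal illustration, not the project's code: `ToyNotifier` and `count_callbacks_safely` are hypothetical names, and the real ResourceChangeNotifier API is not shown in this PR page. The only assumption carried over from the description is that the notifier owns a worker thread which its destructor stops and joins.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>
#include <utility>

// Hypothetical stand-in for ResourceChangeNotifier: owns a worker thread that
// keeps invoking a callback until the destructor stops and joins it.
class ToyNotifier {
public:
  explicit ToyNotifier(std::function<void()> cb)
  : callback_(std::move(cb)), worker_([this] {
      while (!stop_.load()) {
        callback_();
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
      }
    }) {}

  ~ToyNotifier() {
    stop_.store(true);
    assert(worker_.joinable());
    worker_.join();  // after this line no callback can run any more
  }

private:
  // Members are initialized in declaration order, so worker_ must come last:
  // the thread may read callback_ and stop_ as soon as it starts.
  std::function<void()> callback_;
  std::atomic<bool> stop_{false};
  std::thread worker_;
};

// Safe ordering: shared state first, notifier last.
// Locals are destroyed in reverse order, so the notifier joins its worker
// while `calls` is still alive.
int count_callbacks_safely() {
  std::atomic<int> calls{0};                    // declared first, destroyed last
  ToyNotifier notifier([&calls] { ++calls; });  // declared last, destroyed first

  while (calls.load() < 3) {
    std::this_thread::yield();
  }
  return calls.load();

  // Unsafe ordering (not compiled here): declaring `notifier` before `calls`
  // would destroy `calls` first, while the worker may still execute ++calls
  // through a dangling reference.
}
```

Calling `count_callbacks_safely()` returns a value of at least 3. The function can return at any point, including an early exit after a timed-out wait, without leaving the worker thread holding a reference to a destroyed local.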


Issue


Type

  • Bug fix
  • New feature or tests
  • Breaking change
  • Documentation only

Testing

  • All 2413 unit tests pass locally on Jazzy
  • test_resource_change_notifier specifically passes (16/16 tests)
  • Demo nodes compile and link correctly
  • Pre-commit hooks pass (clang-format, ament-copyright, etc.)
  • Humble SIGSEGV fix requires CI verification (not reproducible locally on Jazzy)

Checklist

  • Breaking changes are clearly described (and announced in docs / changelog if needed)
  • Tests were added or updated if needed
  • Docs were updated if behavior or public API changed

…n demo nodes

Fix two flaky CI failures (#344):

1. test_resource_change_notifier: reorder declarations in all 16 tests
   so ResourceChangeNotifier is declared after shared state (promises,
   atomics). C++ destroys locals in reverse order, so the notifier's
   worker thread is now shut down before any captured variable is
   destroyed - preventing use-after-free when wait_for times out
   under CI load.

2. test_hybrid_suppression on Humble: fix demo node destructors to
   reset subscriptions and timers before implicit member destruction,
   and restructure main() to explicitly destroy the node before
   rclcpp::shutdown(). This reduces the SIGINT teardown race window
   that caused SIGSEGV (exit -11) on Humble.

Closes #344
Copilot AI review requested due to automatic review settings April 3, 2026 08:50

Copilot AI (Contributor) left a comment

Pull request overview

Fixes two CI flakes affecting ros2_medkit’s gateway unit tests (Rolling) and integration-test demo nodes (Humble), improving shutdown/teardown safety around background threads and ROS 2 node destruction.

Changes:

  • Reorders local declarations in test_resource_change_notifier so ResourceChangeNotifier is destroyed before callback-captured shared state, preventing use-after-free hangs on timeouts.
  • Hardens integration-test demo nodes’ shutdown by explicitly destroying the node before rclcpp::shutdown() and resetting timers/subscriptions in destructors to reduce teardown races/SIGSEGVs on Humble.
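
The teardown order described for the demo nodes can be modelled without ROS 2. The sketch below is an assumption-laden toy: `ToyTimer`, `ToyNode`, `EventLog`, and `run_demo_main` are hypothetical stand-ins, not rclcpp types, and the log strings `"init"`, `"spin returned"`, and `"shutdown"` only mark where rclcpp::init(), the return from spinning, and rclcpp::shutdown() would sit in a real main(). The actual diff is not shown on this page, so treat this as one way the ordering might look.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical event log, used only to make the teardown order observable.
using EventLog = std::vector<std::string>;

// Toy stand-in for a timer handle (not rclcpp::TimerBase).
struct ToyTimer {
  explicit ToyTimer(EventLog & log) : log_(log) {}
  ~ToyTimer() { log_.push_back("timer destroyed"); }
  EventLog & log_;
};

// Toy stand-in for a demo node that owns a timer.
class ToyNode {
public:
  explicit ToyNode(EventLog & log)
  : log_(log), timer_(std::make_shared<ToyTimer>(log)) {}

  ~ToyNode() {
    // Drop the callback source explicitly, while all other members are
    // still alive, instead of relying on implicit member destruction.
    timer_.reset();
    log_.push_back("node destructor body done");
  }

private:
  EventLog & log_;
  std::shared_ptr<ToyTimer> timer_;
};

// Models the restructured main(): the node is destroyed explicitly
// before the step that stands in for rclcpp::shutdown().
EventLog run_demo_main() {
  EventLog log;
  log.push_back("init");

  auto node = std::make_shared<ToyNode>(log);
  log.push_back("spin returned");

  assert(node.use_count() == 1);  // sole owner, so reset() really destroys it
  node.reset();                   // explicit destruction before "shutdown"

  log.push_back("shutdown");
  return log;
}
```

In this model the recorded order is: init, spin returned, timer destroyed, node destructor body done, shutdown. The point of the ordering is that nothing owned by the node outlives the shutdown step.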

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/ros2_medkit_integration_tests/demo_nodes/rpm_sensor.cpp | Reset timer in destructor; ensure node is destroyed before rclcpp::shutdown(). |
| src/ros2_medkit_integration_tests/demo_nodes/param_beacon_node.cpp | Ensure node is destroyed before rclcpp::shutdown(). |
| src/ros2_medkit_integration_tests/demo_nodes/light_controller.cpp | Reset subscription/timer in destructor; ensure node is destroyed before rclcpp::shutdown(). |
| src/ros2_medkit_integration_tests/demo_nodes/lidar_sensor.cpp | Reset timers in destructor; ensure node is destroyed before rclcpp::shutdown(). |
| src/ros2_medkit_integration_tests/demo_nodes/engine_temp_sensor.cpp | Reset timer in destructor; ensure node is destroyed before rclcpp::shutdown(). |
| src/ros2_medkit_integration_tests/demo_nodes/door_status_sensor.cpp | Reset timer in destructor; ensure node is destroyed before rclcpp::shutdown(). |
| src/ros2_medkit_integration_tests/demo_nodes/calibration_service.cpp | Ensure node is destroyed before rclcpp::shutdown(). |
| src/ros2_medkit_integration_tests/demo_nodes/brake_pressure_sensor.cpp | Reset timer in destructor; ensure node is destroyed before rclcpp::shutdown(). |
| src/ros2_medkit_integration_tests/demo_nodes/brake_actuator.cpp | Reset subscription/timer in destructor; ensure node is destroyed before rclcpp::shutdown(). |
| src/ros2_medkit_integration_tests/demo_nodes/beacon_publisher.cpp | Add destructor to cancel/reset timer; ensure node is destroyed before rclcpp::shutdown(). |
| src/ros2_medkit_gateway/test/test_resource_change_notifier.cpp | Reorder declarations across tests so notifier teardown joins worker thread before shared state is destroyed. |

@bburda bburda requested a review from mfaferek93 April 3, 2026 10:02
@bburda bburda self-assigned this Apr 3, 2026
Avoid relying on transitive includes which can break across
standard library or dependency changes.
Development

Successfully merging this pull request may close these issues.

[BUG] Flaky CI: test_resource_change_notifier timeout + test_hybrid_suppression SIGSEGV on Humble