Fix Process resource leak in system metrics collection#17212
Fix Process resource leak in system metrics collection#17212PDGGK wants to merge 1 commit intoapache:masterfrom
Conversation
Add process.waitFor() and process.destroyForcibly() in finally block to prevent zombie processes when collecting system memory and network metrics via Runtime.exec(). Also handle InterruptedException properly by restoring the interrupt flag. Affected files: - SystemMetrics: memory info collection via 'free' command - LinuxNetMetricManager: socket count collection via 'ss' command
There was a problem hiding this comment.
Pull request overview
This pull request fixes process resource leaks in Linux system metrics collection by adding proper process lifecycle management. The changes ensure that child processes spawned via Runtime.exec() are properly waited for and cleaned up, preventing zombie processes and file descriptor leaks that could accumulate over time in periodic metrics collection.
Changes:
- Added
process.waitFor()calls to reap child processes after reading their output - Added
finallyblocks withprocess.destroyForcibly()to ensure cleanup on exceptions - Added proper
InterruptedExceptionhandling with interrupt flag restoration
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| SystemMetrics.java | Fixed process resource leak in Linux memory metrics collection by adding waitFor(), destroyForcibly(), and InterruptedException handling |
| LinuxNetMetricManager.java | Fixed process resource leak in socket count metrics collection by adding waitFor(), destroyForcibly(), and InterruptedException handling |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| logger.debug("Interrupted while waiting for memory command", e); | ||
| Thread.currentThread().interrupt(); | ||
| } finally { | ||
| if (process != null) { |
There was a problem hiding this comment.
Calling destroyForcibly() unconditionally in the finally block can forcibly terminate a process that has already completed successfully via waitFor(). This is problematic because:
- After waitFor() returns successfully, the process has already exited normally, and destroyForcibly() should not be needed
- destroyForcibly() may interfere with proper cleanup of process resources on some platforms
- This pattern goes against Java best practices for process management
The fix should only call destroyForcibly() when the process hasn't completed normally (i.e., when an exception occurs before or during waitFor()). Consider adding a boolean flag to track whether waitFor() completed successfully, or only destroy the process in the catch blocks.
| if (process != null) { | |
| if (process != null && process.isAlive()) { |
| } finally { | ||
| if (process != null) { | ||
| process.destroyForcibly(); | ||
| } |
There was a problem hiding this comment.
Calling destroyForcibly() unconditionally in the finally block can forcibly terminate a process that has already completed successfully via waitFor(). This is problematic because:
- After waitFor() returns successfully, the process has already exited normally, and destroyForcibly() should not be needed
- destroyForcibly() may interfere with proper cleanup of process resources on some platforms
- This pattern goes against Java best practices for process management
The fix should only call destroyForcibly() when the process hasn't completed normally (i.e., when an exception occurs before or during waitFor()). Consider adding a boolean flag to track whether waitFor() completed successfully, or only destroy the process in the catch blocks.
| process.waitFor(); | ||
| this.connectionNum = Integer.parseInt(result.toString().trim()); | ||
| } catch (IOException e) { | ||
| LOGGER.error("Failed to get socket num", e); |
There was a problem hiding this comment.
The Integer.parseInt() call can throw NumberFormatException if the command output is not a valid integer, but this exception is not caught. If the external command returns unexpected output, this will cause an unhandled exception that could disrupt metrics collection. Consider adding a catch block for NumberFormatException to handle malformed output gracefully, similar to how IOException is handled.
| LOGGER.error("Failed to get socket num", e); | |
| LOGGER.error("Failed to get socket num", e); | |
| } catch (NumberFormatException e) { | |
| LOGGER.error( | |
| "Failed to parse socket num from command output: '{}'", | |
| e.getMessage()); |
Summary
process.waitFor()andprocess.destroyForcibly()infinallyblock to prevent zombie processes in periodic metrics collectionInterruptedExceptionproperly by restoring the interrupt flagSystemMetrics(memory) andLinuxNetMetricManager(socket count)Fixes #17211
Test plan