
initdb: reset and preserve errno when scanning cdb_init.d#1589

Open
tuhaihe wants to merge 1 commit into apache:main from tuhaihe:fix-initdb



tuhaihe (Member) commented Mar 2, 2026

initdb's setup_cdb_schema() checked errno after readdir() iteration, but did not clear errno before entering the loop.

On some environments (observed on Ubuntu 24.04), a stale errno value can survive into this check and be misinterpreted as a readdir failure, causing:
error while reading cdb_init.d directory: Function not implemented
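
For illustration, the failure mode can be reproduced in isolation. The sketch below is not the initdb source: `scan_without_reset()` is a hypothetical helper that only mirrors the pattern described above (check errno after the loop, never clear it). The message text assumes glibc, where `strerror(ENOSYS)` is "Function not implemented".

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not the initdb code): scans a directory and checks
 * errno afterwards without ever clearing it. Returns 0 on success, -1 if
 * it believes readdir() failed.
 */
static int
scan_without_reset(const char *path)
{
    DIR *dir = opendir(path);
    struct dirent *file;

    if (dir == NULL)
        return -1;

    /* BUG: errno is not cleared, so any earlier value is still visible. */
    while ((file = readdir(dir)) != NULL)
    {
        (void) file;            /* entries are ignored in this sketch */
    }
    closedir(dir);

    /*
     * readdir() leaves errno alone at end-of-directory, so a stale value
     * cannot be told apart from a real read error here.
     */
    if (errno != 0)
    {
        fprintf(stderr, "error while reading directory: %s\n",
                strerror(errno));
        return -1;
    }
    return 0;
}
```

If unrelated code has already left errno set to `ENOSYS`, a scan of a perfectly readable directory is reported as a failure.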

Fix by:

  • setting errno=0 before the readdir() loop
  • storing readdir errno immediately after the loop
  • handling closedir() errors separately
  • using the saved readdir errno for the post-loop error check

This prevents false-positive failures during initdb while preserving proper I/O error reporting.
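
The four steps listed above could look roughly like the following. This is a minimal sketch under assumptions, not the patch itself: `count_entries_saved_errno()` and its messages are hypothetical, and the real function does more work per entry than counting.

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: counts the entries in `path`.
 * Returns the count, or -1 on any directory error.
 */
static int
count_entries_saved_errno(const char *path)
{
    DIR *dir = opendir(path);
    struct dirent *file;
    int count = 0;
    int readdir_errno;

    if (dir == NULL)
    {
        fprintf(stderr, "could not open directory \"%s\": %s\n",
                path, strerror(errno));
        return -1;
    }

    errno = 0;                  /* step 1: drop any stale value */
    while ((file = readdir(dir)) != NULL)
        count++;
    readdir_errno = errno;      /* step 2: save before closedir() runs */

    if (closedir(dir) != 0)     /* step 3: closedir() reported separately */
    {
        fprintf(stderr, "could not close directory \"%s\": %s\n",
                path, strerror(errno));
        return -1;
    }

    if (readdir_errno != 0)     /* step 4: judge readdir() by the saved copy */
    {
        fprintf(stderr, "error while reading directory \"%s\": %s\n",
                path, strerror(readdir_errno));
        return -1;
    }
    return count;
}
```

Saving the value into a local means a later call such as closedir() cannot change which error is reported for the read.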

Fixes #ISSUE_Number

What does this PR do?

Type of Change

  • Bug fix (non-breaking change)
  • New feature (non-breaking change)
  • Breaking change (fix or feature with breaking changes)
  • Documentation update

Breaking Changes

Test Plan

  • Unit tests added/updated
  • Integration tests added/updated
  • Passed make installcheck
  • Passed make -C src/test installcheck-cbdb-parallel

Impact

Performance:

User-facing changes:

Dependencies:

Checklist

Additional Context

CI Skip Instructions


leborchuk (Contributor) left a comment

LGTM

setup_cdb_schema() checked errno after a readdir() loop without resetting
it beforehand. In some environments (e.g., Ubuntu 24.04), a stale errno
value from operations inside the loop (such as pg_realloc or pg_strdup)
could persist, causing readdir's normal termination to be misinterpreted
as a failure (e.g., "Function not implemented").

This commit fixes the issue by adopting the standard PostgreSQL idiom:
- Use "while (errno = 0, (file = readdir(dir)) != NULL)" to ensure errno
  is cleared strictly before each readdir() call.
- Move closedir() after the errno check to prevent it from overwriting
  the error code from readdir().
- Add defensive error checking for the closedir() call itself.

This ensures robust directory scanning and reliable error reporting
during cluster initialization.
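
As a rough sketch of the idiom described in this commit message (not the committed code; `count_entries_idiom()` is a hypothetical name, and the `errno = ENOSYS` assignment only simulates a loop body that clobbers errno):

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: counts the entries in `path`.
 * Returns the count, or -1 on any directory error.
 */
static int
count_entries_idiom(const char *path)
{
    DIR *dir = opendir(path);
    struct dirent *file;
    int count = 0;

    if (dir == NULL)
        return -1;

    /* errno is cleared immediately before every readdir() call. */
    while (errno = 0, (file = readdir(dir)) != NULL)
    {
        count++;
        errno = ENOSYS;         /* simulated clobber by work in the body */
    }

    /* Check errno before closedir() has a chance to overwrite it. */
    if (errno != 0)
    {
        fprintf(stderr, "error while reading directory \"%s\": %s\n",
                path, strerror(errno));
        closedir(dir);
        return -1;
    }

    if (closedir(dir) != 0)
    {
        fprintf(stderr, "could not close directory \"%s\": %s\n",
                path, strerror(errno));
        return -1;
    }
    return count;
}
```

Because the reset sits in the loop condition, whatever the body leaves in errno is gone by the time the final readdir() returns NULL, so a nonzero value after the loop can only come from readdir() itself.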
