
[DO NOT MERGE] 0.23.1 + StackRox patches#97

Draft
Stringy wants to merge 25 commits into upstream-main from 0.23.1-stackrox-rc1

@Stringy Stringy commented Feb 26, 2026

Primary new fixes in this update:

BPF verifier fixes (4 files)

  • syscall_exit.bpf.c: Refactored sys_exit and sampling_logic_exit() to do a single maps__get_capture_settings() lookup instead of three separate inlined map lookups (maps__get_dropping_mode(), maps__get_sampling_ratio(), maps__get_drop_failed()). Clang was optimizing away null checks on
    repeated lookups of the same map, causing R0 invalid mem access 'map_value_or_null' on kernels < 6.17.
  • toctou_mitigation.h: Same pattern — toctou_mitigation__sampling_logic_enter() now does a single maps__get_capture_settings() lookup and accesses settings->dropping_mode / settings->sampling_ratio directly.
  • extract_from_kernel.h: Replaced READ_TASK_FIELD_INTO with BPF_CORE_READ_INTO for all task->cred reads (5 call sites). Fixes BPF verifier rejection on COS (clang-compiled kernel with RCU pointer annotations on credential structs).
  • execve.bpf.c / execveat.bpf.c: Wrapped t1_execve_x, t2_execve_x, t1_execveat_x, t2_execveat_x bodies in #if 0. These tail-call programs are unreachable (execve's main program returns early for both success and failure) or unused (collector doesn't subscribe to execveat). Stubbing them
    frees instruction budget, making the volatile fix on push__bytebuf safe without exceeding the 1M instruction limit.
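
The single-lookup refactor described above can be sketched in plain userspace C. This is an illustrative model, not the probe's code. The struct layout, `sketch_set_capture_settings()`, `sketch_lookup_capture_settings()`, `sketch_should_drop_exit()` and the decision rule are hypothetical. Only the field names `dropping_mode`, `sampling_ratio` and `drop_failed` come from the description. A global pointer stands in for a map lookup that may return NULL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; only the field names come from the PR description. */
struct capture_settings {
    bool dropping_mode;
    uint32_t sampling_ratio;
    bool drop_failed;
};

/* Simulated map slot: NULL models a failed map lookup. */
static struct capture_settings *g_settings_slot = NULL;

void sketch_set_capture_settings(struct capture_settings *settings) {
    g_settings_slot = settings;
}

static struct capture_settings *sketch_lookup_capture_settings(void) {
    return g_settings_slot;
}

/*
 * One lookup, one null check, then plain field reads through the same
 * pointer. The decision rule below is a simplified assumption, not the
 * probe's real sampling logic.
 */
bool sketch_should_drop_exit(long ret) {
    struct capture_settings *settings = sketch_lookup_capture_settings();
    if (settings == NULL) {
        return false; /* no settings available: keep the event */
    }
    if (settings->drop_failed && ret < 0) {
        return true; /* failed syscalls are dropped when requested */
    }
    /* Assumed rule: sample only in dropping mode with a ratio above 1. */
    return settings->dropping_mode && settings->sampling_ratio > 1;
}
```

Because every field is read through the one checked pointer, there is no second lookup whose null check the compiler could fold away.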

TOCTOU program disable (1 file)

  • lifecycle.c: Added a loop before the existing ia32 TOCTOU disable logic that checks whether each 64-bit TOCTOU mitigation program's syscall tracepoint exists on the running kernel. Checks /sys/kernel/tracing/events/syscalls/sys_enter_ with debugfs fallback. If the tracepoint
    doesn't exist (e.g., openat2 on kernels < 5.6), disables autoloading to prevent CO-RE relocation failures for missing BTF types like struct open_how.
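
A minimal sketch of the tracepoint existence check, assuming the syscall name is appended to the `sys_enter_` prefix and that the debugfs fallback lives under `/sys/kernel/debug/tracing`. The function name `sketch_syscall_enter_tracepoint_exists()` is hypothetical and does not reflect the actual code in lifecycle.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Returns true if a sys_enter_<syscall_name> tracepoint directory is
 * visible under tracefs, or under the debugfs mount as a fallback.
 */
bool sketch_syscall_enter_tracepoint_exists(const char *syscall_name) {
    static const char *const roots[] = {
        "/sys/kernel/tracing/events/syscalls",       /* tracefs */
        "/sys/kernel/debug/tracing/events/syscalls", /* debugfs fallback */
    };
    char path[256];

    for (size_t i = 0; i < sizeof(roots) / sizeof(roots[0]); i++) {
        int n = snprintf(path, sizeof(path), "%s/sys_enter_%s",
                         roots[i], syscall_name);
        if (n < 0 || (size_t)n >= sizeof(path)) {
            continue; /* formatting error or truncated path: skip this root */
        }
        if (access(path, F_OK) == 0) {
            return true;
        }
    }
    return false;
}
```

In a libbpf-based loader, the caller would presumably respond to a `false` result by calling `bpf_program__set_autoload()` with `false` on the matching program before load. The check also reports `false` when the process lacks permission to traverse the tracing directories.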

Exepath resolution (1 file)

  • parsers.cpp: Added fallback in parse_execve_exit for when enter events are unavailable (modern BPF marks them EF_OLD_VERSION). Uses Parameter 31 (bprm->filename) from the exit event. Filters out fd-based execs (/dev/fd/N, /proc/self/fd/N) which are intermediate container runtime steps
    (fexecve) — without this filter, phantom process signals appear with wrong name/path. Also fixed m_len → len() accessor in parse_pidfd_getfd_exit.
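
The fd-based exec filter can be illustrated with a small predicate. This is a sketch in C rather than the C++ of parsers.cpp, and `sketch_is_fd_based_exec_path()` is a hypothetical name. It assumes the filter matches exactly the two prefixes named above, followed by a purely numeric descriptor.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* True if path is exactly prefix followed by one or more decimal digits. */
static bool sketch_has_numeric_suffix(const char *path, const char *prefix) {
    size_t prefix_len = strlen(prefix);
    if (strncmp(path, prefix, prefix_len) != 0) {
        return false;
    }
    const char *p = path + prefix_len;
    if (*p == '\0') {
        return false; /* prefix alone, no descriptor number */
    }
    for (; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p)) {
            return false;
        }
    }
    return true;
}

/* True for /dev/fd/N and /proc/self/fd/N style paths. */
bool sketch_is_fd_based_exec_path(const char *path) {
    return sketch_has_numeric_suffix(path, "/dev/fd/") ||
           sketch_has_numeric_suffix(path, "/proc/self/fd/");
}
```

A fallback that skips paths for which this predicate is true would ignore the intermediate fexecve step and wait for the exec that carries a real file path.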

Build fix (1 file)

  • CMakeLists.txt: Fixed CMake dependency cycle by compiling event_table.c, flags_table.c, dynamic_params_table.c directly into the events_dimensions_generator executable instead of linking scap_event_schema. CMake 3.31+ (used by the collector builder) enforces cycle detection that
    upstream's CI (CMake 3.22) doesn't hit.

Assert macro fix (1 file)

  • sinsp_public.h: Replaced libsinsp_logger()->format() in ASSERT_TO_LOG with falcosecurity_log_fn callback from scap_log.h. Avoids circular include dependency (logger.h → sinsp_public.h). Falls back to assert() if the callback isn't set.
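
The callback-with-fallback shape of the assert macro can be sketched as follows. Every name here (`sketch_log_fn`, `sketch_set_log_fn()`, `SKETCH_ASSERT_TO_LOG`, `sketch_counting_logger()`) is hypothetical, and the callback signature is simplified. The real `falcosecurity_log_fn` type in scap_log.h may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, assumed callback shape: component name plus message. */
typedef void (*sketch_log_fn)(const char *component, const char *msg);

static sketch_log_fn g_sketch_log_fn = NULL;

void sketch_set_log_fn(sketch_log_fn fn) {
    g_sketch_log_fn = fn;
}

/* Log the failed condition if a callback is set, otherwise hard-assert. */
#define SKETCH_ASSERT_TO_LOG(cond)                                         \
    do {                                                                   \
        if (!(cond)) {                                                     \
            if (g_sketch_log_fn != NULL) {                                 \
                g_sketch_log_fn("sketch", "ASSERTION FAILED: " #cond);     \
            } else {                                                       \
                assert(cond);                                              \
            }                                                              \
        }                                                                  \
    } while (0)

/* Example logger that only counts failures. */
int sketch_logged_failures = 0;

void sketch_counting_logger(const char *component, const char *msg) {
    (void)component;
    (void)msg;
    sketch_logged_failures++;
}
```

The macro depends only on a function pointer, so the header that defines it does not need to include the logger header. Note that when `NDEBUG` is defined, the `assert()` fallback compiles to nothing.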

fremmi and others added 24 commits December 22, 2025 15:30
…ing due to integer overflow

Add validation in ppm_cmsg_nxthdr to ensure cmsg_aligned_len is at least
sizeof(ppm_cmsghdr) after alignment calculation. This prevents an infinite
loop when malformed ancillary data contains cmsg_len = 0xFFFFFFFFFFFFFFFF,
which causes integer overflow in PPM_CMSG_ALIGN macro, resulting in
cmsg_aligned_len = 0 and preventing forward progress in the loop.

Signed-off-by: Francesco Emmi <francesco.emmi@sysdig.com>
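
The overflow described in this commit message can be reproduced with a few lines of C. This is a sketch that assumes the alignment macro uses the usual round-up form on a 64-bit length with 8-byte alignment and a 16-byte header. The `SKETCH_*` constants and function names are hypothetical, not the driver's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_ALIGNMENT UINT64_C(8) /* assumed: sizeof(size_t) on 64-bit */
#define SKETCH_HDR_SIZE UINT64_C(16) /* assumed control message header size */

/* Round len up to the next multiple of the alignment; wraps on overflow. */
uint64_t sketch_cmsg_align(uint64_t len) {
    return (len + SKETCH_ALIGNMENT - 1) & ~(SKETCH_ALIGNMENT - 1);
}

/*
 * The added validation: an aligned length smaller than one header cannot
 * advance the iteration safely, so the caller must stop.
 */
bool sketch_aligned_len_is_valid(uint64_t cmsg_len) {
    return sketch_cmsg_align(cmsg_len) >= SKETCH_HDR_SIZE;
}
```

With `cmsg_len = 0xFFFFFFFFFFFFFFFF`, adding 7 wraps to 6 and masking yields 0, so a loop that advances by the aligned length never moves forward. The validity check rejects that value.
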
Guard against invalid `cmsg_len` values while accessing control
messages in ancillary data. This is achieved by checking there is
enough space between the current control message and the end of the
buffer to hold both the current control message and the next one.

This change syncs the implementation of `ppm_cmsg_nxthdr()` with the
current glibc implementation:
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/cmsg_nxthdr.c;h=0e602a16053ed6742ea1556d75de8540e49157f1;hb=170550da27f68a08589e91b541883dcc58dee640

Signed-off-by: Leonardo Di Giovanna <leonardodigiovanna1@gmail.com>
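
The space check described in this commit message can be modeled on buffer offsets instead of pointers. This is a sketch of the idea only: the header size, alignment, and the name `sketch_next_cmsg_offset()` are assumptions, and the code is not copied from either `ppm_cmsg_nxthdr()` or glibc.

```c
#include <stdint.h>

#define SKETCH_CM_ALIGNMENT UINT64_C(8) /* assumed alignment */
#define SKETCH_CM_HDR_SIZE UINT64_C(16) /* assumed header size */

/* Bytes needed to pad len up to the next alignment boundary. */
static uint64_t sketch_cm_padding(uint64_t len) {
    return (SKETCH_CM_ALIGNMENT - (len & (SKETCH_CM_ALIGNMENT - 1))) &
           (SKETCH_CM_ALIGNMENT - 1);
}

/*
 * Returns the offset of the next control message header, or -1 if the
 * current message is malformed or the buffer cannot hold both the current
 * message and a following header.
 */
int64_t sketch_next_cmsg_offset(uint64_t buf_len, uint64_t cur_off,
                                uint64_t cmsg_len) {
    if (cur_off > buf_len || cmsg_len < SKETCH_CM_HDR_SIZE) {
        return -1; /* out of bounds, or too small to be a full header */
    }
    uint64_t remaining = buf_len - cur_off;
    uint64_t size_needed = SKETCH_CM_HDR_SIZE + sketch_cm_padding(cmsg_len);
    if (remaining < size_needed || remaining - size_needed < cmsg_len) {
        return -1; /* not enough room for this message plus the next header */
    }
    /* cmsg_len is now bounded by the buffer, so adding it cannot overflow. */
    return (int64_t)(cur_off + cmsg_len + sketch_cm_padding(cmsg_len));
}
```

Because `cmsg_len` is compared against the remaining space before it is used in any addition, a huge value is rejected without ever being aligned.
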
Signed-off-by: Roberto Scolaro <roberto.scolaro21@gmail.com>
Signed-off-by: irozzo-1A <iacopo@sysdig.com>
Signed-off-by: Leonardo Di Giovanna <leonardodigiovanna1@gmail.com>
Optimize scanning by filtering out file descriptors that are
not socket fds.
Introduce the interesting_subsys set to configure which cgroup
subsystems are going to be considered in set_cgroups.
Signed-off-by: Matthew Knight <matthew.knight@sysdig.com>
Signed-off-by: Matthew Knight <matthew.knight@sysdig.com>
Signed-off-by: Afsan Hossain <84701952+mdafsanhossain@users.noreply.github.com>
Allow logging ASSERT failures instead of hard-stopping on them. The change
also enables ASSERT logging in Release mode at the debug logging level
to give more information when troubleshooting.
Experiment with disabling trusted exepath to verify stackrox tests
There is an unexpected difference between kernels compiled with gcc and
clang. The former doesn't support rcu attributes yet, meaning that
READ_TASK_FIELD_INTO on task->cred produces PTR_TO_BTF_ID. The latter
does support the rcu attribute, and READ_TASK_FIELD_INTO ends up with
PTR_TO_BTF_ID | MEM_RCU | MAYBE_NULL [1]. COS seems to be using clang to
compile the kernel:

    # from /boot/config
    CONFIG_CC_VERSION_TEXT="Chromium OS 16.0_pre484197_p20230405-r12 clang version 16.0.0
    (/var/tmp/portage/sys-devel/llvm-16.0_pre484197_p20230405-r12/work/llvm-16.0_pre484197_p20230405/clang 2916b99182752b1aece8cc4479d8d6a20b5e02da)"

The verifier rejects the MAYBE_NULL part. To fix it, we need to break the
chain of reads and verify that the cred structure is not null.

[1]: https://lore.kernel.org/bpf/20230228040121.94253-3-alexei.starovoitov@gmail.com/
Along the way silence compiler warnings about task_cred
* Adapt the modern probe to clang 21

The code generated by clang 21 is more 'complex' and reaches the
1000000-instruction limit on execve().

* Force casting size(r2) parameter to bpf_probe_read_user()

...otherwise, the compiler thinks that the calling convention allows it
to optimize away the unsigned truncation, and the verifier disagrees.

* Decrease MAX_IOVCNT to satisfy the verifier on rhel
@Stringy Stringy changed the title [DO NOT MERGED] 0.23.1 + StackRox patches [DO NOT MERGE] 0.23.1 + StackRox patches Feb 27, 2026
@Stringy Stringy force-pushed the 0.23.1-stackrox-rc1 branch 5 times, most recently from 54cea3e to 00c9559 Compare March 13, 2026 14:52
  - Fix BPF verifier failures on older kernels (4.18)
  - Use volatile on push__bytebuf len_to_read for 32-bit bounds
  - Stub out t1/t2_execveat_x (not subscribed by collector)
  - Use BPF_CORE_READ_INTO for cred reads
  - Disable TOCTOU 64-bit progs for missing syscalls (e.g. openat2)
  - Skip fd-based execs (/dev/fd/N) in exepath fallback
  - Fix exepath resolution without enter events (bprm->filename)

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>