add OTEL tracing + make targets + scripts#139
Conversation
There was a problem hiding this comment.
Pull request overview
Adds OpenTelemetry (OTel) tracing across the VM/container startup path, including relaying VM-originated spans back to the host, plus developer tooling (Jaeger + Mermaid Gantt) and an integration test to visualize traces.
Changes:
- Introduce a VM-side TTRPC traces service + channel-backed OTel span exporter to stream spans out of the VM.
- Instrument key host/shim + VM code paths with spans and propagate context over TTRPC.
- Add tracing integration test, Jaeger/Makefile helpers, Mermaid conversion script, and tracing documentation.
Reviewed changes
Copilot reviewed 20 out of 21 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| plugins/vminit/tracing/plugin.go | Registers VM tracing plugin and installs OTel tracer provider/exporter. |
| internal/vminit/tracing/service.go | TTRPC streaming service that forwards exported spans to the host. |
| internal/vminit/tracing/exporter.go | Channel-backed OTel SpanExporter implementation for VM span buffering. |
| internal/vminit/task/service.go | Adds VM-side spans around Create/Start and networking wait/connect. |
| internal/vminit/runc/container.go | Adds VM-side spans around container creation and mount setup. |
| internal/vminit/process/init.go | Adds VM-side spans around crun create/start operations. |
| internal/vminit/ctrnetworking/ctrnetworking.go | Adds VM-side spans per-network setup within errgroup goroutines. |
| internal/vm/libkrun/instance.go | Adds host-side spans for VMStart/WaitForTTRPC + TTRPC client interceptor. |
| internal/shim/task/tracing.go | Implements host-side relay of VM span stream to an OTLP HTTP exporter. |
| internal/shim/task/service.go | Adds shim spans and starts VM trace relay during Create. |
| integration/tracing_test.go | Build-tagged integration test that exports spans to Jaeger and relays VM spans. |
| hack/trace-to-mermaid.py | Fetches latest Jaeger trace and renders it as a Mermaid Gantt chart. |
| docs/tracing.md | Documentation for running tracing workflow with Jaeger + Mermaid. |
| cmd/vminitd/main.go | Ensures tracing plugin is loaded by vminitd. |
| api/proto/nerdbox/services/traces/v1/traces.proto | New Traces service + Span/KeyValue messages for VM→host span streaming. |
| api/services/traces/v1/traces.pb.go | Generated protobuf types for traces service. |
| api/services/traces/v1/traces_ttrpc.pb.go | Generated TTRPC bindings for traces service. |
| api/services/traces/v1/doc.go | Package doc for traces API. |
| Makefile | Adds test-tracing, Jaeger lifecycle, UI open, and Gantt generation targets. |
| go.mod | Adds OTel SDK + OTLP HTTP exporter dependencies. |
| go.sum | Records new module checksums for added dependencies. |
Comments suppressed due to low confidence (1)
internal/vm/libkrun/instance.go:337
- This wait loop has multiple
returnpaths before the explicitttrpcWaitSpan.End()call, so the WaitForTTRPC span may never be ended/exported. Use adefer ttrpcWaitSpan.End()right after starting the span, or restructure to end the span before each early return.
_, ttrpcWaitSpan := tracer.Start(ctx, "libkrun.WaitForTTRPC")
startedAt := time.Now()
for {
select {
case err := <-errC:
if err != nil {
return fmt.Errorf("failure running vm: %w", err)
}
case <-ctx.Done():
return ctx.Err()
case <-time.After(time.Millisecond):
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
a7fd67c to
830e59d
Compare
830e59d to
ca1697e
Compare
ca1697e to
f06a31f
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 21 out of 22 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
internal/vm/libkrun/instance.go:341
- Spans started for VMStart/WaitForTTRPC are only ended on the success path. If Start() returns early (errC error, ctx.Done(), or vmStartTimeout), these spans never end, which produces incomplete traces and can leak span state. Consider deferring span.End() immediately after each tracer.Start (or using a single parent span with defer) so all exit paths close the spans.
tracer := otel.Tracer("nerdbox")
_, krunStartSpan := tracer.Start(ctx, "libkrun.VMStart")
errC := make(chan error)
go func() {
defer close(errC)
if err := v.vmc.Start(); err != nil {
errC <- err
}
}()
v.shutdownCallbacks = []func(context.Context) error{
func(context.Context) error {
cerr := v.vmc.Shutdown()
select {
case err := <-errC:
if err != nil {
return fmt.Errorf("failure running vm: %w", err)
}
default:
}
return cerr
},
}
var conn net.Conn
// Initial TTRPC ping deadline. On Windows, the vsock listen-mode proxy
// has more overhead (host→Unix socket→vsock→guest→vsock→Unix socket→host)
// so we start with a longer deadline.
d := 2 * time.Millisecond
if runtime.GOOS == "windows" {
d = 500 * time.Millisecond
}
_, ttrpcWaitSpan := tracer.Start(ctx, "libkrun.WaitForTTRPC")
startedAt := time.Now()
for {
select {
case err := <-errC:
if err != nil {
return fmt.Errorf("failure running vm: %w", err)
}
case <-ctx.Done():
return ctx.Err()
case <-time.After(time.Millisecond):
}
if time.Since(startedAt) > vmStartTimeout {
log.G(ctx).WithField("timeout", vmStartTimeout).Warn("Timeout while waiting for VM to start")
return fmt.Errorf("VM did not start within %s", vmStartTimeout)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
f06a31f to
b45ef30
Compare
b45ef30 to
d5c5c51
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 21 out of 22 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
d5c5c51 to
f70b6f4
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 21 out of 22 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
7087d9a to
415b643
Compare
415b643 to
378cb97
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 21 out of 22 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
internal/vm/libkrun/instance.go:304
errCis unbuffered and the VM start goroutine does a blockingerrC <- err. IfStartreturns early (e.g.,ctx.Done()or timeout) before receiving fromerrC, that goroutine can deadlock forever on send (and never reachclose(errC)), leaking the goroutine and potentially hanging shutdown logic that expects the channel to close. MakeerrCbuffered (size 1) and/or make the send non-blocking so errors can't block when the caller has already returned.
errC := make(chan error)
go func() {
defer close(errC)
if err := v.vmc.Start(); err != nil {
errC <- err
}
}()
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Add OpenTelemetry instrumentation to trace VM boot latency: - Add libkrun.VMStart and libkrun.WaitForTTRPC spans to instance.go Start() - Add otelttrpc.UnaryClientInterceptor to the TTRPC client - Add build-tagged (//go:build tracing) integration test TestTraceVMBoot - Add Makefile targets: jaeger-start, jaeger-stop, test-tracing, jaeger-open, jaeger-gantt - Add hack/trace-to-mermaid.py to render Jaeger traces as Mermaid gantt charts - Add docs/tracing.md with usage instructions Signed-off-by: David Scott <dave@recoil.org>
Add a Traces.Stream ttrpc service that relays OTel spans from inside the VM back to the host, mirroring the existing Events.Stream pattern. VM-side: - Channel-backed SpanExporter buffers spans (cap 1024, silent drop) - TracerProvider with 100ms batch interval drains into the channel - Traces ttrpc service streams spans to the host on demand - Spans added to: task.Create/Start, runc.NewContainer, crun.create/ start, ctrnetworking.Connect, mountutil.All Host-side (shim): - After VM boot, subscribes to Traces.Stream - Converts proto spans to ReadOnlySpan via tracetest.SpanStub.Snapshot - Corrects VM-vs-host clock skew (RTC has only second-level resolution) using host boot time as sync point - Re-exports via OTLP HTTP with service.name=nerdbox-vm - Spans added to: shim.BundleLoad, shim.VMBoot, shim.BundleTransfer, shim.IOSetup, shim.TaskCreate, shim.Start The relay logic lives in internal/tracing so it can be reused by integration tests without duplication. Integration test extended to exercise Task.Create and Task.Start, producing a 24-span trace visible in Jaeger. Sample trace from TestTraceVMBoot (24 spans, ~80ms total): nerdbox VM.NewInstance+Start........................ 75.3ms nerdbox . libkrun.VMStart........................... 72.4ms nerdbox . libkrun.WaitForTTRPC...................... 72.4ms nerdbox . TTRPC.System.Info......................... 0.2ms nerdbox . . System/Info (interceptor)............... 0.2ms nerdbox . TTRPC.Bundle.Create....................... 0.2ms nerdbox . . Bundle/Create (interceptor)............. 0.2ms nerdbox . TTRPC.Task.Create......................... 1.9ms nerdbox . . Task/Create (interceptor)............... 1.8ms nerdbox-vm . . . task.Create........................... 1.7ms nerdbox-vm . . . . runc.NewContainer................... 1.5ms nerdbox-vm . . . . . mountutil.All..................... 0.0ms nerdbox-vm . . . . . crun.create....................... 1.5ms nerdbox-vm . . . . ctrnetworking.Connect............... 0.0ms nerdbox . TTRPC.Task.Start.......................... 1.3ms nerdbox . . Task/Start (interceptor)................ 1.3ms nerdbox-vm . . . task.Start............................ 1.2ms nerdbox-vm . . . . waitForCtrNetConnect................ 0.0ms nerdbox-vm . . . . container.Start..................... 1.2ms nerdbox-vm . . . . . crun.start........................ 1.2ms Signed-off-by: David Scott <dave@recoil.org> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
378cb97 to
e82db3d
Compare
WithEndpointURL expects a full URL (http://host:port) but OTEL_EXPORTER_OTLP_ENDPOINT is typically just host:port. Switch to WithEndpoint which accepts host:port directly. Always use WithInsecure since the relay targets a local collector. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add a shim-side tracing plugin that: 1. Registers an otelttrpc server interceptor so the shim extracts trace context from incoming daemon TTRPC requests 2. Sets the W3C TraceContext propagator 3. Configures a TracerProvider with an OTLP exporter (when OTEL_EXPORTER_OTLP_ENDPOINT is set) so shim spans are exported Without this, the shim's spans (BundleLoad, VMBoot, BundleTransfer, IOSetup, TaskCreate, Start) landed in a separate trace because the daemon injected trace context but the shim never extracted it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 23 out of 24 changed files in this pull request and generated 16 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if runtime.GOOS == "windows" { | ||
| d = 500 * time.Millisecond | ||
| } | ||
| _, ttrpcWaitSpan := tracer.Start(ctx, "libkrun.WaitForTTRPC") |
There was a problem hiding this comment.
The spans started here ignore the updated context returned by tracer.Start. This makes libkrun.WaitForTTRPC a sibling of libkrun.VMStart (and can break intended nesting/parenting). Use the ctx returned from starting libkrun.VMStart when starting subsequent spans (e.g., ctx, krunStartSpan := ... then ctx, ttrpcWaitSpan := ...).
| _, ttrpcWaitSpan := tracer.Start(ctx, "libkrun.WaitForTTRPC") | |
| ctx, ttrpcWaitSpan := tracer.Start(ctx, "libkrun.WaitForTTRPC") |
| ctx, createSpan := vmTracer.Start(ctx, "task.Create") | ||
| defer createSpan.End() |
There was a problem hiding this comment.
Reassigning the main ctx variable to child-span contexts (netSpan/waitNetSpan/containerStartSpan) causes subsequent spans/operations to be parented under the most recently started span, even after that span has ended. This can unintentionally produce a chained/nested trace structure. Prefer keeping the parent ctx unchanged and using a separate variable for the child span (e.g., netCtx := ctx from Start(...), pass netCtx to Connect, then continue using the original ctx for later work/spans).
| ctx, netSpan := vmTracer.Start(ctx, "ctrnetworking.Connect") | ||
| waitForConnect, err := ctrnetworking.Connect(ctx, r.Bundle, container.Pid()) |
There was a problem hiding this comment.
Reassigning the main ctx variable to child-span contexts (netSpan/waitNetSpan/containerStartSpan) causes subsequent spans/operations to be parented under the most recently started span, even after that span has ended. This can unintentionally produce a chained/nested trace structure. Prefer keeping the parent ctx unchanged and using a separate variable for the child span (e.g., netCtx := ctx from Start(...), pass netCtx to Connect, then continue using the original ctx for later work/spans).
| ctx, netSpan := vmTracer.Start(ctx, "ctrnetworking.Connect") | |
| waitForConnect, err := ctrnetworking.Connect(ctx, r.Bundle, container.Pid()) | |
| netCtx, netSpan := vmTracer.Start(ctx, "ctrnetworking.Connect") | |
| waitForConnect, err := ctrnetworking.Connect(netCtx, r.Bundle, container.Pid()) |
| ctx, startSpan := vmTracer.Start(ctx, "task.Start") | ||
| defer startSpan.End() | ||
|
|
||
| ctx, waitNetSpan := vmTracer.Start(ctx, "waitForCtrNetConnect") | ||
| if err := s.waitForCtrNetConnect(ctx, r.ID); err != nil { | ||
| waitNetSpan.End() | ||
| return nil, errgrpc.ToGRPC(err) | ||
| } | ||
| waitNetSpan.End() |
There was a problem hiding this comment.
Reassigning the main ctx variable to child-span contexts (netSpan/waitNetSpan/containerStartSpan) causes subsequent spans/operations to be parented under the most recently started span, even after that span has ended. This can unintentionally produce a chained/nested trace structure. Prefer keeping the parent ctx unchanged and using a separate variable for the child span (e.g., netCtx := ctx from Start(...), pass netCtx to Connect, then continue using the original ctx for later work/spans).
| ctx, containerStartSpan := vmTracer.Start(ctx, "container.Start") | ||
| p, err := container.Start(ctx, r) |
There was a problem hiding this comment.
Reassigning the main ctx variable to child-span contexts (netSpan/waitNetSpan/containerStartSpan) causes subsequent spans/operations to be parented under the most recently started span, even after that span has ended. This can unintentionally produce a chained/nested trace structure. Prefer keeping the parent ctx unchanged and using a separate variable for the child span (e.g., netCtx := ctx from Start(...), pass netCtx to Connect, then continue using the original ctx for later work/spans).
| ctx, containerStartSpan := vmTracer.Start(ctx, "container.Start") | |
| p, err := container.Start(ctx, r) | |
| startCtx, containerStartSpan := vmTracer.Start(ctx, "container.Start") | |
| p, err := container.Start(startCtx, r) |
| Options: r.Options, | ||
| } | ||
|
|
||
| ctx, taskCreateSpan := shimTracer.Start(ctx, "shim.TaskCreate") |
There was a problem hiding this comment.
Similar to the vminit side, repeatedly overwriting ctx with child-span contexts will chain parentage (each subsequent span becomes a child of the previous span’s context), even though the previous span was ended. If the intent is for these spans to be siblings under the RPC/root span, start child spans using scoped variables (or small helper functions) so the outer ctx remains the parent for later spans.
| } | ||
| return nil, errgrpc.ToGRPC(err) | ||
| } | ||
| taskCreateSpan.End() |
There was a problem hiding this comment.
Similar to the vminit side, repeatedly overwriting ctx with child-span contexts will chain parentage (each subsequent span becomes a child of the previous span’s context), even though the previous span was ended. If the intent is for these spans to be siblings under the RPC/root span, start child spans using scoped variables (or small helper functions) so the outer ctx remains the parent for later spans.
| otlptracehttp.WithEndpoint(endpoint), | ||
| otlptracehttp.WithInsecure(), |
There was a problem hiding this comment.
OTEL_EXPORTER_OTLP_ENDPOINT is commonly set to a full URL (e.g., http://localhost:4318). otlptracehttp.WithEndpoint expects a host[:port] without scheme, so this will fail for typical configurations. Prefer using otlptracehttp.WithEndpointURL(endpoint), or parse/strip the scheme safely before calling WithEndpoint. Also consider not forcing WithInsecure unconditionally (it will break HTTPS endpoints).
| otlptracehttp.WithEndpoint(endpoint), | |
| otlptracehttp.WithInsecure(), | |
| otlptracehttp.WithEndpointURL(endpoint), |
| // The VM's RTC has only second-level resolution, so its wall clock | ||
| // can be up to ~1s behind the host. We compute the offset from the | ||
| // first otelttrpc interceptor span (which is created at the moment | ||
| // the first ttrpc RPC reaches the VM — a known sync point with the | ||
| // host). hostBootTime was captured on the host at the same logical | ||
| // moment (when ttrpc became responsive). | ||
| var clockOffset time.Duration | ||
| offsetComputed := false |
There was a problem hiding this comment.
The implementation computes clockOffset from the first span received on the stream, but the comment (and PR description) describes using a known sync point (e.g., an otelttrpc interceptor span / sending a host timestamp to the VM before tracing). To avoid misleading or inaccurate skew correction, either (a) update the comment/PR description to match the actual behavior and assumptions, or (b) implement an explicit sync mechanism (e.g., a dedicated ‘sync’ span/message or filtering by a deterministic span name) before computing the offset.
| """ | ||
| import json | ||
| import sys | ||
| import tempfile |
There was a problem hiding this comment.
tempfile is imported but not used. Removing unused imports helps keep the script minimal and avoids confusion about intended behavior.
| import tempfile |
Help us understand (and later optimise) VM/container start time.
If you run this:
It produces this:
The biggest thing on the critical path is the VM start + initial connection, which needs breaking down more (future investigation)
We can also share Mermaid diagrams which can be rendered on GitHub by running:
which produces this:
gantt dateFormat x axisFormat %s.%L s title Trace for nerdbox-vm section VM.NewInstance+Start nerdbox/VM.NewInstance+Start :VM_NewInstance+Start_9f6fda, 0, 83 . nerdbox/libkrun.VMStart :libkrun_VMStart_90885a, 5, 82 . nerdbox/libkrun.WaitForTTRPC :libkrun_WaitForTTRPC_10af7c, 5, 82 . nerdbox/TTRPC.Task.Create :TTRPC_Task_Create_5e832c, 83, 85 . . nerdbox/containerd.task.v3.Task/Create :containerd_task_v3_Task_Create_3a4175, 84, 85 . . . containerd.task.v3.Task/Create :containerd_task_v3_Task_Create_61f5b3, 83, 85 . . . . task.Create :task_Create_a0b789, 84, 85 . . . . . runc.NewContainer :runc_NewContainer_f4a190, 84, 85 . . . . . . crun.create :crun_create_beaf4f, 84, 85 . nerdbox/TTRPC.Task.Start :TTRPC_Task_Start_c8de57, 85, 87 . . nerdbox/containerd.task.v3.Task/Start :containerd_task_v3_Task_Start_f0139b, 85, 87 . . . containerd.task.v3.Task/Start :containerd_task_v3_Task_Start_1d5297, 85, 87 . . . . task.Start :task_Start_84961b, 85, 87 . . . . . . container.Start :container_Start_5372d6, 85, 87 . . . . . . . crun.start :crun_start_ce160c, 85, 87The first patch adds the infra and tracing for the host part. This gets most of the benefit IMHO.
The second patch extends the tracing to the VM: this is a bit complicated because I didn't want to assume there was a network, so the spans are relayed back to the host and then forwarded to the collector.
The clock is interesting because the RTC only has 1s granularity, so to correct we send the host timestamp to the VM before the tracing.