Architecture

Slacker is a modern Go application following clean architecture principles.

Principles

This codebase follows a simplicity-first philosophy, summed up by Leonardo da Vinci's "Simplicity is the ultimate sophistication" and by Rob Pike's Go proverbs:

"A little copying is better than a little dependency." "The bigger the interface, the weaker the abstraction."

Key Design Decisions

  1. internal/ packages - Implementation is not importable externally
  2. Interfaces at consumption - Defined where used, not where implemented
  3. No circular dependencies - Clean dependency graph
  4. Context everywhere - All blocking operations accept context.Context
  5. Minimal public surface - Only export what's necessary

Directory Structure

slacker/
├── cmd/
│   ├── server/              # Main server binary
│   └── slack-registrar/     # Slack app registration tool
├── internal/                # Implementation packages (not importable)
│   ├── bot/                # Core orchestration
│   ├── config/             # YAML configuration
│   ├── github/             # GitHub API client
│   ├── notify/             # Notification logic
│   ├── slack/              # Slack API client
│   └── usermapping/        # GitHub↔Slack mapping
└── .claude/                # Claude Code configuration

Data Flow

GitHub Webhook
    ↓
Sprinkler (WebSocket)
    ↓
bot.Coordinator.processEvent()
    ├→ Load config from .codeGROOVE/slack.yaml
    ├→ Analyze PR with turnclient
    ├→ Post to Slack channels
    └→ Schedule DM notifications
        ↓
    notify.Manager
        ├→ Check if user active
        ├→ Apply delay logic
        └→ Send DM via slack.Client

State Management

State uses a hybrid approach - in-memory cache with persistent storage:

In-Memory (Fast Path):

  • PR threads - Cached in bot.ThreadCache (map of PR → Slack thread)
  • Notifications - Tracked in notify.NotificationTracker (when we last DM'd)
  • User mappings - Cached in usermapping.Service (GitHub → Slack, 24h TTL)
  • Config - Cached in config.Manager (per-org YAML, reloaded on push)
  • Event deduplication - Recent events in memory (1 hour window)

Persistent (Survives Restarts):

  • JSON files - Local storage in os.UserCacheDir() (simple, reliable, easy to debug)
  • Event deduplication - Prevents duplicate messages across restarts (24 hour retention)
  • Thread mapping - PR → Slack thread timestamps (30 day retention)
  • DM tracking - When each user was last notified (90 day retention)
  • Optional Datastore - Google Cloud Datastore for multi-instance coordination

The JSON store provides reliable single-instance operation. Datastore adds cross-instance deduplication for rolling deployments.

Reliability Features

  • Persistent event deduplication - Uses both persistent state and in-memory cache to prevent duplicate messages across restarts
  • Cross-instance coordination - 100ms delay + Slack history search prevents duplicate thread creation during rolling deployments
  • Startup reconciliation - On startup, checks all open PRs from last 24 hours and sends any missed notifications
  • Periodic polling - Every 5 minutes as a safety net to catch anything webhooks missed
  • Automatic cleanup - Hourly cleanup removes old state (events >24h, threads >30d, DMs >90d)

Concurrency

Safe Patterns

  • All caches use sync.RWMutex for thread-safety
  • Channel processing uses sync.WaitGroup for parallel execution
  • DM sending runs in separate goroutines with timeouts
  • Contexts propagate cancellation through the stack
  • Double-check locking prevents duplicate thread creation races

Key Goroutines

  1. HTTP server - Handles Slack webhooks
  2. Bot coordinators - One per GitHub org (long-running)
  3. Notification scheduler - Checks for pending notifications
  4. DM senders - Fire-and-forget with 2min timeout

Error Handling

Errors are wrapped for context:

if err != nil {
    return fmt.Errorf("failed to post thread: %w", err)
}

Then checked with errors.Is() for specific handling.

Retry Strategy

External API calls use exponential backoff with jitter:

retry.Do(fn,
    retry.Attempts(5),
    retry.Delay(2*time.Second),
    retry.MaxDelay(2*time.Minute),
    retry.DelayType(retry.BackOffDelay),
    retry.MaxJitter(time.Second),
)

Testing Strategy

Current State

  • Unit tests for usermapping package
  • Integration tests would require mocking external APIs

How to Add Tests

  1. Define interface in your test file:

    type slackClient interface {
        PostThread(ctx context.Context, channelID, text string) (string, error)
    }
  2. Create simple mock:

    type mockSlack struct {
        postThreadFunc func(context.Context, string, string) (string, error)
    }
  3. Use table-driven tests:

    tests := []struct {
        name string
        want string
    }{
        {"case1", "expected1"},
    }
    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) { /* assert against tt.want */ })
    }

Don't create a separate mocks package unless you need to share mocks.

Configuration

Configuration is pull-based from GitHub repos:

# .codeGROOVE/slack.yaml in target repo
global:
    slack: workspace.slack.com
    reminder_dm_delay: 65  # minutes

channels:
    engineering:
        repos: ["backend", "frontend"]

The bot reads this file when processing PRs. Changes take effect on next PR event.

Deployment

Built as a single static binary. No runtime dependencies.

Environment Variables

GITHUB_APP_ID=123456
GITHUB_PRIVATE_KEY=-----BEGIN RSA PRIVATE KEY-----
SLACK_SIGNING_SECRET=abc123
SPRINKLER_URL=wss://sprinkler.example.com/ws

Secrets are fetched from Google Secret Manager if not in environment.

Health Checks

  • /health - Basic liveness (is server responding?)
  • /healthz - Detailed readiness (are coordinators running?)

Performance

Caching Strategy

  • Slack API responses - Cached with TTL (team info: 1h, bot info: 1h)
  • Channel resolution - Cached to avoid repeated lookups
  • User mappings - 24h TTL, lazy cleanup
  • PR threads - Indefinite (until coordinator restarts)

Optimizations

  1. Parallel channel processing - WaitGroup for concurrent Slack posts
  2. Async DM sending - Don't block PR processing
  3. Lazy caching - Only cache on first miss
  4. Context timeouts - 30s for turnclient, 2min for DMs

Security

  1. Webhook signature verification - All Slack requests verified with HMAC
  2. Token isolation - Each workspace has separate Slack token in GSM
  3. No token logging - Secrets never logged
  4. Rate limiting - Built into retry logic
  5. Input validation - Channel names, user IDs sanitized

Observability

Logging

Structured logging with slog:

slog.Info("processing PR",
    "owner", owner,
    "repo", repo,
    "number", prNumber,
    "state", prState)

Log levels: Debug (development), Info (production), Warn (recoverable), Error (requires attention).

Future: Metrics

Add Prometheus metrics:

prProcessed.WithLabelValues(owner, repo, state).Inc()
apiLatency.WithLabelValues("slack", "post_message", "200").Observe(duration)

Common Patterns

Context Usage

// Pass context through
func process(ctx context.Context, ch <-chan string) error {
    // Use for cancellation
    select {
    case <-ctx.Done():
        return ctx.Err()
    case result := <-ch:
        _ = result // ...
        return nil
    }
}

Error Wrapping

if err != nil {
    return fmt.Errorf("operation failed for %s: %w", id, err)
}

Graceful Shutdown

eg, ctx := errgroup.WithContext(ctx)
eg.Go(func() error {
    <-ctx.Done()
    // context.WithTimeout returns a cancel func that must be released
    shutdownCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()
    return server.Shutdown(shutdownCtx)
})

Future Enhancements

Potential improvements (not currently needed):

  1. Persistent cache - Redis for state across restarts
  2. Circuit breakers - Prevent cascade failures
  3. Distributed tracing - OpenTelemetry
  4. Metrics - Prometheus/Grafana
  5. Integration tests - Test harness with mocks

The current design supports all of these without major refactoring.