Slacker is a modern Go application following clean architecture principles.
This codebase follows Rob Pike's philosophy:

- "Simplicity is the ultimate sophistication."
- "A little copying is better than a little dependency."
- "The bigger the interface, the weaker the abstraction."
- `internal/` packages - Implementation is not importable externally
- Interfaces at consumption - Defined where used, not where implemented
- No circular dependencies - Clean dependency graph
- Context everywhere - All blocking operations accept `context.Context`
- Minimal public surface - Only export what's necessary
```
slacker/
├── cmd/
│   ├── server/           # Main server binary
│   └── slack-registrar/  # Slack app registration tool
├── internal/             # Implementation packages (not importable)
│   ├── bot/              # Core orchestration
│   ├── config/           # YAML configuration
│   ├── github/           # GitHub API client
│   ├── notify/           # Notification logic
│   ├── slack/            # Slack API client
│   └── usermapping/      # GitHub ↔ Slack mapping
└── .claude/              # Claude Code configuration
```
```
GitHub Webhook
    ↓
Sprinkler (WebSocket)
    ↓
bot.Coordinator.processEvent()
    ├→ Load config from .codeGROOVE/slack.yaml
    ├→ Analyze PR with turnclient
    ├→ Post to Slack channels
    └→ Schedule DM notifications
            ↓
        notify.Manager
            ├→ Check if user active
            ├→ Apply delay logic
            └→ Send DM via slack.Client
```
State management uses a hybrid approach - in-memory caches backed by persistent storage:
In-Memory (Fast Path):
- PR threads - Cached in `bot.ThreadCache` (map of PR → Slack thread)
- Notifications - Tracked in `notify.NotificationTracker` (when we last DM'd)
- User mappings - Cached in `usermapping.Service` (GitHub → Slack, 24h TTL)
- Config - Cached in `config.Manager` (per-org YAML, reloaded on push)
- Event deduplication - Recent events in memory (1 hour window)
Persistent (Survives Restarts):
- JSON files - Local storage in `os.UserCacheDir()` (simple, reliable, easy to debug)
- Event deduplication - Prevents duplicate messages across restarts (24 hour retention)
- Thread mapping - PR → Slack thread timestamps (30 day retention)
- DM tracking - When each user was last notified (90 day retention)
- Optional Datastore - Google Cloud Datastore for multi-instance coordination
The JSON store provides reliable single-instance operation. Datastore adds cross-instance deduplication for rolling deployments.
- Persistent event deduplication - Uses both persistent state and in-memory cache to prevent duplicate messages across restarts
- Cross-instance coordination - 100ms delay + Slack history search prevents duplicate thread creation during rolling deployments
- Startup reconciliation - On startup, checks all open PRs from last 24 hours and sends any missed notifications
- Periodic polling - Every 5 minutes as a safety net to catch anything webhooks missed
- Automatic cleanup - Hourly cleanup removes old state (events >24h, threads >30d, DMs >90d)
- All caches use `sync.RWMutex` for thread safety
- Channel processing uses `sync.WaitGroup` for parallel execution
- DM sending runs in separate goroutines with timeouts
- Contexts propagate cancellation through the stack
- Double-check locking prevents duplicate thread creation races
- HTTP server - Handles Slack webhooks
- Bot coordinators - One per GitHub org (long-running)
- Notification scheduler - Checks for pending notifications
- DM senders - Fire-and-forget with 2min timeout
Errors are wrapped for context:

```go
if err != nil {
    return fmt.Errorf("failed to post thread: %w", err)
}
```

Then checked with `errors.Is()` for specific handling.
External API calls use exponential backoff with jitter:

```go
retry.Do(fn,
    retry.Attempts(5),
    retry.Delay(2*time.Second),
    retry.MaxDelay(2*time.Minute),
    retry.DelayType(retry.BackOffDelay),
    retry.MaxJitter(time.Second),
)
```

- Unit tests for the `usermapping` package
- Integration tests would require mocking external APIs
1. Define the interface in your test file:

   ```go
   type slackClient interface {
       PostThread(ctx context.Context, channelID, text string) (string, error)
   }
   ```

2. Create a simple mock:

   ```go
   type mockSlack struct {
       postThreadFunc func(context.Context, string, string) (string, error)
   }
   ```

3. Use table-driven tests:

   ```go
   tests := []struct {
       name string
       want string
   }{
       {"case1", "expected1"},
   }
   ```
Don't create a separate mocks package unless you need to share mocks.
Configuration is pull-based from GitHub repos:

```yaml
# .codeGROOVE/slack.yaml in target repo
global:
  slack: workspace.slack.com
  reminder_dm_delay: 65 # minutes
channels:
  engineering:
    repos: ["backend", "frontend"]
```

The bot reads this file when processing PRs. Changes take effect on the next PR event.
Built as a single static binary. No runtime dependencies.

```sh
GITHUB_APP_ID=123456
GITHUB_PRIVATE_KEY=-----BEGIN RSA PRIVATE KEY-----
SLACK_SIGNING_SECRET=abc123
SPRINKLER_URL=wss://sprinkler.example.com/ws
```

Secrets are fetched from Google Secret Manager if not set in the environment.
- `/health` - Basic liveness (is the server responding?)
- `/healthz` - Detailed readiness (are the coordinators running?)
- Slack API responses - Cached with TTL (team info: 1h, bot info: 1h)
- Channel resolution - Cached to avoid repeated lookups
- User mappings - 24h TTL, lazy cleanup
- PR threads - Indefinite (until coordinator restarts)
- Parallel channel processing - WaitGroup for concurrent Slack posts
- Async DM sending - Don't block PR processing
- Lazy caching - Only cache on first miss
- Context timeouts - 30s for turnclient, 2min for DMs
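The parallel channel posting bullet can be sketched with a `sync.WaitGroup`; `post` stands in for the real Slack call:

```go
package main

import (
	"fmt"
	"sync"
)

// postToChannels fans one PR announcement out to all target channels in
// parallel and waits for every post to finish before returning.
func postToChannels(channels []string, post func(channel string) string) []string {
	results := make([]string, len(channels))
	var wg sync.WaitGroup
	for i, ch := range channels {
		wg.Add(1)
		go func(i int, ch string) {
			defer wg.Done()
			results[i] = post(ch) // each goroutine writes its own slot: no lock needed
		}(i, ch)
	}
	wg.Wait()
	return results
}

func main() {
	got := postToChannels([]string{"#eng", "#ops", "#sec"}, func(ch string) string {
		return "posted to " + ch
	})
	fmt.Println(got) // [posted to #eng posted to #ops posted to #sec]
}
```

Indexing `results` by position keeps output order deterministic even though the posts run concurrently.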
- Webhook signature verification - All Slack requests verified with HMAC
- Token isolation - Each workspace has separate Slack token in GSM
- No token logging - Secrets never logged
- Rate limiting - Built into retry logic
- Input validation - Channel names, user IDs sanitized
Structured logging with `slog`:

```go
slog.Info("processing PR",
    "owner", owner,
    "repo", repo,
    "number", prNumber,
    "state", prState)
```

Log levels: Debug (development), Info (production), Warn (recoverable), Error (requires attention).
Add Prometheus metrics:

```go
prProcessed.WithLabelValues(owner, repo, state).Inc()
apiLatency.WithLabelValues("slack", "post_message", "200").Observe(duration)
```

Pass context through and use it for cancellation:

```go
func process(ctx context.Context, ...) error {
    select {
    case <-ctx.Done():
        return ctx.Err()
    case result := <-ch:
        // ...
    }
}
```

Wrap errors with identifying context:

```go
if err != nil {
    return fmt.Errorf("operation failed for %s: %w", id, err)
}
```

Shut down gracefully with `errgroup`:

```go
eg, ctx := errgroup.WithContext(ctx)
eg.Go(func() error {
    <-ctx.Done()
    shutdownCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()
    return server.Shutdown(shutdownCtx)
})
```

Potential improvements (not currently needed):
- Persistent cache - Redis for state across restarts
- Circuit breakers - Prevent cascade failures
- Distributed tracing - OpenTelemetry
- Metrics - Prometheus/Grafana
- Integration tests - Test harness with mocks
The current design supports all of these without major refactoring.