This document describes the overall architecture and package organization of the SUSE Observability Backup CLI.
The codebase follows several key principles:
- Layered Architecture: Dependencies flow from higher layers (commands) to lower layers (foundation utilities)
- Self-Documenting Structure: Directory hierarchy makes dependency rules and module purposes explicit
- Clean Separation: Domain logic, infrastructure, and presentation are clearly separated
- Testability: Lower layers can be tested independently without external dependencies
- Reusability: Shared functionality is extracted into appropriate packages
stackstate-backup-cli/
├── cmd/ # Command-line interface (Layer 4)
│ ├── root.go # Root command and global flags
│ ├── version/ # Version information command
│ ├── elasticsearch/ # Elasticsearch backup/restore commands
│ ├── clickhouse/ # ClickHouse backup/restore commands
│ ├── stackgraph/ # Stackgraph backup/restore commands
│ ├── victoriametrics/ # VictoriaMetrics backup/restore commands
│ └── settings/ # Settings backup/restore commands
│
├── internal/ # Internal packages (Layers 0-3)
│ ├── foundation/ # Layer 0: Core utilities
│ │ ├── config/ # Configuration management
│ │ ├── logger/ # Structured logging
│ │ └── output/ # Output formatting
│ │
│ ├── clients/ # Layer 1: Service clients
│ │ ├── k8s/ # Kubernetes client
│ │ ├── elasticsearch/ # Elasticsearch client
│ │ ├── clickhouse/ # ClickHouse client
│ │ └── s3/ # S3/Minio client
│ │
│ ├── orchestration/ # Layer 2: Workflows
│ │ ├── portforward/ # Port-forwarding orchestration
│ │ ├── scale/ # Deployment/StatefulSet scaling workflows
│ │ ├── restore/ # Restore job orchestration
│ │ └── restorelock/ # Restore lock mechanism (prevents parallel restores)
│ │
│ ├── app/ # Layer 3: Dependency Container
│ │ └── app.go # Application context and dependency injection
│ │
│ └── scripts/ # Embedded bash scripts
│
├── main.go # Application entry point
├── ARCHITECTURE.md # This file
└── README.md # User documentation
Layer 4: cmd/ (commands)
Purpose: User-facing CLI commands and application entry points
Characteristics:
- Implements the Cobra command structure
- Handles user input validation and flag parsing
- Delegates to orchestration and client layers via app context
- Minimal business logic (thin command layer)
- Formats output for end users
Key Packages:
- cmd/elasticsearch/: Elasticsearch snapshot/restore commands (configure, list, list-indices, restore, check-and-finalize)
- cmd/clickhouse/: ClickHouse backup/restore commands (list, restore, check-and-finalize)
- cmd/stackgraph/: Stackgraph backup/restore commands (list, restore, check-and-finalize)
- cmd/victoriametrics/: VictoriaMetrics backup/restore commands (list, restore, check-and-finalize)
- cmd/settings/: Settings backup/restore commands (list, restore, check-and-finalize)
- cmd/version/: Version information
Dependency Rules:
- ✅ Can import: internal/app/* (preferred), all other internal/ packages
- ❌ Should not: Create clients directly, contain business logic
Layer 3: internal/app/ (dependency container)
Purpose: Centralized dependency initialization and injection
Characteristics:
- Creates and wires all application dependencies
- Provides single entry point for dependency creation
- Eliminates boilerplate from commands
- Improves testability through centralized mocking
Key Components:
- Context: Struct holding all dependencies (K8s client, S3 client, ES client, config, logger, formatter)
- NewContext(): Factory function creating production dependencies from global flags
Usage Pattern:
// In command files
appCtx, err := app.NewContext(globalFlags)
if err != nil {
    return err
}
// All dependencies are available as fields on appCtx:
//   appCtx.K8sClient, appCtx.S3Client, appCtx.ESClient,
//   appCtx.Config, appCtx.Logger, appCtx.Formatter
Dependency Rules:
- ✅ Can import: All internal/ packages
- ✅ Used by: cmd/ layer only
- ❌ Should not: Contain business logic or orchestration
Layer 2: internal/orchestration/ (workflows)
Purpose: High-level workflows that coordinate multiple services
Characteristics:
- Composes multiple clients to implement complex workflows
- Handles sequencing and error recovery
- Provides logging and user feedback
- Stateless operations
Key Packages:
- portforward/: Manages Kubernetes port-forwarding lifecycle
- scale/: Deployment and StatefulSet scaling workflows with detailed logging
- restore/: Restore job orchestration (confirmation, job lifecycle, finalization, resource management)
- restorelock/: Prevents parallel restore operations using Kubernetes annotations
Dependency Rules:
- ✅ Can import: internal/foundation/*, internal/clients/*
- ❌ Cannot import: Other internal/orchestration/* (to prevent circular dependencies)
Layer 1: internal/clients/ (service clients)
Purpose: Wrappers for external service APIs
Characteristics:
- Thin abstraction over external APIs
- Handles connection and authentication
- Translates between external formats and internal types
- No business logic or orchestration
Key Packages:
- k8s/: Kubernetes API operations (Jobs, Pods, Deployments, ConfigMaps, Secrets, Logs)
- elasticsearch/: Elasticsearch HTTP API (snapshots, indices, datastreams)
- clickhouse/: ClickHouse Backup API and SQL operations (backups, restore operations, status tracking)
- s3/: S3/Minio operations (client creation, object filtering)
Dependency Rules:
- ✅ Can import: internal/foundation/*, standard library, external SDKs
- ❌ Cannot import: internal/orchestration/*, other internal/clients/*
Layer 0: internal/foundation/ (core utilities)
Purpose: Core utilities with no internal dependencies
Characteristics:
- Pure utility functions
- No external service dependencies
- Broadly reusable across the application
- Well-tested and stable
Key Packages:
- config/: Configuration loading from ConfigMaps, Secrets, environment, and flags
- logger/: Structured logging with levels (Debug, Info, Warning, Error, Success)
- output/: Output formatting (tables, JSON, YAML, messages)
Dependency Rules:
- ✅ Can import: Standard library, external utility libraries
- ❌ Cannot import: Any internal/ packages
1. User invokes CLI command
└─> cmd/victoriametrics/restore.go (or stackgraph/restore.go)
│
2. Parse flags and validate input
└─> Cobra command receives global flags
│
3. Create application context with dependencies
└─> app.NewContext(globalFlags)
├─> internal/clients/k8s/ (K8s client)
├─> internal/foundation/config/ (Load from ConfigMap/Secret)
├─> internal/clients/s3/ (S3/Minio client)
├─> internal/foundation/logger/ (Logger)
└─> internal/foundation/output/ (Formatter)
│
4. Execute business logic with injected dependencies
└─> runRestore(appCtx)
├─> internal/orchestration/restore/ (User confirmation)
├─> internal/orchestration/scale/ (Scale down StatefulSets)
├─> internal/orchestration/restore/ (Ensure resources: ConfigMaps, Secrets)
├─> internal/clients/k8s/ (Create restore Job)
├─> internal/orchestration/restore/ (Wait for completion & cleanup)
└─> internal/orchestration/scale/ (Scale up StatefulSets)
│
5. Format and display results
└─> appCtx.Formatter.PrintTable() or PrintJSON()
All dependencies are created once and injected via app.Context:
// Before (repeated in every command)
func runList(globalFlags *config.CLIGlobalFlags) error {
    k8sClient, _ := k8s.NewClient(...)
    cfg, _ := config.LoadConfig(...)
    s3Client, _ := s3.NewClient(...)
    log := logger.New(...)
    formatter := output.NewFormatter(...)
    // ... use dependencies
}
// After (centralized creation)
func runList(appCtx *app.Context) error {
    // All dependencies are available immediately as fields on appCtx:
    //   appCtx.K8sClient, appCtx.Config, appCtx.S3Client,
    //   appCtx.Logger, appCtx.Formatter
    // ... use dependencies
}
Benefits:
- Eliminates boilerplate from commands (30-50 lines per command)
- Centralized dependency creation makes testing easier
- Single source of truth for dependency wiring
- Commands are thinner and more focused on business logic
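The testability benefit can be sketched as follows. This is a minimal illustration, not the project's actual code: the `SnapshotLister` interface, the one-field `Context`, and the `fakeES` test double are all hypothetical stand-ins for the real `app.Context` and its clients.

```go
package main

import (
	"errors"
	"fmt"
)

// SnapshotLister is a hypothetical narrow interface over the Elasticsearch client.
type SnapshotLister interface {
	ListSnapshots(repository string) ([]string, error)
}

// Context is a simplified stand-in for app.Context; the real struct holds more fields.
type Context struct {
	ESClient SnapshotLister
}

// runList depends only on the injected context, never on a concrete client.
func runList(appCtx *Context, repository string) (int, error) {
	snapshots, err := appCtx.ESClient.ListSnapshots(repository)
	if err != nil {
		return 0, fmt.Errorf("listing snapshots: %w", err)
	}
	return len(snapshots), nil
}

// fakeES is a test double that returns canned data or a canned error.
type fakeES struct {
	snapshots []string
	err       error
}

func (f fakeES) ListSnapshots(string) ([]string, error) {
	return f.snapshots, f.err
}

// errFake simulates an unreachable backend.
var errFake = errors.New("connection refused")

func main() {
	ok := &Context{ESClient: fakeES{snapshots: []string{"snap-1", "snap-2"}}}
	n, err := runList(ok, "backups")
	fmt.Println(n, err)

	failing := &Context{ESClient: fakeES{err: errFake}}
	_, err = runList(failing, "backups")
	fmt.Println(err)
}
```

Because the command function only sees the context, a test can swap in a fake for any dependency without touching Kubernetes or Elasticsearch.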
Configuration is loaded with the following precedence (highest to lowest):
1. CLI Flags: Explicit user input
2. Environment Variables: Runtime configuration
3. Kubernetes Secret: Sensitive credentials (overrides ConfigMap)
4. Kubernetes ConfigMap: Base configuration
5. Defaults: Fallback values
Implementation: internal/foundation/config/config.go
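One way to picture the precedence rule is a first-match lookup over sources ordered from highest to lowest. The sketch below is illustrative only and does not claim to mirror config.go: the `source` type, the `resolve` function, and every key and value are made up for the example.

```go
package main

import "fmt"

// source is one configuration layer; a key missing from the map is "not set" there.
type source struct {
	name   string
	values map[string]string
}

// resolve returns the value from the highest-precedence source that sets key,
// along with that source's name. Sources must be ordered highest to lowest.
func resolve(key string, sources []source) (value, from string, ok bool) {
	for _, s := range sources {
		if v, found := s.values[key]; found {
			return v, s.name, true
		}
	}
	return "", "", false
}

// exampleSources builds the five layers with hypothetical keys and values.
func exampleSources() []source {
	return []source{
		{"flags", map[string]string{"namespace": "prod"}},
		{"env", map[string]string{"namespace": "staging", "s3.endpoint": "minio.env:9000"}},
		{"secret", map[string]string{"s3.accessKey": "from-secret"}},
		{"configmap", map[string]string{"s3.accessKey": "from-configmap", "s3.endpoint": "minio:9000", "timeout": "30s"}},
		{"defaults", map[string]string{"namespace": "default", "timeout": "10s"}},
	}
}

func main() {
	for _, key := range []string{"namespace", "s3.endpoint", "s3.accessKey", "timeout"} {
		v, from, _ := resolve(key, exampleSources())
		fmt.Printf("%s = %s (from %s)\n", key, v, from)
	}
}
```

In this example the Secret value for `s3.accessKey` wins over the ConfigMap value, and `timeout` falls through to the ConfigMap because no higher layer sets it.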
Clients are created with a consistent factory pattern:
// Example from internal/clients/elasticsearch/client.go
func NewClient(endpoint string) (*Client, error) {
    // Initialization logic
    return &Client{...}, nil
}
Services running in Kubernetes are accessed via automatic port-forwarding:
// Example from internal/orchestration/portforward/portforward.go
pf, err := SetupPortForward(k8sClient, namespace, service, localPort, remotePort, log)
if err != nil {
    return err
}
defer close(pf.StopChan) // Automatic cleanup
Deployments and StatefulSets are scaled down before restore operations and scaled up afterward:
// Example usage
scaledResources, err := scale.ScaleDown(k8sClient, namespace, selector, log)
if err != nil {
    return err
}
defer scale.ScaleUpFromAnnotations(k8sClient, namespace, selector, log)
Note: Scaling supports both Deployments and StatefulSets through a unified interface.
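The scale-down/scale-up round trip can be sketched in memory. This assumes, as the name `ScaleUpFromAnnotations` suggests but the text does not confirm, that the original replica count is recorded in an annotation before scaling to zero. The `workload` type and the annotation key below are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
)

// replicasAnnotation is a hypothetical annotation key, not the tool's real one.
const replicasAnnotation = "example.com/pre-restore-replicas"

// workload is an in-memory stand-in for a Deployment or StatefulSet.
type workload struct {
	Kind        string
	Name        string
	Replicas    int
	Annotations map[string]string
}

// scaleDown records each running workload's replica count in an annotation,
// then sets its replicas to 0.
func scaleDown(workloads []*workload) {
	for _, w := range workloads {
		if w.Replicas == 0 {
			continue // already stopped; keep any previously recorded count
		}
		if w.Annotations == nil {
			w.Annotations = map[string]string{}
		}
		w.Annotations[replicasAnnotation] = strconv.Itoa(w.Replicas)
		w.Replicas = 0
	}
}

// scaleUpFromAnnotations restores recorded replica counts and removes the annotation.
func scaleUpFromAnnotations(workloads []*workload) error {
	for _, w := range workloads {
		recorded, ok := w.Annotations[replicasAnnotation]
		if !ok {
			continue // nothing was recorded for this workload
		}
		n, err := strconv.Atoi(recorded)
		if err != nil {
			return fmt.Errorf("%s/%s: bad replica annotation %q: %w", w.Kind, w.Name, recorded, err)
		}
		w.Replicas = n
		delete(w.Annotations, replicasAnnotation)
	}
	return nil
}

func main() {
	workloads := []*workload{
		{Kind: "Deployment", Name: "api", Replicas: 3},
		{Kind: "StatefulSet", Name: "db", Replicas: 1},
	}
	scaleDown(workloads)
	fmt.Println("after scale down:", workloads[0].Replicas, workloads[1].Replicas)
	if err := scaleUpFromAnnotations(workloads); err != nil {
		panic(err)
	}
	fmt.Println("after scale up:", workloads[0].Replicas, workloads[1].Replicas)
}
```

Keeping the count on the resource itself means a later invocation (for example a separate check-and-finalize run) can restore it without any local state.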
Common restore operations are centralized in the restore orchestration layer:
// User confirmation
if !restore.PromptForConfirmation() {
    return fmt.Errorf("operation cancelled")
}
// Wait for job completion and cleanup
restore.PrintWaitingMessage(log, "service-name", jobName, namespace)
err := restore.WaitAndCleanup(k8sClient, namespace, jobName, log, cleanupPVC)
// Check and finalize background jobs
err = restore.CheckAndFinalize(restore.CheckAndFinalizeParams{
    K8sClient: k8sClient,
    Namespace: namespace,
    JobName: jobName,
    ServiceName: "service-name",
    ScaleSelector: config.ScaleDownLabelSelector,
    CleanupPVC: true,
    WaitForJob: false,
    Log: log,
})
Benefits:
- Eliminates duplicate code between Stackgraph and VictoriaMetrics restore commands
- Consistent user experience across services
- Centralized job lifecycle management and cleanup
All operations use structured logging with consistent levels and emoji prefixes for visual clarity:
log.Infof("Starting operation...") // No prefix
log.Debugf("Detail: %v", detail) // 🛠️ DEBUG:
log.Warningf("Non-fatal issue: %v", warning) // ⚠️ Warning:
log.Errorf("Operation failed: %v", err) // ❌ Error:
log.Successf("Operation completed") // ✅
The restorelock package prevents parallel restore operations that could corrupt data:
// Scale down with automatic lock acquisition
scaledApps, err := scale.ScaleDownWithLock(scale.ScaleDownWithLockParams{
    K8sClient: k8sClient,
    Namespace: namespace,
    LabelSelector: selector,
    Datastore: config.DatastoreStackgraph,
    AllSelectors: config.GetAllScaleDownSelectors(),
    Log: log,
})
if err != nil {
    return err
}
// Scale up and release lock
defer scale.ScaleUpAndReleaseLock(k8sClient, namespace, selector, log)
How it works:
- Before scaling down, checks for existing restore locks on Deployments/StatefulSets
- Detects conflicts for the same datastore or mutually exclusive datastores (e.g., Stackgraph and Settings)
- Sets annotations (stackstate.com/restore-in-progress, stackstate.com/restore-started-at) on resources
- Releases locks when scaling up or on failure
Mutual Exclusion Groups:
- Stackgraph and Settings restores are mutually exclusive (both modify HBase data)
- Other datastores (Elasticsearch, ClickHouse, VictoriaMetrics) are independent
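The conflict rule above (same datastore, or two datastores in one mutual exclusion group) can be sketched as a small pure function. The datastore strings, the `exclusionGroups` table, and the assumption that each lock annotation records the datastore name are illustrative, not taken from the restorelock package.

```go
package main

import "fmt"

// exclusionGroups lists datastores that must not be restored concurrently.
var exclusionGroups = [][]string{
	{"stackgraph", "settings"}, // both modify HBase data
}

func contains(items []string, s string) bool {
	for _, item := range items {
		if item == s {
			return true
		}
	}
	return false
}

// conflicts reports whether a restore of requested must not run while a
// restore of active is in progress.
func conflicts(requested, active string) bool {
	if requested == active {
		return true
	}
	for _, group := range exclusionGroups {
		if contains(group, requested) && contains(group, active) {
			return true
		}
	}
	return false
}

// checkLocks returns an error if any active lock blocks the requested restore.
// activeLocks maps a resource name to the datastore recorded in its lock annotation.
func checkLocks(requested string, activeLocks map[string]string) error {
	for resource, datastore := range activeLocks {
		if conflicts(requested, datastore) {
			return fmt.Errorf("restore of %s blocked: %s is locked by a %s restore",
				requested, resource, datastore)
		}
	}
	return nil
}

func main() {
	locks := map[string]string{"deployment/server": "stackgraph"}
	fmt.Println(checkLocks("settings", locks))
	fmt.Println(checkLocks("victoriametrics", locks))
}
```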
Unit tests:
- Location: Same directory as source (e.g., config_test.go)
- Focus: Business logic, parsing, validation
- Mocking: Use interfaces for external dependencies
Integration tests:
- Location: cmd/*/ directories
- Focus: Command execution with mocked Kubernetes
- Tools: fake.NewSimpleClientset() from k8s.io/client-go
End-to-end tests:
- Status: Not yet implemented
- Future: Use kind or k3s for local Kubernetes cluster testing
Adding a new command:
- Create command file in cmd/<service>/
- Implement Cobra command structure
- Use existing clients or create new ones in internal/clients/
- Implement workflow in internal/orchestration/ if needed
- Add tests following existing patterns
Adding a new client:
- Create package in internal/clients/<service>/
- Implement client factory: NewClient(...) (*Client, error)
- Only import internal/foundation/* packages
- Add methods for each API operation
- Write unit tests with mocked HTTP/API calls
Adding a new orchestration workflow:
- Create package in internal/orchestration/<workflow>/
- Import required clients from internal/clients/*
- Import utilities from internal/foundation/*
- Keep workflows stateless
- Add comprehensive logging
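For the client-related steps (a `NewClient` factory, one method per API operation, tests with mocked HTTP calls), a minimal sketch might look like the following. The `/backups` route, the `ListBackups` method, and the `fakeHTTP` helper are invented for illustration. The fake transport is just one way to avoid real network calls in tests.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// Client is a hypothetical thin wrapper around a service's HTTP API.
type Client struct {
	endpoint string
	http     *http.Client
}

// NewClient follows the factory shape NewClient(...) (*Client, error).
func NewClient(endpoint string, httpClient *http.Client) (*Client, error) {
	if endpoint == "" {
		return nil, errors.New("endpoint must not be empty")
	}
	if httpClient == nil {
		httpClient = http.DefaultClient
	}
	return &Client{endpoint: strings.TrimRight(endpoint, "/"), http: httpClient}, nil
}

// ListBackups calls a made-up GET /backups route that returns a JSON array of names.
func (c *Client) ListBackups() ([]string, error) {
	resp, err := c.http.Get(c.endpoint + "/backups")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	var names []string
	if err := json.NewDecoder(resp.Body).Decode(&names); err != nil {
		return nil, fmt.Errorf("decoding response: %w", err)
	}
	return names, nil
}

// roundTripFunc lets a plain function act as an http.RoundTripper.
type roundTripFunc func(*http.Request) (*http.Response, error)

func (f roundTripFunc) RoundTrip(r *http.Request) (*http.Response, error) { return f(r) }

// fakeHTTP returns an *http.Client that answers every request with the given
// status and body, without opening a network connection.
func fakeHTTP(status int, body string) *http.Client {
	return &http.Client{Transport: roundTripFunc(func(r *http.Request) (*http.Response, error) {
		return &http.Response{
			StatusCode: status,
			Header:     make(http.Header),
			Body:       io.NopCloser(strings.NewReader(body)),
			Request:    r,
		}, nil
	})}
}

func main() {
	c, err := NewClient("http://backup.example:8080/", fakeHTTP(http.StatusOK, `["daily-1","daily-2"]`))
	if err != nil {
		panic(err)
	}
	names, err := c.ListBackups()
	fmt.Println(names, err)
}
```

The client imports nothing from other internal layers, which keeps it within the Layer 1 dependency rules.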
// BAD: internal/clients/elasticsearch/backup.go
import "github.com/.../internal/clients/k8s" // Violates layer rules
Fix: Move the orchestration logic to internal/orchestration/
// BAD: cmd/elasticsearch/restore.go
func runRestore() {
    // 200 lines of business logic here
}
Fix: Extract logic to orchestration or client packages
// BAD: internal/foundation/config/loader.go
import "github.com/.../internal/foundation/output"
Fix: Foundation packages should be independent
// BAD
endpoint := "http://localhost:9200"
Fix: Use configuration management: config.Elasticsearch.Service.Name
// BAD: cmd/elasticsearch/list.go
func runListSnapshots(globalFlags *config.CLIGlobalFlags) error {
    k8sClient, _ := k8s.NewClient(globalFlags.Kubeconfig, globalFlags.Debug)
    esClient, _ := elasticsearch.NewClient("http://localhost:9200")
    // ... use clients
}
Fix: Use app.Context for dependency injection:
// GOOD
func runListSnapshots(appCtx *app.Context) error {
    // Dependencies are already created and available as fields:
    //   appCtx.K8sClient, appCtx.ESClient
    // ... use clients
}
Verify architectural rules with these commands:
# Each check prints "importer -> imported" for every violation; no output means the rule holds.
# Verify foundation/ has no internal/ imports
go list -f '{{$p := .ImportPath}}{{range .Imports}}{{$p}} -> {{.}}{{"\n"}}{{end}}' ./internal/foundation/... | \
grep ' -> .*stackvista.*internal'
# Verify clients/ only imports foundation/
go list -f '{{$p := .ImportPath}}{{range .Imports}}{{$p}} -> {{.}}{{"\n"}}{{end}}' ./internal/clients/... | \
grep ' -> .*stackvista.*internal' | grep -v ' -> .*/internal/foundation'
# Verify orchestration/ doesn't import other orchestration/
go list -f '{{$p := .ImportPath}}{{range .Imports}}{{$p}} -> {{.}}{{"\n"}}{{end}}' ./internal/orchestration/... | \
grep ' -> .*stackvista.*orchestration'
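The same layer rules could also be expressed as a small Go function, for example inside a test that feeds it the output of `go list`. The sketch below is a hypothetical helper, not part of the repository. The module path `example.com/backup-cli` is a placeholder, and the `allowed` table encodes only the rules stated in this document for foundation/, clients/, and orchestration/.

```go
package main

import (
	"fmt"
	"strings"
)

// layerOf returns the internal layer directory of an import path
// (e.g. "clients"), or "" if the path is not under internal/.
func layerOf(importPath string) string {
	_, rest, found := strings.Cut(importPath, "/internal/")
	if !found {
		return ""
	}
	layer, _, _ := strings.Cut(rest, "/")
	return layer
}

// allowed maps an importing layer to the internal layers it may import.
var allowed = map[string][]string{
	"foundation":    {},
	"clients":       {"foundation"},
	"orchestration": {"foundation", "clients"},
}

// violations returns the imports of pkg that break the rules for its layer.
func violations(pkg string, imports []string) []string {
	rules, known := allowed[layerOf(pkg)]
	if !known {
		return nil // app/ and cmd/ may import any internal package
	}
	var bad []string
	for _, imp := range imports {
		to := layerOf(imp)
		if to == "" {
			continue // standard library or external module
		}
		permitted := false
		for _, layer := range rules {
			if layer == to {
				permitted = true
				break
			}
		}
		if !permitted {
			bad = append(bad, imp)
		}
	}
	return bad
}

func main() {
	const mod = "example.com/backup-cli" // placeholder module path
	bad := violations(mod+"/internal/clients/elasticsearch", []string{
		"net/http",
		mod + "/internal/foundation/config",
		mod + "/internal/clients/k8s",
	})
	fmt.Println(bad)
}
```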