
Production Engine

AF-SOUTH-1 | SHA-256: A1B2C3 | LIVE

Production Engine is the core operational daemon running the KNUST private cloud. It continuously monitors cluster health, collects real-time telemetry from all hypervisors, and triggers autoscaling events when load thresholds are breached.

Architecture

The daemon runs as a privileged service on each manager node, communicating with the OpenStack API to provision or decommission compute resources. State is persisted to an embedded SQLite store with a write-ahead log for crash recovery.

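// ClusterMonitor bundles the dependencies needed to evaluate cluster load.
type ClusterMonitor struct {
    client    *openstack.Client // OpenStack API client used for metrics and provisioning
    db        *sql.DB           // embedded SQLite store for persisted state
    threshold ScalingThreshold  // load limits that trigger scaling
}

// EvaluateLoad fetches current cluster metrics and adds one node when
// CPU utilisation is strictly above the scale-up threshold.
func (m *ClusterMonitor) EvaluateLoad(ctx context.Context) error {
    metrics, err := m.client.GetClusterMetrics(ctx)
    if err != nil {
        // Wrap the error so callers can still inspect the underlying cause.
        return fmt.Errorf("metrics fetch: %w", err)
    }
    if metrics.CPUPercent > m.threshold.ScaleUp {
        return m.scaleOut(ctx, 1)
    }
    return nil
}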
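
The evaluation step can be exercised without a live cluster by placing the metrics source and the scaling action behind small interfaces. The following is a minimal sketch under stated assumptions, not the daemon's actual code: `MetricsSource`, `Scaler`, `Monitor`, `fakeCluster`, `nodesAddedAt`, and the 80% threshold are hypothetical names and values introduced for illustration. Only the `CPUPercent > ScaleUp` comparison and the one-node scale-out are taken from the listing above.

```go
package main

import (
	"context"
	"fmt"
)

// Metrics mirrors the single field the listing above reads.
type Metrics struct {
	CPUPercent float64
}

// ScalingThreshold holds the scale-up trigger as a CPU percentage.
type ScalingThreshold struct {
	ScaleUp float64
}

// MetricsSource and Scaler are hypothetical interfaces standing in for
// the OpenStack client, so the decision logic can run against a fake.
type MetricsSource interface {
	GetClusterMetrics(ctx context.Context) (Metrics, error)
}

type Scaler interface {
	ScaleOut(ctx context.Context, nodes int) error
}

// Monitor is a simplified, hypothetical counterpart of ClusterMonitor.
type Monitor struct {
	source    MetricsSource
	scaler    Scaler
	threshold ScalingThreshold
}

// EvaluateLoad adds one node when CPU is strictly above the threshold.
func (m *Monitor) EvaluateLoad(ctx context.Context) error {
	metrics, err := m.source.GetClusterMetrics(ctx)
	if err != nil {
		return fmt.Errorf("metrics fetch: %w", err)
	}
	if metrics.CPUPercent > m.threshold.ScaleUp {
		return m.scaler.ScaleOut(ctx, 1)
	}
	return nil
}

// fakeCluster is an in-memory test double that reports a fixed CPU
// value and counts how many nodes it was asked to add.
type fakeCluster struct {
	cpu   float64
	added int
}

func (f *fakeCluster) GetClusterMetrics(ctx context.Context) (Metrics, error) {
	return Metrics{CPUPercent: f.cpu}, nil
}

func (f *fakeCluster) ScaleOut(ctx context.Context, nodes int) error {
	f.added += nodes
	return nil
}

// nodesAddedAt runs one evaluation at the given CPU percentage and
// returns how many nodes the fake cluster was asked to add.
func nodesAddedAt(cpu, scaleUp float64) (int, error) {
	fake := &fakeCluster{cpu: cpu}
	m := &Monitor{source: fake, scaler: fake, threshold: ScalingThreshold{ScaleUp: scaleUp}}
	err := m.EvaluateLoad(context.Background())
	return fake.added, err
}

func main() {
	for _, cpu := range []float64{65, 80, 92} {
		added, err := nodesAddedAt(cpu, 80)
		fmt.Printf("cpu=%.0f%% added=%d err=%v\n", cpu, added, err)
	}
}
```

Because the comparison is strict, a reading exactly at the threshold does not trigger a scale-out; only the 92% case in this sketch adds a node.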

Observability

All scaling events are emitted as structured logs consumed by a Loki instance. Grafana dashboards surface cluster utilisation across a 7-day rolling window with per-hypervisor breakdowns.

Status

Live in production, managing approximately 200 VMs across 3 availability zones. A scale-out event takes under 90 seconds on average, measured from threshold breach to node registration.