
Service Autoscaler

EU-WEST-1 | SHA-256: D4E5F6 | LIVE

Service Autoscaler is a lightweight Go daemon that polls Prometheus metrics and scales Docker Swarm services automatically, with no manual intervention.

How It Works

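The autoscaler evaluates a set of configurable PromQL expressions on a 15-second tick. When a metric rises above its scale-up threshold, the autoscaler calls the Swarm API to raise the service's target replica count by one step. Scale-down is applied only after a five-minute stabilisation window, which prevents thrashing, and never takes a service below its configured minimum replica count.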

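// Tick evaluates every rule once and rescales any service whose metric has
// crossed a threshold. Uses the standard "context" and "time" packages.
func (a *Autoscaler) Tick(ctx context.Context) {
    for _, rule := range a.rules {
        val, err := a.prom.QueryInstant(ctx, rule.Query)
        if err != nil || val == nil {
            // No usable sample this tick; try again on the next one.
            continue
        }
        if *val > rule.ScaleUpThreshold {
            // Scale up immediately by one step.
            a.swarm.Scale(ctx, rule.Service, rule.CurrentReplicas+rule.Step)
        } else if *val < rule.ScaleDownThreshold && rule.StabilisedFor(5*time.Minute) {
            // Scale down only after the stabilisation window, never below the floor.
            a.swarm.Scale(ctx, rule.Service, max(rule.MinReplicas, rule.CurrentReplicas-rule.Step))
        }
    }
}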

Configuration

Rules are declared in a YAML file and hot-reloaded on change via inotify, so threshold adjustments don't require a daemon restart.
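
A hot reload has to hand the new rule set to the tick loop without pausing it, and should not let a broken file replace a working configuration. The sketch below shows one possible approach, assuming the file watcher has already parsed the YAML into a slice of rules: validate first, then swap the active set atomically. The ruleStore type, the validate function, and the Rule fields shown here are assumptions for illustration only, and the YAML parsing and inotify watching are deliberately left out.

```go
package main

import (
	"errors"
	"sync/atomic"
)

// Rule carries a subset of the fields used by Tick. The exact field set of
// the real type is an assumption.
type Rule struct {
	Service            string
	ScaleUpThreshold   float64
	ScaleDownThreshold float64
	MinReplicas        int
}

// ruleStore holds the active rule set. The tick loop would call Load and the
// file watcher would call Replace. Hypothetical type, not from the original.
type ruleStore struct {
	current atomic.Pointer[[]Rule]
}

// validate rejects rule sets that could not produce sensible scaling decisions.
func validate(rules []Rule) error {
	for _, r := range rules {
		if r.Service == "" {
			return errors.New("rule has no service name")
		}
		if r.ScaleDownThreshold >= r.ScaleUpThreshold {
			return errors.New("scale-down threshold must be below scale-up threshold: " + r.Service)
		}
	}
	return nil
}

// Replace swaps in a new rule set only if it validates, so a bad edit to the
// file leaves the previous rules active.
func (s *ruleStore) Replace(rules []Rule) error {
	if err := validate(rules); err != nil {
		return err
	}
	s.current.Store(&rules)
	return nil
}

// Load returns the active rule set, or nil if none has been stored yet.
func (s *ruleStore) Load() []Rule {
	if p := s.current.Load(); p != nil {
		return *p
	}
	return nil
}
```

With this shape, each tick reads one consistent snapshot of the rules, and a reload that fails validation is reported to the caller while the daemon keeps scaling on the last good configuration.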

Status

Running in production. Over a 90-day observation window it has handled scale events across 12 services, with no missed scaling opportunities recorded.