
Add runbook for scaling CustomersDot VMs

Summary

This MR adds a runbook for scaling CustomersDot VMs. It documents the process discovered and tested in gitlab-com/gl-infra/production-engineering#27880 (closed).

What does this MR do?

Adds docs/customersdot/scaling-vms.md which documents:

  • Horizontal scaling (adding new VMs): Complete step-by-step process including infrastructure changes, Teleport setup, provisioning, and deployment
  • Vertical scaling (resizing existing VMs): Process for changing machine types with minimal downtime
  • Troubleshooting: Common issues encountered during the testing phase and their solutions
  • Access and permissions: Who can perform these operations and what access is required
  • Timeline expectations: Realistic time estimates for scaling operations

Why is this needed?

This capability is critical for:

  • Usage Billing GA preparation
  • Handling traffic spikes
  • Emergency scaling during incidents
  • Planned capacity increases

The testing in #27880 revealed several non-obvious steps and gotchas that need to be documented for future scaling operations.

Key learnings from testing

  1. Teleport tokens must be created in the same MR as the VM creation
  2. The pet_name=customers label is required for Ansible discovery
  3. VMs must use a specific Ubuntu 20.04 boot image
  4. Provisioning and deployment require Fulfillment team involvement
  5. Vertical scaling can be done with roughly 2-5 minutes of downtime per VM
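
The first three learnings lend themselves to a pre-flight check before a scaling MR is opened. The following is only a minimal sketch under stated assumptions: `VMSpec`, its fields, and `preflight_problems` are hypothetical names invented for illustration and do not reflect the actual config-mgmt schema or any existing tooling. Because the exact boot image name is not stated here, the check only looks for the Ubuntu release in the image string.

```python
from dataclasses import dataclass, field


@dataclass
class VMSpec:
    """Hypothetical, simplified view of one VM entry (not the real config-mgmt schema)."""
    name: str
    labels: dict = field(default_factory=dict)
    boot_image: str = ""
    has_teleport_token: bool = False


def preflight_problems(vm: VMSpec) -> list:
    """Return human-readable problems for a VM spec; an empty list means none were found."""
    problems = []
    # Learning 2: Ansible discovery relies on the pet_name=customers label.
    if vm.labels.get("pet_name") != "customers":
        problems.append("missing label pet_name=customers (needed for Ansible discovery)")
    # Learning 3: only the Ubuntu release is checked, since the exact image name is an unknown here.
    if "2004" not in vm.boot_image and "20.04" not in vm.boot_image:
        problems.append("boot image does not look like Ubuntu 20.04")
    # Learning 1: the Teleport token should be created in the same MR as the VM.
    if not vm.has_teleport_token:
        problems.append("no Teleport token defined in the same MR as the VM")
    return problems
```

A reviewer could run a check like this over each new VM entry and block the MR if the returned list is non-empty. Learnings 4 and 5 concern people and timing rather than configuration, so they are not covered by the sketch.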

Related issues and MRs

  • Discovery issue: gitlab-com/gl-infra/production-engineering#27880 (closed)
  • Node map conversion (staging): ops.gitlab.net/gitlab-com/gl-infra/config-mgmt!12504
  • Node map conversion (production): ops.gitlab.net/gitlab-com/gl-infra/config-mgmt!12530
  • Example horizontal scaling: ops.gitlab.net/gitlab-com/gl-infra/config-mgmt!12567
  • Example vertical scaling: ops.gitlab.net/gitlab-com/gl-infra/config-mgmt!12571

cc: @stejacks-gitlab @gsgl @cmcfarland @ebaque @jameslopez
