Choreograph
Site Reliability Engineer
Nov. 2021
Designed, built, and deployed MCP (Model Control Plane), a production-grade platform managing over 1,000 GCP projects via real-time scanning, automated Pulumi generation, Slack-based AI chat, and infrastructure indexing.
Fully replaced Terraform with a Pulumi-based IaC pipeline, including a custom stack generator that converts live GCP resources into deployable YAML and state files.
Deployed and secured an in-house GitLab instance, integrating GitOps workflows across teams and enforcing IAM and audit policies.
Migrated legacy Jenkins pipelines with zero downtime, introducing optimised CI/CD strategies and auto-healing job flows.
Created a Slack-integrated LLM agent for infrastructure queries, cost analysis, and live role audits, powered by Vertex AI Matching Engine and GKE-hosted Ollama models.
Indexed and embedded content from Confluence, Jira, and GitLab into Vertex AI, enabling context-aware retrieval and RAG-based automation across the org.
Developed a secure frontend dashboard with real-time authentication, enabling team-level interaction with infrastructure insights, AI support, and Pulumi stack previews, with no CLI required.