Avalara
Site Reliability Engineer
Aug. 2024
The Site Reliability Engineer is responsible for the security, compliance, reliability, and efficiency of our production systems. The role is a member of the Engineering Operations Center (EOC). The EOC is responsible for 24x7 availability, incident management, and analysis of escalated (Tier 2) client support tickets. The EOC is Avalara’s first responder for all customer-facing engineering events. EOC Site Reliability Engineers enhance observability, troubleshoot applications, manage Google Cloud and AWS infrastructure, write tools and automation, and influence the design of the future platform architecture. The EOC is also a primary gate for reviewing and approving new application readiness. The role supports dozens of projects and applications.
Proactively monitor and provide first-tier response for Avalara’s customer-facing, revenue-critical applications.
We deliver multiple nines of availability. Our focus is on prevention through early identification of potential issues.
Accept new applications to ensure supportability by the EOC.
The EOC is a gate to ensure consistent standards of operational excellence using a comprehensive set of Software Maturity Metrics (SMM).
Deploy applications in support of technical and compliance requirements.
Manage incidents, driving consistency through standard processes.
Recommend application changes to improve application performance.
Work with technologies including Google Cloud, AWS, GitLab, the Atlassian suite, PowerShell, Python, Terraform and the HashiCorp suite, Snowflake, MongoDB, DynamoDB, Postgres, SQL, Packer, SumoLogic, Grafana, Prometheus, HAProxy, RabbitMQ, Windows, Linux, and Docker and other containerization technologies.