SRE Manager
Applied Materials
Job Description
CI/CD & DevOps Platform Engineering
Own and operate enterprise CI/CD and DevOps platforms (see Technical Skills for tool details). Standardize and continuously improve build systems and deployment workflows. Ensure platform reliability, SRE-driven availability, and scalability across global R&D teams. Drive improvements leveraging automation, cloud-native patterns, and observability.
Tier‑1/2 Engineering Support Management
Lead a team managing 24×6 Tier‑1/2 tickets spanning multiple technologies and engineering functions. Ensure accurate triage, categorization, prioritization, and SLA adherence. Meet targets for MTTR, response-time SLAs, resolution quality, and ticket hygiene. Reduce operational noise through automation, clear scope definitions, and knowledge base maturity.
Genesis DevOps & SRE Team Management
Lead the Genesis DevOps team supporting Azure Cloud infrastructure, reliability engineering, and cloud automation (see Technical Skills for tool details). Own incident management, root cause analysis (RCA), preventive actions, change management, and release planning. Drive SRE practices: error budgets, SLIs/SLOs, on-call readiness, runbooks, and service health dashboards.
Automation, Self‑Service & GenAI Enablement
Drive an automation-first culture across support teams and CI/CD platforms. Implement self-service capabilities for build, deploy, access requests, monitoring, and DevOps operations. Architect and expand GenAI/ChatOps solutions that enable natural-language querying of Jira, Confluence, CI/CD, and operational data. Reduce manual dependency through workflow orchestration and AI-enabled troubleshooting.
Operational Excellence & Governance
Maintain and improve key KPIs: MTTR, first contact resolution, deployment success rates, platform uptime, ticket backlog, and SRE scorecards. Lead structured incident reviews, impact assessments, problem management, and reliability-driven improvements. Partner with Infrastructure, Cloud, Security, ALM, and Product Engineering teams. Own audits, compliance, documentation hygiene, and DR processes.
Technical Skills
- CI/CD & DevOps Tools: Jenkins, Git, Bitbucket, GitLab, SonarQube, Nexus, Artifactory
- Containers & Cloud: Kubernetes, Docker, Azure
- Automation & Scripting: Python, Shell, YAML, Ansible, Terraform
- Observability & Quality: Prometheus, Nagios, Zabbix, Black Duck
- ALM & Collaboration: Jira, Confluence, Agile/Scrum
- AI & Automation: GenAI ChatOps, workflow orchestration
People Leadership
- Manage DevOps engineers, SREs, and Tier‑1/2 support personnel.
- Drive performance management, coaching, upskilling plans, and role development.
- Establish structured onboarding, learning paths, and DevOps/SRE competency frameworks.
- Create a culture of ownership, collaboration, customer focus, and continuous improvement.
Problem Solving
- Identify and resolve technical, operational, and organizational problems.
Required Skills & Experience
- 12+ years in DevOps/SRE, including 4+ years leading teams.
- Strong expertise in:
  - CI/CD: Jenkins, Azure DevOps, or similar tools.
  - Cloud & Containers: Azure, Kubernetes, Docker, Helm, Rancher.
  - Scripting & IaC: Python, Bash, YAML, Ansible, Terraform.
  - Observability: Prometheus, Grafana, Loki, Zabbix, Nagios.
  - ALM & Collaboration: Jira, Confluence; Agile/Scrum.