Site Reliability Engineer (SRE)
Capgemini
Job Description
Your Role
- Design, implement, and manage Rancher-managed Kubernetes clusters using RKE2/K3s.
- Automate infrastructure provisioning and configuration using Terraform, Helm, and Argo CD.
- Optimize Linux systems through performance, networking, and kernel-level tuning.
- Drive observability and monitoring using Prometheus, Grafana, and the ELK/EFK stack.
- Implement secure CI/CD pipelines aligned with PCI and FIPS requirements and CIS benchmarks.
Your Profile
- 7+ years of experience in infrastructure, platform engineering, or SRE roles.
- Proven expertise in Kubernetes (RKE2/K3s) and Rancher-managed environments.
- Strong proficiency in Python or Go for scripting and automation.
- Deep understanding of Linux systems, performance tuning, and bare-metal optimization.
- Hands-on experience with observability tools and SRE practices.
- Familiarity with OpenStack integration and Rancher add-ons like Fleet and Longhorn.
What You Will Love Working at Capgemini
- Lead cutting-edge infrastructure projects supporting mission-critical systems.
- Expand your expertise in Kubernetes, GitOps, automation, and platform engineering.
- Follow a clear career progression from engineering to architecture and consulting roles.
- Be part of high-impact initiatives that drive performance and operational excellence.
- Thrive in a diverse, inclusive, and collaborative environment that values innovation and your voice.