Site Reliability Engineer

Equifax

Trivandrum | 5 Years Exp | Posted 521d ago

Job Description

What you'll do:

  • Kubernetes: Design, deploy, and manage production-ready Kubernetes clusters.

  • Cloud Infrastructure: Build and maintain scalable infrastructure on GCP using tools like Terraform.

  • Performance: Identify and resolve performance bottlenecks in applications and infrastructure.

  • Observability: Implement monitoring and logging to proactively detect and resolve issues.

  • Incident Response: Participate in on-call rotations, troubleshooting and resolving production incidents.

  • Collaboration: Promote reliability best practices and ensure smooth deployments.

  • Automation: Build CI/CD pipelines, automated tooling, and runbooks.

  • Problem Solving: Triage complex issues, lead blameless postmortems, and drive remediation.

  • Mentorship: Guide and mentor other SREs.

What experience you need:

  • BS in Computer Science or related field.

  • 2+ years of experience developing and/or administering software in a public cloud environment.

  • 5+ years of programming experience (Python, Bash/Shell Script, Java, Go, etc.).

  • 3+ years of experience monitoring infrastructure and application performance.

  • 5+ years of system administration experience, including automation and orchestration of Linux/Windows using Terraform, Chef, Ansible, and/or containers (Docker, Kubernetes, etc.).

  • 5+ years of experience working with continuous integration and continuous delivery tooling and practices.
