Specialist DevOps & Site Reliability Engineer

EquiLend

Bengaluru, 6 Years Exp, Posted 421d ago

Job Description

Role Responsibilities 

  • Design, build, and manage CI/CD pipelines that streamline software delivery and reduce lead time to production.
  • Develop and support scalable, containerized solutions using Docker and Kubernetes.
  • Implement and manage Infrastructure-as-Code (IaC) using tools such as Terraform and Ansible to ensure consistent environments.
  • Lead incident response and post-mortem processes, championing best practices in availability, latency, and system resilience.
  • Define and maintain service-level indicators (SLIs), service-level objectives (SLOs), and service-level agreements (SLAs) for key systems.
  • Collaborate with application teams to define monitoring strategies and observability standards using Grafana, Prometheus, or similar tooling.
  • Partner with global DevOps and infrastructure teams to drive automation, performance improvements, and cost optimization in cloud environments (AWS preferred).
  • Provide technical mentorship to team members and act as a subject matter expert in Site Reliability Engineering practices.
  • Contribute to the development of operational playbooks and automated runbooks for common failure scenarios.
  • Continuously evaluate and adopt emerging tools and technologies that support the goals of high availability and rapid delivery. 

Required Skills 

  • A minimum of 6 years of relevant experience in DevOps, SRE, or software infrastructure roles.
  • Strong practical knowledge of CI/CD tooling (e.g., Jenkins, GitLab CI/CD, GitHub Actions) in distributed environments.
  • Proven expertise with containerization and orchestration, especially Docker and Kubernetes.
  • Hands-on experience with Infrastructure-as-Code tools like Terraform, Ansible, or Pulumi.
  • Proficiency in scripting languages such as Python, Bash, or similar for automation and tooling.
  • Experience implementing SRE principles, including reliability metrics, SLIs/SLOs, and chaos engineering practices.
  • Strong familiarity with cloud infrastructure, ideally AWS; Azure or GCP experience is also valuable.
  • Demonstrated experience with monitoring and alerting frameworks, e.g., Prometheus, Grafana, ELK, or Splunk.
  • AWS certification (Solutions Architect or DevOps Engineer) is a plus.
  • Excellent collaboration and communication skills, with experience operating across globally distributed teams. 
