Senior Site Reliability Engineer

Barry Callebaut

Hyderabad 10 Years Exp Posted 349d ago

MAIN RESPONSIBILITIES & SCOPE

Ensure scalability, performance, and reliability of large-scale, cloud-based applications and infrastructure
Establish monitoring and observability solutions and address performance bottlenecks, errors, and other issues
Develop and maintain automated deployment pipelines to facilitate seamless and efficient delivery of software updates while minimizing downtime
Develop and implement strategies to enable zero downtime deployments
Resolve incidents promptly to minimize service disruptions
Create and enforce best practices and standards for the deployment and management of applications, databases, and other resources
Work closely with cross-functional teams, including developers, DevOps engineers, and QA engineers, to drive continuous improvement and innovation

ESSENTIAL EXPERIENCE & KNOWLEDGE / TECHNICAL OR FUNCTIONAL COMPETENCIES

Good knowledge of IT infrastructure and cloud operations, as well as the design, implementation, and management of highly available and scalable infrastructure

Proficiency in Azure services, Terraform, observability tools, and techniques for monitoring and troubleshooting distributed systems
Experience with zero downtime deployment strategies and DevOps tools (e.g., Jenkins, CircleCI, GitHub)

Openness to trying and learning new technologies and skills
Good written and verbal communication skills, including the ability to explain problems to non-technical audiences