Site Reliability Engineer (SRE)

Thomson Reuters

Bengaluru 4 Years Exp Posted 406d ago

Job Description

About the Role:

  • Become an Expert Site Reliability Engineer (SRE): Master best practices in site reliability engineering and DevOps. Proactively integrate essential non-functional requirements into the product backlog, focusing on high availability, scalability, self-healing, observability, continuous delivery, and security.

  • Take the lead in building and maintaining robust monitoring systems for all aspects of infrastructure, microservices, and the platform. Implement a powerful alerting mechanism with cloud-native solutions.

  • Deliver top-notch operational support and engineering for distributed platforms, ensuring optimal performance and reliability.

  • Communicate Effectively: Engage and collaborate actively with cross-functional partners and team members, articulating your ideas clearly and working together on technical developments.

  • Be Innovative: You are empowered to explore new approaches and master new technologies. Your contributions will drive innovation, fuel effective solutions, and ensure seamless end-to-end deliveries.

  • Serve as the Go-To Expert for Production Issues: Take charge of troubleshooting and monitoring production issues until they are resolved. Effective communication will be crucial, and you will conduct thorough postmortems and apply the lessons learned.

  • Maintain Infrastructure as Code (IaC) and Continuous Integration/Continuous Deployment (CI/CD): Lead the charge in promoting best practices for our CI/CD processes.

  • Commit to Continuous Improvement and Technical Standards: Take the initiative to enhance productivity, monitoring, and tooling, while establishing industry-leading best practices.

  • On-Call Rotation: Actively participate in on-call or shift rotations (Level 2) to ensure our systems run smoothly.

 

 

About You:

  • Bachelor’s degree in computer science or a related field (required).

  • At least 4 years of experience as a DevOps/SRE and cloud engineer, with hands-on experience in AWS and Azure cloud technologies.

  • Highly skilled in UNIX/Linux-based systems.

  • Proven experience in building and operating production cloud-native infrastructure, applications, and services on AWS or Azure.

  • Experience with or knowledge of container and service mesh technology, such as Docker, Kubernetes, and Istio.

  • Must have experience using AWS services (such as CloudFront, EKS, ECS, RDS, threat detection, and other security controls) or Azure services (such as AKS, ACR, Entra ID, Network Security Groups).

  • Must have 2+ years of scripting and programming experience (Python, PowerShell, Bash).

  • Experience with or knowledge of observability tools: Datadog, CloudWatch, Azure Monitor.

  • Experience with or knowledge of version control and CI/CD (Git / Azure DevOps / JFrog Artifactory).

  • Experience or expertise in writing Infrastructure as Code (IaC) (Terraform / CloudFormation / Argo CD / other).

  • Knowledge of VMware and MS SQL is advantageous.
