Site Reliability Engineer
Thomson Reuters
Job Description
About the role:
In this opportunity, you will:
-
Be a professional SRE: implement site reliability engineering and DevOps best practices. Feed non-functional requirements into the product backlog, including but not limited to high availability, scalability, self-healing, observability, continuous delivery, and security.
-
Build and maintain monitoring for all aspects of the infrastructure, microservices, and platform, and implement alerting mechanisms using cloud-native solutions
-
Automate IaC and CI/CD, and promote best practices for our CI/CD processes
-
Provide primary operational support and engineering for multiple large, distributed platforms
-
Act as the go-to person for any production issue: troubleshoot and monitor until successful mitigation, manage remediation, communicate effectively, run postmortems, and implement the learnings.
-
Focus on continuous improvement and technical standards: drive improvements in productivity, monitoring, and tooling, and set industry best practices.
-
On-call rotation: participate in on-call/shift rotations (L3). When on call, you are expected to drive the troubleshooting and mitigation activities while working on the incident.
-
Be innovative and curious:
-
Maintain end-to-end security, ensuring that we meet best-practice standards
-
Keep up-to-date with emerging cloud technology trends, especially around DevOps, Service Reliability and Security.
-
Adopt pan-TR operation principles to ensure consistency and efficiency
-
Document “tribal” knowledge: constant upkeep of documentation and runbooks ensures that teams get the information they need right when they need it
-
Collaborate closely across our teams in Canada, the US, Mexico, and India
About you:
You're a fit for the role if you have:
-
A Bachelor's degree in Computer Science or a related field (required)
-
A minimum of 3 years of experience as a DevOps engineer and/or Cloud engineer, with hands-on experience in AWS or Azure cloud technologies
-
Highly skilled in Unix/Linux, with exposure to RHEL
-
Proven experience in building and operating PRODUCTION cloud-native infrastructure, applications, and services on AWS or Azure
-
Must have experience with Version Control and CI/CD (Git / CodePipeline or Azure DevOps …)
-
Must have experience with writing Infrastructure as Code (IaC) (Terraform, CloudFormation, or Azure Resource Manager…)
-
Must have scripting and programming experience (Bash, PowerShell, or Python)
-
Experience with or knowledge of distributed logging: ELK, Datadog, CloudWatch, or Azure Monitor
-
Team player with a can-do attitude
-
Willingness to work rotational shifts