Site Reliability Engineer
Natwest
Job Description
Your role will also involve:
- Collaborating with product development and feature teams to understand upcoming products and to enable continuous integration and continuous deployment
- Regularly attending the feature teams’ refinement and planning sessions
- Identifying areas for service improvement by analysing and diagnosing recurring platform and service incidents, as well as customer and stakeholder feedback
- Building a culture of continuous improvement to reinforce the robustness of the domain, with a focus on automation, scalability, continuous integration and continuous delivery
The skills you'll need
We’re looking for someone with technical knowledge and experience across platforms, technologies, products and domains. This is an individual contributor role, so you must be capable of running independent proofs of concept (POCs) and working with cross-functional departments, along with the technical skills below.
Bachelor’s degree (B.E. / B.Tech. preferable) with 12+ years of strong overall experience in DevSecOps and SRE, including production support. Ability to communicate at all levels. Proven experience in managing large-scale distributed systems and an understanding of the principles of scalability and reliability. Ownership of DevOps DORA metrics and SRE toil reduction through automation.
We’re also looking for:
- Experience with security tools such as SAST, DAST and container security. Understanding of Node.js, React.js, Java, Oracle and IDMC.
- Experience with Infrastructure as Code tools such as Terraform and CloudFormation.
- Experience with container technologies such as Docker, Kubernetes and OpenShift. Must have knowledge of DevSecOps tools such as Git, Maven, Selenium, Jenkins, Ansible and security tools.
- Knowledge of at least one monitoring tool: Geneos, Nagios, Prometheus, Dynatrace, AppDynamics, DX-APM or Splunk. Scripting knowledge: UNIX shell (Python, Groovy and YAML are good to have).
- Experience with and understanding of at least one cloud provider such as AWS or Azure. On-demand infrastructure provisioning, environment spin-offs and environment cloning (EKS, IaC).
- Hands-on working knowledge of configuring SLAs, SLOs and SLIs, plus infrastructure and business rules/logic, in AppDynamics, AWS CloudWatch, Pingdom, Datadog, Tivoli etc. (preferably APM tools).
- Understanding of network protocols, load balancing and firewall management for secure and efficient network operations.
-