Site Reliability Engineer, AVP

NatWest

Gurugram

You’ll also be:

Conducting capacity planning exercises to make sure cloud resources can handle anticipated traffic spikes and growth
Implementing and maintaining monitoring, logging, and alerting systems to provide insights into the health and performance of cloud infrastructure and applications
Delivering automation solutions to minimise or eliminate manual tasks associated with maintaining and supporting the applications
Maintaining an in-depth understanding of the full tech stack on which the application resides and depends
Identifying alerting and monitoring requirements for an application, based on a sound understanding of customer journeys
Evaluating the resilience of the end-to-end tech stack on which the applications depend, and addressing weaknesses
Reducing the frequency of hand-offs in the end-to-end resolution of customer-impacting incidents