Site Reliability Engineer

BNP Paribas

Mumbai 15 Years Exp Posted 202d ago

Job Description

Direct Responsibilities

 

·Automate away toil using a combination of scripting, tooling, and process improvements

·Drive transformation strategies involving infrastructure hygiene / end of life

·Implementing new technologies or processes to improve efficiency and reduce costs, e.g. CI/CD implementation

·Monitoring system performance and capacity levels to ensure high availability of applications with minimal downtime

·Investigating any service disruptions or other service issues to identify their causes

·Performing regular audits of computer systems to check for signs of degradation or malfunction

·Developing and implementing new methods of measuring service quality and customer satisfaction

·Conducting capacity planning to ensure that new technologies can be accommodated without impacting existing users

·Conducting post-mortem examinations of failed systems to identify and address root causes

·Drive various common-purpose Automation, Monitoring & Tooling initiatives across CEP APS and other teams within CIB APS

·Accountable for generation, reporting and improvements of various Production KPIs, SLs and dashboards for APS teams

·Accountable for improvements in service and for presentations to all governance and steering committees

·Accountable for maintenance and improvement of IT continuity plans (ICP)

 

Contributing Responsibilities

 

Technical & Behavioral Competencies

·Strong knowledge of DevOps methodology and toolsets

·Strong knowledge of Cloud based applications/services

·Strong knowledge of APM tools, e.g. Dynatrace / AppDynamics

·Strong skillset in distributed computing and database technologies

·Strong knowledge of Jenkins, Ansible, Python, scripting, etc.

·Good understanding of log aggregators, e.g. Splunk / ELK

·Good understanding of observability tools, e.g. Grafana / Prometheus

·Ability to work with various APS, Development, Operations stakeholders, locally and globally

  • Dynamic, proactive and teamwork oriented
  • Independent, self-starter and fast learner
  • Good communication and interpersonal skills
  • Practical knowledge of change, incident & problem management tools
  • Innovative and transformational mindset
  • Flexible attitude
  • Ability to perform under pressure
  • Strong analytical skills

Preferred to have

·ITIL

·Docker/Kubernetes

·Prior knowledge of Site Reliability Engineering / DevOps / Application Production Support, or a development background
