Lead - Site Reliability Engineer
Fidelity
Job Description
The Value You Deliver
- Bringing a passion for technology and the financial domain, with a demonstrated ability to learn quickly
- Building quality solutions that align with the technology blueprint and best practices to solve business problems by driving design, development, and ongoing support
- Working with our global team and providing technical direction in building solutions
- Actively participating in knowledge-sharing sessions and in code and design reviews
The Skills that are Key to this role
Technical / Behavioral
- You have extensive knowledge of production on-call support for cloud infrastructure running on the EKS platform
- You have extensive experience in Change, Incident, and Problem Management and on-call support
- You have extensive knowledge of observability tools (preferably Datadog), Grafana, and Prometheus
- You have experience monitoring logs, metrics, APM, events, and infrastructure, including dashboard creation
- You have experience with multiple AWS services such as EC2, EBS, S3, NLB, IAM, Lambda, CloudWatch, CloudTrail, and VPC, along with knowledge of rehydration or patching in cloud infrastructure
- You have experience with microservices architecture, including tools such as API Gateway or Apigee
- You have the ability to triage issues, perform root cause analysis, and be decisive under pressure
- You are able to work with a variety of individuals and groups, both in person and virtually, in a constructive and collaborative manner to build and maintain effective relationships
- You are able to automate tasks using Shell or Python
- You have exposure to configuration management and Infrastructure as Code (IaC) tools such as Ansible and Terraform
- You have Kafka/MQ administration skills
The Skills that are Good To Have for this role
- You know CI/CD tools such as Jenkins
- You are familiar with Redis on AWS
- You have exposure to database administration (especially NoSQL stores such as MongoDB, or databases such as MariaDB or CrDB)
- You have exposure to creating knowledge and process documentation
- You have the ability to drive install calls such as monthly installs, rehydration/upgrades, and DC/DR switches