Site Reliability Engineer - O/S

S&P Global

Gurgaon 7 Years Exp Posted 459d ago

Responsibilities:

Gather and analyze metrics from operating systems as well as applications to assist in performance tuning and fault finding.
Partner with development teams to improve services through rigorous testing and release procedures.
Participate in system design consulting, platform management, and capacity planning.
Create sustainable systems and services through automation.
Balance feature development speed and reliability with well-defined service-level objectives.
Day-to-day management of VMC/AWS infrastructure.
Build and document automation processes for Infrastructure as a Service/Infrastructure as Code.
Backup and patch management.

What We’re Looking For:

We are looking for someone who has:

Bachelor’s degree (or equivalent) in computer science or a related discipline, with at least 7 years of experience.
Proactive approach to identifying problems, performance bottlenecks, and areas for improvement.
Strong interpersonal, analytical, and problem-solving skills, along with strong written and verbal communication.
Solid understanding and hands-on experience with container orchestration.
This role is responsible for configuring, deploying, maintaining, troubleshooting, and monitoring container orchestration on AWS.
Ability to communicate ideas in both technical and non-technical ways.
A strong capacity for teamwork, a sense of ownership, and the ability to work independently and be self-driven.
Hands-on experience with Linux servers, AD, LDAP, DNS, network storage, and AWS Compute services (EC2, FSx, Managed AD, Route 53, etc.).
Ability to write scripts and automation using tools or languages such as PowerShell, Python, Ansible, Terraform, and Bash.
Familiarity with ITSM processes such as Incident, Problem, and Change Management, preferably using ServiceNow.