Site Reliability Engineer (Shell scripting)
FIS
Job Description
-
Understanding project KPIs, SLIs, SLOs, MTTD, MTTR, error budgets, and chaos engineering, and eliminating toil through automation
-
Exploring observability tools and creating/implementing dashboards
-
Run the production environment by monitoring availability and taking a holistic view of system health
-
Incident Management: Knowledge of handling incidents, participating in blameless postmortems, performing root cause analysis, and implementing post-incident reviews.
-
Develop scripts to reduce toil and automate repetitive tasks, including scripts for issue resolution.
-
Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating for continual improvement