Staff Site Reliability Engineer

Okta

Bengaluru | 7 Years Exp | Posted 494d ago

Job Description

Skills

  • Exceptional communication skills, including technical writing in English
  • Systematic problem-solving approach, coupled with a strong sense of ownership and drive
  • Understanding of microservices, cloud infrastructure (AWS, Azure), databases (SQL, NoSQL, key/value), containers (Docker, Kubernetes), web technologies (WebSockets, HTTP), and networking (SSL, routing, VPN)
  • Live and breathe SLIs, SLOs, error budgets and SLAs
  • Strong belief in automating everything and reducing toil for yourself and teammates
  • Enjoys working as part of a team, and can also work effectively in a remote environment where tasks may be self-driven

Responsibilities

  • Work with other teams to run, own, and improve incident response processes
  • Participate in regular on-call rotations to ensure 24/7 coverage of all critical systems
  • Use existing monitoring tools to identify problems, then resolve them or escalate to service teams
  • Implement changes to enable or improve infrastructure resilience, monitoring, and alerting

Experience

  • 7+ years as a Site Reliability Engineer or in a Cloud Operations/DevOps role
  • 6+ years using Go (Golang), shell scripting, and Terraform
  • 2+ years as a software developer in a SaaS environment
  • 4+ years in a production environment supporting large-scale, mission-critical applications

 
