Observability Engineer

BlackLine

Bengaluru 3 Years Exp Posted 249d ago

Job Description

You'll Get To:
 

  • Ensure 99.99%+ availability of services and infrastructure that span multiple global datacentres in private and public clouds.
  • Troubleshoot BL container platforms and supporting automation in a highly available, high-traffic environment.
  • Monitor and maintain health, performance, and security of all infrastructure components.  
  • Build systems and perform the tasks needed to deliver against committed project timelines, with a desire to automate everything.
  • Solve real-life problems in a bleeding-edge, high-performance, and high-traffic environment. Maintain documentation and operational knowledge base.  
  • Triage first-level events and incidents.
  • Adhere to the change management and other established processes and procedures. 
  • Respond to and troubleshoot incidents (Incident Management). Conduct root cause analyses. 
  • Evaluate and analyse systems, performance, issues, and metrics to recommend continuous improvements.
  • Adhere to SLA compliance as defined. 
  • Participate in a scheduled 24/7 on-call rotation for second-tier support escalations.
  • Be willing to work 3 days from the office.


What You'll Bring:
 

  • 3-6 years of industry experience.
  • 3+ years supporting Unix and/or Linux (Ubuntu, CentOS, Red Hat) and/or Windows.
  • 3+ years supporting a critical, revenue-generating SaaS or hosting environment.
  • 2+ years working with development and continuous integration tooling (Jenkins, Bitbucket, GitHub).
  • 2+ years working with tools like New Relic, Jira, Foglight. 
  • 1+ years of experience using container platforms and tooling (Kubernetes, Docker, Rancher, Helm, Anthos, Istio, GKE, AKS, etc.).
  • Experience in hybrid cloud and/or multi-cloud environments (GCP as primary, Azure, AWS, VMware).
  • Understanding of software development processes and methodologies. 
  • Experience with scripting and/or systems programming languages (Bash, PowerShell, Python, Golang, C#).
  • Hands-on problem-solving skills, technical leadership and mentoring qualities. 
  • Strong written and oral communication skills. 
  • Ability to participate in an on-call rotation.
  • A minimum of two years of experience in a 24x7 operations organization, deploying and operating complex cloud infrastructure at scale.
  • Hybrid work with 3 days in the office is mandatory.
