Site Reliability Engineer

ebayinc

Bengaluru 4 Years Exp Posted 227d ago

Job Description

What you will accomplish: 

  • Proactive Monitoring: Continuously monitor the health of eBay's critical services to identify and address potential issues before they escalate.
  • Solution Development: Collaborate with Architecture, Engineering, and Operations teams to develop solutions that ensure high site availability, reliability, and performance.
  • Collaborative Problem Solving: Work closely with partner teams to resolve recurring technical issues, onboard new alerts, and develop high-quality Standard Operating Procedures (SOPs).
  • Enhance Monitoring Tools: Build and improve tools for monitoring and mitigating site incidents. Conduct reliability audits and tests to strengthen eBay's reliability and incident management capabilities.
  • Incident Management: Act as Incident Commander to drive resolution of major incidents, manage alarms, and ensure effective communication with leadership and partner teams.

 

What you will bring: 

  • 4+ years of professional experience in software engineering, ideally on backend or platform teams.
  • Proficiency in one or more programming languages (e.g., Java, Go, Python).
  • Strong incident management and leadership skills, with excellent technical triage and troubleshooting abilities, especially during crises.
  • Familiarity with cloud platforms, container orchestration (e.g., Kubernetes), and infrastructure-as-code tools.
  • Experience with observability stacks (e.g., Prometheus, Grafana, ELK, OpenTelemetry).
  • Strong interpersonal and communication skills to thrive in fast-paced, dynamic environments.
