Senior Site Reliability Engineer

Truecaller

Bengaluru NM Years Exp Posted 234d ago

Job Description

What you bring in:

  • Extensive knowledge of Linux system administration, preferably on high-throughput, low-latency systems.
  • Strong hands-on experience with GCP services (or transferable AWS/Azure skills) — networking, IAM, compute, storage, Kubernetes (GKE/EKS/AKS).
  • Extensive knowledge of Docker and Kubernetes.
  • Excellent understanding of distributed system design across process and site boundaries.
  • Hands-on experience with service orchestration, management, deployment, configuration management, and the automation these require.
  • Strong grasp of process isolation and containerization concepts, and the ability to apply them where appropriate.
  • Container orchestration: Deep understanding of Kubernetes — deploying, scaling, monitoring clusters.
  • Monitoring & Observability: Experience with tools like Prometheus, Grafana, Stackdriver, Datadog, New Relic, etc.
  • Incident management: Practical experience responding to incidents, performing root cause analysis, and improving system reliability.
  • Security best practices: Knowledge of cloud security, secrets management, and compliance basics.
  • Good understanding of the software development lifecycle (versioning, building, testing, staging, and deployment), with a strong continuous delivery mindset.

The impact you will create:

  • Build tooling to ease the provisioning and scaling of infrastructure resources.
  • Continuously improve and scale infrastructure components to handle growth.
  • Improve overall system performance, investigate failures, and take an active part in discussions of future improvements.
  • Ensure system availability, reachability, and maintainability by building the instrumentation, tooling, and alerting needed to escalate abnormalities.
  • Influence monitoring and capacity planning together with the application development teams, in alignment with business goals.

It would be great if you also have:

  • Experience developing Kubernetes operators.
  • Experience deploying and scaling Apache Cassandra, ScyllaDB, MySQL, PostgreSQL, Redis, or Memcached.
  • Experience with the Go programming language, or willingness to learn it (it'll help us build new k8s operators and improve the existing ones).
