Senior Software Engineer - Site Reliability Engineering

Pantheon

Bangalore

If all of this sounds interesting to you, read on!

Working on advanced, globally scaled implementations of WordPress and Drupal CMS systems using the latest Google Cloud Platform offerings.
Working on a large-scale orchestration platform serving millions of containers, using lower-level Linux facilities such as systemd and cgroups directly.
Administering, developing, and maintaining standardization and configuration state management with Kubernetes, Chef, Terraform, GCP tooling, Vault, etc.

Collaborating closely with the wider engineering team to both deliver platform improvements and provide subject-matter expertise for other technical initiatives.
Owning your team’s production systems, measuring and tracking their health with SLOs, and assisting our dedicated support team in resolving production issues.
Continuously improving our standard of engineering excellence by implementing best practices for coding, testing, deployment, and communication.
Supporting Pantheon as a member of the on-call engineer rotation, contributing to the stability, reliability, and performance of the infrastructure that drives Pantheon's success.

Supporting and meeting with Pantheon customers, as needed, to ensure their success as well as ours.

What You Need to Succeed

Strong understanding of, and work experience developing in, Python, Go, or another object-oriented programming language.
Strong understanding and working knowledge of Kubernetes, Terraform, CI/CD pipelines, and release engineering practices.
Strong understanding of Linux operating system administration.
Work-related experience with large-scale, high-traffic platforms.
Work-related experience with designing scalable and robust services in the real world.
Clear communication skills and the ability to represent your contributions and ideas with clarity while remaining open and giving space to the contributions and ideas of others.
Ability to participate in system design consulting, platform management, and capacity planning.
Ability to develop and mature sustainable systems and services through automation and uplifts.
Ability to balance feature development speed and reliability with well-defined service-level objectives.
Extensive experience supporting live-site operations and on-call rotations.
Experience building and operating complex observability tooling such as Grafana Cloud, Prometheus, etc.

Bonus Points

Working knowledge of Cassandra, MySQL, Redis.
Working knowledge of React, Node.js, Python, Go.
Working knowledge of Docker, Chef, CircleCI, Vault.
Working knowledge of WordPress, Drupal.
Coding experience beyond simple scripts.
CKA, CKAD, CKS, or other CNCF certifications.
Experience supporting and developing open-source tooling on public clouds such as GCP, AWS, or Azure.