Senior Site Reliability Engineer
ADP
Job Description
What you'll do:
- Serve as subject matter expert in monitoring and observability.
- Design and implement tools that improve the reliability and efficiency of Lifion services and data stores.
- Automate infrastructure and configuration management.
- Assist with all aspects of operational security and compliance.
- Consult on system design to meet reliability and capacity requirements.
- Run software performance analysis and system tuning.
- Plan and execute disaster recovery drills.
- Participate in rotating on-call duties.
- Conduct timely post-mortems of production incidents.
Qualifications you'll need:
Education: Bachelor's degree
Experience:
- Experience developing and monitoring distributed systems.
- Experience with continuous integration and continuous deployment tools.
- Working knowledge of Ansible, Terraform, or another centralized configuration management tool.
- Experience with Linux system administration.
- Fluency in at least one scripting language, such as Python, Go, Perl, Ruby, Bash, or Java.
- Systematic problem-solving approach with excellent debugging and troubleshooting skills.
- Strong communication skills.
- Strong sense of ownership and an ability to drive tasks to completion.
- Experience using Docker and container orchestration technologies like Docker Swarm, Kubernetes, or Mesos.