TechOps-DE-CloudOpsAMS
EY
Job Description
Your key responsibilities
- Support, maintain, and scale automated tools for deployment, monitoring, and operations of the company's systems.
- Troubleshoot and resolve issues in our dev, test, and production environments.
- Enhance the company's infrastructure and application monitoring and alerting systems.
- Drive the incident management process and support a blameless post-mortem culture.
- Partner with software engineers to improve services through rigorous testing and release procedures.
- Participate in system design consulting, platform management, and capacity planning.
- Create sustainable systems and services through automation and uplifts.
- Balance feature development speed and reliability with well-defined service level objectives.
Skills and attributes for success
- Experience with distributed systems and microservices architecture.
- Prior involvement with high-scale, high-availability systems.
- Familiarity with database management and SQL/NoSQL databases.
- Certifications in cloud technologies or SRE methodologies.
To qualify for the role, you must have
- Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent experience.
- Strong experience with Linux/Unix systems and a good understanding of system performance areas.
- Proficiency in one or more of the following: Go, Python, Ruby, Shell scripting.
- Experience with cloud services (e.g., AWS, GCP, Azure) and cloud infrastructure automation tools (e.g., Terraform, Ansible).
- Knowledge of containerization and orchestration technologies (e.g., Docker, Kubernetes).
- Familiarity with continuous integration and deployment methodologies (CI/CD).
- Experience with monitoring and log aggregation tools such as ELK, Prometheus, Grafana, and Splunk.
- Understanding of networking concepts (e.g., DNS, HTTP, TCP/IP) and load balancing.
- Strong problem-solving skills and ability to work under pressure.