Associate DevOps Platform Kubernetes Engineer
athenahealth
Job Description
Primary Responsibilities:
Kubernetes Deployment & Automation
· Design, deploy, and manage highly available and scalable Kubernetes clusters on AWS EKS using Terraform and/or Crossplane.
· Implement Infrastructure-as-Code (IaC) best practices for managing EKS clusters and related infrastructure.
Kubernetes Operations & GitOps
· Configure and maintain Kubernetes deployments, services, ingresses, and other resources using YAML manifests or GitOps workflows.
· Implement GitOps practices with FluxCD for automated deployments and configuration management of containerized applications.
Reliability, Security & Scalability
· Proactively ensure the reliability, security, and scalability of AWS production systems, with a particular focus on Kubernetes clusters and containerized applications.
· Resolve complex problems across multiple platforms and application domains using advanced system troubleshooting techniques.
Operational Support & Monitoring
· Provide primary operational support and engineering expertise for all cloud and enterprise deployments, with a focus on Kubernetes.
· Monitor system performance, identify downtime incidents, and diagnose their underlying causes, particularly those related to Kubernetes cluster and container health.
Cost Optimization
· Design and develop cost-effective Kubernetes solutions within allocated budgets, ensuring efficient resource utilization.
Secondary Responsibilities:
Collaboration & Process Improvement
· Work closely with developers, testers, and system administrators to ensure smooth deployment and operation of containerized applications.
· Champion the implementation of new processes, tools, and methodologies to enhance efficiency throughout the software development lifecycle (SDLC) and pipeline management.
Security Integration
· Integrate robust security measures into the development lifecycle, considering the specific security requirements of containerized applications.
Typical Qualifications:
· 3 to 5 years of experience building, scaling, and supporting highly available systems and services.
· 2+ years of experience managing and operating Kubernetes clusters in production.
· Proven experience in building and managing AWS platforms, with a strong focus on Amazon EKS (Elastic Kubernetes Service).
· Deep knowledge of Kubernetes architecture, core concepts, best practices, and security considerations.
· Expertise in Infrastructure-as-Code (IaC) tools such as Terraform and Crossplane.
· Familiarity with GitOps principles; experience with FluxCD is a plus.
· Proficiency in at least one scripting/programming language (Python, Go, Ruby, Shell).
· Experience applying Site Reliability Engineering (SRE) and DevOps principles, including CI/CD and version control (Bitbucket, GitHub, etc.).
· Familiarity with telemetry, observability, and modern monitoring tools (Prometheus, Alertmanager, Grafana, etc.), particularly for Kubernetes monitoring.
· Strong expertise in establishing system visibility that enables rapid detection and resolution of issues within Kubernetes clusters.