DevOps Engineer
eightfold
Job Description
We are seeking a DevOps Engineer with 3–7 years of experience to help build, operate, and scale our cloud-native platform and automation ecosystem. You will work on CI/CD modernization, container orchestration, cloud infrastructure engineering, and platform tooling that supports mission-critical services running at scale.
This role is ideal for engineers who have hands-on experience with AWS, Azure, Terraform, Kubernetes, CI/CD frameworks, and observability tooling, and who want to grow into large-scale distributed systems operations and platform engineering. You will collaborate with senior DevOps engineers, SREs, developers, and security teams to deliver reliable, observable, secure, and resilient infrastructure.
Key Responsibilities
Platform Engineering
• Contribute to the design and development of internal developer platforms (IDP), paved paths, and self-service automation.
• Build tools and abstractions to streamline development workflows and improve engineering productivity.
• Help implement standardized deployment patterns, provisioning automation, and golden pipeline templates.
CI/CD Engineering
• Build, maintain, and enhance CI/CD pipelines using GitHub Actions, Jenkins, and Octopus Deploy.
• Implement automated build, test, and release workflows that improve reliability and consistency.
• Investigate and resolve pipeline issues (agent failures, secrets, rollbacks, artifact handling, environment drift).
Cloud Infrastructure (AWS & Azure)
• Deploy and maintain cloud infrastructure with strong operational knowledge of:
- EC2, ECS, EKS, Lambda, ALB, VPC
- S3, RDS (Postgres / MySQL / SQL Server), CodeArtifact
- IAM roles & policies, Secrets Manager, Certificate Manager
- CloudFront, Route 53, Systems Manager, EC2 Image Builder
• Ensure cloud environments follow best practices for security, cost efficiency, reliability, and performance.
Containerization & Orchestration
• Deploy and manage workloads using Docker, ECS, EKS, and Kubernetes.
• Support Helm-based deployments, autoscaling, blue/green or canary rollouts, and container lifecycle troubleshooting.
Infrastructure as Code
• Build and maintain infrastructure using Terraform, including modules, environment separation, remote state, and CI integration.
• Apply IaC best practices such as testing, linting, and drift detection.
Observability & Monitoring
• Configure metrics, logs, traces, and dashboards using New Relic or equivalent APM/infrastructure tools.
• Tune alerting thresholds and integrate alerts with PagerDuty for on-call workflows.
• Contribute to SLO/SLA-based monitoring approaches.
Automation & Scripting
• Develop automation tools and tasks using:
- PowerShell
- YAML-based automation pipelines
- C# (for lightweight internal tooling or integrations)
• Reduce manual operational workload through scripting and orchestration.
Operations & Reliability
• Support production systems across Windows and Linux, including services hosted on IIS.
• Participate in incident response, troubleshooting, performance tuning, and operational readiness checks.
• Contribute to initiatives that reduce MTTR and increase reliability.
Disaster Recovery & Business Continuity
• Support DR processes including backup/restore validation, failover testing, and multi-AZ/multi-region readiness.
• Help maintain documentation and automation related to RPO/RTO targets.
Collaboration & Process
• Work within Jira and ServiceNow for sprint planning, tickets, and incident/change management.
• Collaborate with DevOps, SRE, security, development, and architecture teams to ensure end-to-end delivery quality.
Required Technical Skills
• 3–7 years of DevOps, SRE, platform engineering, or cloud engineering experience.
• Strong experience with CI/CD (GitHub Actions, Jenkins, Octopus Deploy).
• Hands-