Sr DevOps
turbohire
Job Description
Key Responsibilities
Infrastructure Design & Automation
- Architect and maintain hybrid and on-premises environments for compute and storage workloads.
- Deploy and manage containerized microservices using Kubernetes (on-prem and cloud).
- Automate infrastructure provisioning using Terraform, Ansible, or similar tools.
- Configure and scale distributed data streaming and storage platforms (e.g., Kafka clusters, Apache Spark, and data lakes/warehouses).
Scalability & Performance
- Design for elastic scaling, load balancing, and resource isolation for data services and APIs.
Security & Governance
- Secure hybrid deployments with VPNs, firewalls, and identity management.
- Implement data access controls, audit trails, and encryption across services and storage layers.
- Manage secrets securely across environments.
Monitoring, Logging & Resilience
- Set up observability stacks (Prometheus, Grafana, ELK, OpenTelemetry) for real-time insight.
CI/CD & Operations
- Build pipelines to deploy services, data jobs, and configuration changes to hybrid infrastructure.
- Maintain and enhance CI/CD pipelines for zero-downtime deployments and rollbacks.
- Support SRE practices including incident management, post-mortems, and runbook creation.
Required Skills
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- 8+ years in DevOps and SRE roles.
- Strong experience with hybrid and on-prem Kubernetes deployments.
- Proficiency in infrastructure automation (Terraform, Ansible, Helm).
- Experience with data processing and microservices orchestration, supporting real-time data platforms in hybrid or on-premises environments.
- Solid understanding of networking, storage, firewalls, and security for hybrid deployments.
- Programming/scripting experience with Bash, Python, or Go.
- Strong grasp of DevOps best practices, observability, and site reliability engineering.
- Experience with AWS or Azure and with on-premises environments.