DevOps Lead
proximus
Job Description
The responsibilities may include, but are not limited to:
- Manage and support hybrid infrastructure environments across on-premises and AWS platforms, including Red Hat OpenShift Container Platform (OCP).
- Deploy, monitor, and troubleshoot containerized workloads on Kubernetes and OpenShift clusters.
- Handle networking configurations and troubleshooting across cloud and on-premises environments including VPCs, routing, DNS, load balancers, firewalls, VPNs, and ingress traffic management.
- Handle Kubernetes/OpenShift cluster administration activities including scaling, upgrades, networking, storage, ingress, and workload management.
- Collaborate with infrastructure and application teams for migration and deployment of microservices across cloud and on-premises Kubernetes/OCP environments.
- Implement Infrastructure as Code (IaC) using Terraform and manage AWS infrastructure effectively.
- Troubleshoot complex system issues, including API tracing, log analysis, and performance bottlenecks in a microservices architecture.
- Collaborate with cross-functional teams to implement security best practices, including secret management, IAM policies, and network security configurations.
- Ensure Fault Management, Configuration Management, and Performance Management for production systems.
- Provide hands-on support for cloud-based services on AWS and on-prem Datacenter environments.
Job Profile:
- 8-12 years of experience in DevOps or Cloud Engineering roles.
- Hands-on experience with Kubernetes (K8s) and Red Hat OpenShift Container Platform (OCP) administration and operations.
- Strong understanding of Kubernetes/OpenShift architecture, cluster management, networking, ingress controllers, persistent storage, and security policies.
- Knowledge of OpenShift CI/CD integrations, container registry management, and deployment automation.
- Strong networking knowledge including TCP/IP, DNS, HTTP/HTTPS, Load Balancers, Reverse Proxies, VPNs, Routing, NAT, Firewalls, Security Groups, and network troubleshooting.
- Strong experience in AWS (EC2, S3, RDS, DynamoDB, Aurora, Lambda, API Gateway, SQS, ElastiCache, Fargate, Route 53, Secrets Manager, VPCs, Security Groups).
- Strong knowledge of Linux systems administration, shell scripting, and database queries (PostgreSQL, MongoDB, or similar).
- Experience with CI/CD tools and automation frameworks.
- Knowledge of security practices and vulnerability management in cloud environments.
- Experience in troubleshooting Kubernetes/OpenShift production issues including pod failures, resource bottlenecks, networking, and cluster performance tuning.
- Familiarity with incident management tools, monitoring, logging, and alerting systems.
- Prior experience working on messaging or communication platforms is a plus.
- AWS Associate-level certification is mandatory.
- Strong team player with good communication, negotiation, and problem-solving skills.
- Must be available on weekends for emergency duty, assigned on a rotation basis.