Site Reliability Engineer III
HSBC
Job Description
In this role, you will:
- Strong experience in product design, configuration, code deployment, performance tuning, and issue resolution
- Experience with, and an affinity for, improving team performance
- Active listening skills
- Mindsets and Behaviors/Self-mastery
- Proven experience in Compute, OpenShift, Kubernetes, Hypervisors, Storage, Windows, Networks and Linux
- Accountability for the control and compliance of the engineering process.
- Promote innovation and adoption of cutting-edge specialist technologies and practices within the domain.
- Promote development of engineers through coaching and mentoring.
- Consult as required in other areas to assist and provide a different perspective to programmes or projects that require it.
To be successful in this role, you should meet the following requirements:
- Hands-on experience with OpenShift and Kubernetes administration.
- Understanding of distributed systems and common distributed system failure domains.
- Experience managing a production service with Red Hat, Windows and ESXi.
- Strong knowledge of Linux systems and networking.
- Experience with monitoring, logging, alerting and observability tools (e.g., OpenTelemetry, Prometheus, Grafana, Splunk).
- Proficiency in scripting languages and automation tools (e.g., Python, Shell, Go, Terraform).
- Familiarity with CI/CD tools (e.g., Jenkins, GitLab CI).
- Understanding of containerization (Docker) and microservices architecture.
- Experience with Ansible for configuration management and deployment.
- Good problem-solving and communication skills.