Expert Site Reliability Engineer

Finastra

Bangalore | 8 years' experience | Posted 579d ago

Objectives of this Role

Work in tandem with our engineering team to identify and implement optimal cloud-based solutions for the company.
Define and document best practices and strategies regarding application deployment and infrastructure maintenance.

Provide guidance, thought leadership, and mentorship to development teams to build cloud competencies.
Ensure application performance, uptime, and scale, maintaining high standards of code quality and thoughtful design.
Manage cloud environments in accordance with company security guidelines.
Stay current with industry trends, making recommendations as needed to help the organization innovate and excel.

Responsibilities

Develop, deploy and maintain infrastructure on Azure using Docker and Kubernetes.
Implement automation tools and frameworks (CI/CD pipelines).

Collaborate with team members to improve the company’s engineering tools, systems and procedures, and data security.
Optimize the company’s computing architecture.
Conduct systems tests for security, performance, and availability.
Develop and maintain design and troubleshooting documentation.
Collaborate with the engineering teams to enable their applications to run on Cloud infrastructure.
Debug technical issues in a complex stack involving virtualization, containers, and microservices.
Troubleshoot incidents, identify root causes, fix and document problems, and implement preventive measures.
Employ exceptional problem-solving skills, with the ability to see and solve issues before they snowball into problems.

Requirements

Bachelor’s degree in computer science, information technology, or mathematics.
8+ years of proven experience as a Site Reliability Engineer or in a similar role spanning software development and system administration.
Experience with Docker for containerization and application deployment.
Experience with Kubernetes and Helm for orchestration of Docker containers.
Experience with Azure cloud services and understanding of their offerings and architecture.
Knowledge of databases and operating systems.
Ability to troubleshoot complex software and hardware issues.
Knowledge of best practices related to data encryption and cybersecurity.
Excellent problem-solving and communication skills.
Experience in network, server, and application-status monitoring.
Operating systems – any Linux/Unix flavor
Monitoring – Prometheus, Grafana