Site Reliability Engineer
ascendion
Job Description
Develop,
deploy, and maintain scalable and highly available systems on Kubernetes.
Design and implement automation
processes for system deployments and scaling.
Monitor system performance,
troubleshoot issues, and ensure continuous system improvements.
Collaborate with development teams
to enhance the infrastructure required for their needs, including CI/CD
pipelines.
Respond to and resolve operational
incidents, providing comprehensive incident reports and leading
post-mortems.
Manage code deployments, fixes,
updates, and related processes on multiple environments.
Required Skills:
Strong experience with
containerization and orchestration using Docker and Kubernetes.
Proficient in the use of
infrastructure as code (IaC) tools, particularly Terraform.
Familiarity with artifact
repositories, such as Artifactory, for maintaining build versions.
Ability to write and maintain
scripts in languages such as Python, Bash, or similar for automation
tasks.
Solid understanding of CI/CD principles
and experience in setting up and maintaining pipelines.
Excellent problem solving abilities
with a strong emphasis on troubleshooting and incident resolution
Outstanding communication skills,
capable of effectively collaborating with cross-functional teams.
Monitoring and incident response
with expertise in tools like Kubernetes, Terraform, and scripting languages.