Site Reliability Engineer – Compute Operations
IBM
Job Description
Required technical and professional expertise
• Software and Systems Knowledge: Exposure to software and systems engineering principles, with an understanding of how to design, build, test, and deploy reliable and resilient systems.
• Problem Analysis and Solving: Experience applying problem determination techniques to analyze complex issues and deliver effective solutions that improve system reliability and resiliency.
• System Maintenance and Deployment: Exposure to system maintenance and deployment practices, including testing and validation to ensure smooth and efficient system operation.
• Business Needs Analysis: Experience analyzing business requirements, identifying areas for improvement, and recommending ways to enhance system reliability and resiliency.
• Information Systems and Ecosystems: Exposure to designing, building, and maintaining well-engineered information systems and ecosystems, with a focus on reliability and resiliency.
Preferred technical and professional experience
• Advanced Scripting Skills: Exposure to scripting languages, such as Python or Perl, for automating system maintenance and deployment tasks.
• Cloud Computing Knowledge: Experience working with cloud computing platforms, including designing and deploying scalable and resilient systems.
• IT Service Management: Exposure to IT service management frameworks, such as ITIL, to ensure alignment with industry best practices for system maintenance and deployment.
Years of Experience: 7-12