Site Reliability Engineer
IBM
Job Description
Key Responsibilities:
- Develop and enhance your technical knowledge via projects and assignments, as well as through IBM’s world-class learning platform.
- Automate repetitive tasks, processes, and workflows to increase efficiency and reduce human error.
- Set up and maintain monitoring and reporting to uphold compliance posture for IBM Cloud services.
- Look for enhancements and innovative solutions to help the services scale and to improve existing technical support tools, procedures, or processes.
- Identify and investigate issues, using troubleshooting techniques to drive them toward resolution.
- Proactively identify issues and improvement opportunities.
- Ensure software meets all requirements for quality, security, modifiability, extensibility, etc.
- Collaborate with other professionals to determine functional and non-functional requirements for new software or applications.
- Work in a global team, collaborating with IBMers to share recommendations, solutions, and ideas.
Required Technical and Professional Expertise:
- Experience in programming with Python or Go
- General scripting skills in at least one language
- Understanding of Cloud/DevOps/SRE engineering
- Knowledge of automation and configuration management tools
- Ability to manage multiple tasks while ensuring that commitments and timetables are met
- Excellent written and verbal communication skills, as well as flexibility to work with team members in other time zones