Site Reliability Engineer
Morningstar
Job Description
Basic Qualifications:
- Bachelor's degree or higher, or previous experience in a technical support role.
- 1-2 years of experience.
Responsibilities:
- Maintain and support the products and data systems: proactively monitor events, investigate issues, analyze solutions, and drive problems through to resolution.
- Engage in continuous improvement and innovation within the team.
- Define requirements and develop tools, alerting, and reporting as needed.
- Work with the product team to define application hardening measures and identify opportunities for chaos engineering.
- Use operational tools and monitoring platforms to gain in-depth knowledge and understanding of system availability, performance, and capacity.
- Work with business partners to establish Service Level Indicators and Objectives (SLIs and SLOs).
- Implement an alerting strategy that makes alerts actionable and unique.
- Provide follow-through to ensure issues are resolved to satisfaction.
- Assist with server management and support.
- Implement best practices, develop efficiencies, and improve the department’s scalability.
- Participate in an on-call rotation.
- Work closely with other groups to resolve problems, deploy and release new products, and create solutions that provide world-class service and support.
- Embrace collaboration and open communication, and reach across functional borders.
Requirements:
- Basic understanding of build, release, and configuration management practices.
- Strong analytical and interpersonal skills, with clear written and verbal communication.
- Passionate about staying current on trends and best practices in software development, DevOps, and Site Reliability Engineering.
- Knowledge of IT environments, technologies, and system hosting infrastructure, including cloud providers and their platforms.
- General knowledge of source control tools (Git, CodeCommit, etc.) and branching strategies.
- General knowledge of build/release tools (Jenkins, uDeploy, CodeBuild, CodeDeploy, CodePipeline, etc.).
- Basic understanding of monitoring strategies and tools.
Bonus Qualifications:
- Demonstrable understanding of Infrastructure as Code (Chef, Puppet, Ansible, SaltStack, Terraform).
- Experience programming in one or more popular languages such as Python, Java, C#, JavaScript, Ruby, PowerShell, etc.
- System administration experience with Windows.
- Working experience with Amazon Web Services.