Junior Site Reliability Engineer
Nextiva
Job Description
Key Responsibilities
- Triage, troubleshoot, and fix production problems in every layer of the stack, with a focus on Oracle and billing systems
- Design, develop, improve, and tune logging, monitoring, and alerting
- Create actionable alerts that surface problems early enough to prevent system outages
- Write software to improve reliability and recoverability of production systems
- Identify manual work, document the fix in the form of a runbook, then automate it away
- Perform and automate system administration tasks
- Participate in 24/7 on-call rotation supporting production systems
Qualifications
- Bachelor’s degree in Computer Science or related field, or equivalent work experience
- 0-2 years of Oracle systems experience
- 0-2 years of software development experience
- 0-2 years of Linux system administration experience
- 0-2 years of performance engineering experience
- Understanding of, and experience working with, RESTful APIs
- Experience triaging and troubleshooting complex systems
- Experience working with source control
- Experience with containerization and container orchestration
- Experience with application performance monitoring
- Experience with web technology components, including relational (SQL) databases, Apache, Tomcat, Java, and packet monitoring
- Experience with microservice environments and distributed systems
- Familiarity with front-end technologies
- Ability to clearly communicate technical concepts
- Understanding of general SRE concepts and DevOps principles
- Familiarity with SIP concepts and troubleshooting