Director - Software Engineering
oraclecloud
Job Description
- Lead a team of engineers and grow the technical backbone of our organization.
- Implement software development practices to build observability, alerting, tracing, automation and self-healing capabilities to maintain the highest levels of platform availability.
- Tune the performance and improve the reliability of the infrastructure stack across both public and private cloud.
- Contribute hands-on to enterprise solutions, tooling, and initiatives, drawing on your technical experience.
- Nurture an environment of innovation and continuous improvement, leading changes that drive efficiencies into existing engineering and delivery processes.
- Lead experimentation and proofs of concept with new open-source technologies to solve observability, testing and resiliency challenges. Influence technology adoption for the Customer Journey organization and broader company platforms.
- Implement shift-left automated testing to prevent defects from reaching production.
- Ensure all new critical subsystems, microservices, databases and external calls meet the five-nines (99.999%) availability requirement.
- Provide consultation on all significant functionality changes and peer-review critical production hotfixes.
- Conduct technical code reviews and drive innovation across the organization to adopt industry best practices.
- Be part of a global operations team that supports a 24/7 model, including a willingness to work holidays and weekends.
Qualifications
Minimum Qualifications:
- 8+ years of experience in large-scale software engineering, reliability engineering, or production operations within complex, distributed environments.
- Proven leadership experience managing global, geographically distributed teams providing 24x7x365 production support for mission-critical systems.
- Strong technical foundation in distributed systems, cloud-native architectures, and high-availability platforms.
- Demonstrated ability to drive engineering-led operations, including reliability engineering, incident management, and operational excellence at scale.
- Hands-on experience with automation, infrastructure-as-code, CI/CD pipelines, and observability platforms (monitoring, logging, tracing).
- Experience designing or operating systems that leverage predictive insights, automation, and self-healing patterns to reduce operational toil.
- Ability to define and govern SLOs, SLIs, and error budgets, and to embed non-functional requirements into system design.
- Strong people leadership skills, including building high-performing teams, mentoring leaders, and fostering a culture of accountability, learning, and psychological safety.
- Proven ability to influence and partner effectively with engineering, product, platform, risk, and executive stakeholders.
- Excellent communication skills with the ability to articulate complex technical and operational topics to both technical and non-technical audiences.
- Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience.