Site Reliability Engineer 3
Oracle
Job Description
Responsibilities
- Oversee development and operations processes. Collaborate with Dev/Test teams to integrate and test code in lower environments, then promote releases to higher environments using promotion tools.
- Identify opportunities for automation, design and implement robust deployment pipelines, and develop scripts and tools to automate repetitive tasks, reducing manual effort and minimizing the risk of human error.
- Implement best practices for monitoring, logging, and alerting to proactively identify and address potential issues before they impact users.
- Understand and communicate the scale, capacity, security, performance attributes, and requirements of the service and technology stack.
- Demonstrate a clear understanding of automation and orchestration principles. Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs).
- Use a deep understanding of service topology and dependencies to troubleshoot issues and define mitigations.
- Understand and explain the effect of product architecture decisions on distributed systems.
- Show professional curiosity and a desire to develop a deep understanding of services and technologies.
Technical Skills
- 5+ years of experience in infrastructure engineering or DevOps roles
- Proficiency in scripting languages such as Bash, Python, or PowerShell for automating tasks and managing infrastructure.
- Strong background in Linux
- Experience with containerization (Docker, Kubernetes)
- Hands-on experience with Kubernetes, including deployment and management
- Familiarity with Helm for managing Kubernetes applications and deployments
- Familiarity with monitoring and logging technologies (e.g., Prometheus, Grafana, Splunk)
- Experience troubleshooting Linux and Kubernetes environments during deployments
- Deep knowledge of networking (TCP, UDP, DNS, DHCP, IPsec)
- Experience with Terraform
- Hands-on expertise with at least one cloud platform (AWS, OCI, Azure)
- Thorough understanding of DevOps culture and Agile methodology
- Ability to work effectively in a collaborative, cross-functional team environment