Senior CloudOps Engineer
octave
Job Description
• Ensure high availability of production environments by monitoring performance metrics and implementing corrective actions when necessary.
• Ensure that cloud infrastructure adheres to industry best practices for security, including encryption, identity and access management (IAM), and monitoring.
• Monitor cloud usage and optimize resource allocation to control cloud spending.
• Incident Management: Lead incident response efforts, diagnose root causes, and implement long-term solutions to prevent recurrence. Ensure effective communication during outages.
• Troubleshooting and Root Cause Analysis: Investigate and resolve incidents quickly during crisis situations, and perform root cause analysis to prevent recurrence.
• Work closely with the team to build and publish a knowledge base of documentation, including answers to questions frequently asked by customers.