Senior PlatformOps Manager
JFrog
Job Description
As the Senior PlatformOps Manager at JFrog, your responsibilities will include:
- Leading a dedicated team of PlatformOps engineers to manage 24/7/365 operations and service delivery of JFrog SaaS and platform services, ensuring continuous availability, performance and reliability.
- Collaborating closely with SRE, Product and Cloud platform engineering teams to identify issues, implement enhancements and ensure operations adhere to SaaS uptime and SLO commitments.
- Owning SaaS incident management, trend optimization and the maintenance of runbooks and SOPs while driving continuous improvements in incident detection, response and resolution processes.
- Planning and executing SaaS service capability rollouts with sharp attention to detail, ensuring clear definitions of done and acceptance criteria are met.
- Envisioning and executing the PlatformOps roadmap towards establishing the JFrog SaaS command center, expanding on the service charter and portfolio offerings.
- Facilitating the onboarding and integration of new service operations and instrumentation.
- Leading initiatives to reduce toil and drive automation efforts within the team.
- Mentoring and coaching PlatformOps tech leads and engineers on technical matters and professional development, fostering a culture of excellence and continuous improvement.
- Reviewing and periodically publishing operational KPIs to monitor team performance and service levels.
To be a Senior PlatformOps Manager at JFrog, you should have:
- A Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
- A minimum of 5 years of techno-functional managerial experience in CloudOps, SaaSOps, NOC or a comparable role in a SaaS environment.
- Practical experience with at least one cloud infrastructure provider (AWS, GCP, Azure), container technologies (Kubernetes, Docker) and a programming language (Python/Go).
- Familiarity with GitOps, standard CI/CD practices and source control management tools (e.g., Git, Bitbucket) for automation initiatives is a plus.
- Proficiency in alerting and observability tools (Opsgenie, New Relic, Coralogix, etc.) and familiarity with AIOps technologies.
- Experience working in mission-critical environments and the ability to manage multiple priorities under pressure in a technically challenging setting.
- Demonstrated strong leadership and collaboration skills, with experience working effectively across globally distributed teams, coupled with excellent communication abilities to engage with both business and technical senior management.
- Proven analytical and troubleshooting skills in incident management, monitoring, and performance tuning of cloud-native SaaS systems.
- A proactive mindset focused on continuous improvement and effective problem-solving.
- Availability for on-call after-hours support as needed.