Cloud and Observability Engineer - DevOps
Coralogix
Job Description
Responsibilities:
- Extension Delivery: Build and enhance high-quality extension packages of alerts, dashboards and parsing rules in the Coralogix platform to improve the monitoring experience for key services using our platform. This entails:
  - Researching what it takes to build world-class extensions, including for container technologies and services from cloud service providers
  - Building the related alerts and dashboards in Coralogix, validating their accuracy and consistency, and writing detailed overviews and documentation for them
  - Configuring parsing rules in Coralogix using regex to structure the data as per requirements
  - Building packages as per Coralogix methodology and standards, and automating the ongoing process with scripting
- Support internal stakeholders and customers with queries, issues and feedback on deployed extensions
- Migration Delivery: Help migrate customer alerts, dashboards and parsing rules from leading competitive observability and security platforms to Coralogix
- Knowledge Management:
  - Build, maintain and evolve documentation covering all aspects of extensions and migration
  - Conduct training sessions for internal stakeholders and customers on all aspects of platform functionality (alerts, dashboards, parsing, querying, etc.), the migration process and techniques, and extension content
- Collaborate closely with internal stakeholders and customers to understand their specific monitoring needs, gather requirements, and ensure alignment during the extension building process
Requirements:
- Professional Experience: 3+ years of experience as a Systems Engineer, DevOps Engineer, or in a similar role, with a focus on monitoring, alerting, and observability solutions.
- Cloud Technology Experience: 2+ years of hands-on experience with, and understanding of, cloud and container technologies (GCP/Azure/AWS + Kubernetes/EKS/GKE/AKS). Cloud service provider DevOps certifications are a plus.
- Observability Expertise: Good knowledge of and hands-on experience with two or more observability platforms, including alert creation, dashboard creation, and infrastructure monitoring. Researching the latest industry trends is part of the scope.
- Deployments & Automation: Good understanding of CI/CD, with experience in at least one deployment tool and one version control tool. Engineers need to package alerts and dashboards as extension packs on an ongoing basis.
- Grafana & PromQL Proficiency: Basic understanding of and practical experience with PromQL, Prometheus's query language, for querying metrics and creating custom dashboards. The engineer will also need to learn DataPrime and Lucene syntax on the job.
- Troubleshooting Skills: Excellent problem-solving and debugging skills to diagnose issues, identify root causes, and propose effective solutions.
- Communication Skills: Strong English verbal and written communication skills to collaborate with the customer's cross-functional teams, deliver training sessions, and create clear technical documentation.
- Analytical Thinking: Ability to analyze complex systems, identify inefficiencies or gaps, and propose optimized monitoring solutions.
- Availability: Ability to work across US and European time zones.
- This is a work-from-office role.