Observability Engineer
Deloitte
Job Description
Skills / Project Experience:
- 7+ years of experience in observability, DevOps, and SRE.
- 5+ years of experience with AWS services (Lambda, databases, VPC, Route 53, EKS), Docker, and CI/CD tools.
- Hands-on expertise with Dynatrace monitoring, log analytics (Grail) and alerting.
- Experience with Terraform for infrastructure as code (IaC).
- Expertise in incident response and on-call management using BigPanda, Opsgenie, and ServiceNow.
- Architect and implement enterprise-wide Dynatrace observability strategies.
- Lead performance engineering, system reliability, and continuous monitoring improvements.
- Define and enforce SLI, SLO, and SLA policies across applications and infrastructure.
- Drive incident response automation, ensuring P1 and P2 incidents are resolved efficiently.
- Reduce MTTD and MTTR by leveraging AI-driven Dynatrace Davis insights.
- Serve as a technical advisor and escalation point for complex performance issues.
- Collaborate with stakeholders, IT leadership, and DevOps teams to enhance observability practices.
- Mentor junior engineers and conduct training sessions on Dynatrace best practices.
Must Have:
Observability & Monitoring
• Implement monitoring and dashboarding solutions using Dynatrace, ensuring real-time visibility into applications, infrastructure, and services.
• Set up log monitoring using Dynatrace Grail, ensuring comprehensive log analysis and correlation.
• Define and configure custom and default alerts in Dynatrace to detect anomalies and system issues proactively.
• Develop static and dynamic alerting mechanisms to minimize noise while ensuring prompt incident detection.
• Integrate monitoring solutions with BigPanda, enabling AI-driven event correlation and incident response.