Observability Engineer

Standard Chartered

Bangalore 5 Years Exp Posted 229d ago

Job Description

Job Summary

•    As an Observability Engineer specializing in the Grafana stack, you will play a critical role in making the internal state of the market's infrastructure and services visible to stakeholders for troubleshooting, performance analysis, capacity planning, and reporting, with a focus on telemetry solutions.
•    You will develop platforms and tooling to enable developers and operators to efficiently trace performance problems to their source and map their application performance to business objectives using traces.
•    You will assist teams in instrumenting their applications and systems to generate and utilize traces. 
•    You will engineer the standardization and adoption of observability tools for the Infrastructure departments including Platform, Database, Reliability, and Cloud Operations teams, as well as developer teams.

Key Responsibilities

•    Design and build an observability infrastructure for all engineering teams to consume.
•    Develop and improve instrumentation for monitoring and logging the health and availability of services.
•    Design and develop tools for metric collection, analysis, and reporting.
•    Educate and lead efforts to improve observability among all engineering teams.
•    Work with teams to enable an effective and pleasant on-call experience.
•    Identify and collect the appropriate measurements, and write the correct queries, to produce intuitive and insightful visualizations that characterize the behavior of complex systems.
•    Build a metrics pipeline with end-to-end latency under 5 minutes.
•    Integrate logs with time series data for event correlation.
•    Help us unlock the power of distributed tracing.
•    Proactively monitor systems, networks, and applications to provide input in improving the stability, security, efficiency, and scalability of systems.

Our ideal candidate would have:
•    Familiarity with the Grafana tech stack: Loki, Grafana, Tempo, Mimir/Prometheus
•    In-depth experience designing at-scale monitoring and logging for corporate infrastructure services.
•    5 years of experience working in Monitoring / Observability / SRE / DevOps / Performance Tuning.
•    Experience working with cloud infrastructures, particularly Kubernetes and AWS.
•    Experience with Git/version control solutions.
•    Experience with programming languages, primarily Go, Rust, Java, and Python.
•    Experience with CI/CD tools such as Azure Pipelines and Jenkins.
•    Expert-level experience in monitoring and logging technologies, both open source
