Senior Site Reliability Engineer

nordex

Chennai, India | 8 Years Exp | Posted 10d ago

Job Description

  • Strong understanding of Java applications, JVM behavior, memory management, garbage collection, and tuning
  • Ability to read and debug Java services to support incident response
  • Strong experience deploying, operating, and debugging workloads on Microsoft Azure, including:
    • Azure Kubernetes Service
    • Azure Virtual Machines
    • Azure Application Gateway / Load Balancers
    • Azure Monitor, Log Analytics, Alerts, Dashboards
    • Azure Key Vault
    • Azure Networking basics (VNets, subnets, NSGs, Private Endpoints)
  • Experience with observability stacks such as Prometheus + Grafana, OpenTelemetry, ELK, Loki, or Azure-native logging
  • CI/CD pipelines for Java applications
  • Canary releases, rolling updates, blue/green deployments
  • Automated rollback mechanisms
  • Artifact storage and versioning
  • Experience defining SLIs, SLOs, and SLAs for Java services
  • Strong communication during incidents (clear, calm, structured)
  • Ability to collaborate with Java developers, DevOps, and platform teams
  • Documentation writing (runbooks, RCAs, reliability guidelines)
  • Continuous improvement mindset
