SRE Practice Leader
nichehrglobal
Job Description
- Define, develop, and scale SRE strategies, frameworks, and best practices across large, complex environments.
- Drive customer transition from traditional managed services toward engineering-led, reliability-first, automation-driven operating models, including AI-led SRE implementations.
- Architect and build highly available, resilient, scalable, and self-healing systems across distributed and cloud-native environments.
- Establish automation-first approaches for provisioning, configuration management, deployment pipelines, and operational workflows.
- Lead the implementation of advanced observability, including metrics, logs, traces, APM, and modern alerting practices that support proactive reliability.
- Apply cloud-native technologies such as Kubernetes, containers, serverless, and service mesh to build high-performance, decoupled architectures.
- Integrate intelligent automation, AI/ML insights, and automated incident workflows to improve MTTR and reduce manual toil.
- Optimize cloud resource utilization using data-driven approaches to enable cost efficiency, elasticity, and predictive scaling.
- Partner closely with enterprise architecture teams to embed SRE principles into core technology strategies, platforms, and operating models.
- Drive standardization by defining SRE reference architectures, engineering guidelines, runbooks, and reusable patterns.
- Ensure SRE frameworks integrate seamlessly with existing systems, business domains, and modernization roadmaps.
- Evaluate emerging technologies and guide their adoption within engineering and operations ecosystems.
- Participate in presales, showcasing engineering depth through solution proposals, demos, benchmarks, and proofs of concept.
- Advise customers through assessments, maturity roadmaps, and tailored SRE modernization strategies.
- Articulate the business value of reliability engineering, observability, and automation in the context of large-scale transformation programs.
- Lead, mentor, and coach engineering teams to develop deep SRE competencies across automation, observability, performance engineering, and cloud-native practices.
- Foster strong relationships with clients and internal stakeholders.
- Collaborate with cross-functional teams across development, architecture, platform engineering, DevOps, and security to deliver unified outcomes.
- Mentor and guide junior team members.
-