Software Engineering

ripplehire

Bengaluru, India | 5 Years Exp | Posted 1h ago

Job Description

System Reliability & Availability: Design and maintain fault-tolerant, high-availability architectures across AWS, Azure, and GCP. Implement redundancy, load balancing, and automated failover strategies.

Cloud Infrastructure Management: Deploy, manage, and optimize cloud resources using IaC tools such as Terraform and Ansible.

Monitoring & Observability: Implement monitoring and logging frameworks using Splunk, Azure Monitor, Dynatrace, AWS CloudWatch, or similar tools to detect and resolve issues proactively.

Incident Management: Lead real-time incident response, root-cause analysis, and postmortems to continuously improve uptime and resilience.

Capacity Planning & Scaling: Predict traffic patterns, optimize resource utilization, and enforce autoscaling and performance best practices.

Automation & Tooling: Develop scripts and internal tooling to automate routine tasks and reduce manual intervention. Languages may include Python, PowerShell, or Bash.

Security & Compliance: Collaborate with security teams to implement secure infrastructure practices, including encryption, role-based access, auditing, and vulnerability management.

Collaboration & Mentorship: Work across engineering and DevOps teams, providing guidance on reliability best practices and mentoring junior SREs.
