Cloud Site Reliability Engineer

ZS

Pune, 1 Year Exp, Posted 2h ago

Job Description

  • Analyze the current state, design appropriate solutions and work with the team to implement them
  • Coordinate emergency responses, perform root cause analysis, and identify and implement solutions to prevent recurrence
  • Work with the team to identify ways to increase MTBF and lower MTTR for the environment
  • Review the entire stack of each application and execute initiatives to reduce failures, defects and overall performance issues
  • Identify more efficient system procedures and work with the team to implement them
  • Maintain environment monitoring systems to provide the best visibility into the state of the deployed products/solutions
  • Perform root cause analysis on incoming infrastructure alerts and work with teams to resolve them
  • Maintain performance analysis tools, identify any adverse changes to performance and work with the teams to resolve them
  • Research industry trends and technologies, and promote adoption of best-in-class tools and technologies
  • Take the initiative to advance the quality, performance or scalability of our Cloud Solutions by influencing the architecture or design of our products
  • Design, develop and execute automated tests to validate solutions and environments
  • Troubleshoot issues across the entire stack – infrastructure, software, application and network

What you’ll bring:

  • 1+ years’ experience working as a Site Reliability Engineer or an equivalent position
  • 1+ years’ experience with AWS cloud technologies; at least one AWS certification (Solutions Architect / DevOps Engineer) is required
  • 1+ years’ experience functioning as a senior member in an infrastructure/software team
  • Hands-on experience with AWS services like CodeBuild, Config, Systems Manager, Service Catalog, Lambda, etc.
  • Full-stack IT experience with *nix, Windows, network/firewall concepts, source control (Bitbucket), build/dependency management and continuous integration systems (TeamCity, Jenkins)
  • Expertise in at least one scripting language, Python preferred
  • Firm understanding of networking concepts and technologies
  • Exposure to the big data technology stack (Spark, Hadoop, Scala, etc.) is preferred
  • Good understanding of RDBMS and cloud database engines like PostgreSQL, MySQL, etc.
  • Basic understanding of clusters, load balancers and CDNs
  • Experience in fault-tolerant system design
  • Familiarity with Splunk data analysis, Dynatrace APM, or similar tools is a plus
  • A Bachelor’s degree (Master’s preferred) in a related technical field
  • Excellent analytical, troubleshooting and communication skills
  • Strong verbal, written and team presentation skills. ZS is a global firm; fluency in English is required
  • This role requires healthy doses of initiative and the ability to remain flexible and responsive in a very dynamic environment
  • Ability to quickly learn new platforms, languages, tools and techniques as needed to meet project requirements
  • Client-first mentality
  • Intense work ethic
  • Collaborative spirit and problem-solving approach

 
