Staff Reliability Engineer
gehealthcare
Job Description
Roles and Responsibilities
In this role, you will:
- Collaborate with stakeholders to define and document strategies for product performance, scalability, and reliability, with a focus on cost efficiency.
- Operationalize and execute the defined strategies effectively.
- Align planning and execution with program timelines to ensure timely delivery.
- Proactively report on key metrics related to performance, scalability, and reliability.
- Participate in technical discussions, contribute to design reviews, and present ideas through whiteboarding and other collaborative methods.
- Deliver results in a fast-paced, agile environment, ensuring alignment with product release goals.
- Provide and seek feedback on design and development to drive continuous improvement.
- Make informed decisions on technologies and tools related to performance, scalability, and reliability through thorough evaluation and impact analysis.
- Develop a deep understanding of the product architecture, including module interdependencies, while specializing in performance, scalability, and reliability.
- Provide technical leadership in defining, developing, and evolving performance and reliability testing frameworks using modern technologies across cloud and on-premises environments.
- Define metrics, logs, and error-tracking mechanisms to measure and improve system observability, performance, and reliability.
- Design and implement resiliency frameworks to simulate real-world failure scenarios and enhance system robustness.
Education Qualification
- Bachelor’s degree in computer science or another STEM field.
- Minimum of 10 years of experience.
Desired Characteristics & Technical Expertise:
- Hands-on experience creating and executing automation scripts in JMeter with protocols such as Selenium WebDriver, Playwright, or Java Sampler.
- Ability to write code in Core Java or Python.
- Ability to learn new technologies and implement solutions that meet product goals.
- Must have experience with at least one of these tools: AWS FIS, Chaos Mesh, or Chaos Toolkit.
- Must have experience working with Docker and Kubernetes.
- Ability to perform in-depth performance analysis using metrics and logs for applications deployed on AWS Cloud.
- AWS certified.