Staff Reliability Engineer
GE HealthCare
Job Description
Roles and Responsibilities
In this role, you will:
-
Collaborate with stakeholders to define and document strategies for product performance, scalability, and reliability, with a focus on cost efficiency.
-
Operationalize and execute the defined strategies effectively.
-
Align planning and execution with program timelines to ensure timely delivery.
-
Proactively report on key metrics related to performance, scalability, and reliability.
-
Participate in technical discussions, contribute to design reviews, and present ideas through whiteboarding and other collaborative methods.
-
Deliver results in a fast-paced, agile environment, ensuring alignment with product release goals.
-
Provide and seek feedback on design and development to drive continuous improvement.
-
Make informed decisions on technologies and tools related to performance, scalability, and reliability through thorough evaluation and impact analysis.
-
Develop a deep understanding of the product architecture, including module interdependencies, while specializing in performance, scalability, and reliability.
-
Provide technical leadership in defining, developing, and evolving performance and reliability testing frameworks using modern technologies across cloud and on-premises environments.
-
Define observability metrics, logs, and error-tracking mechanisms to measure and improve system observability, performance, and reliability.
-
Design and implement resiliency frameworks to simulate real-world failure scenarios and enhance system robustness.
Education Qualification
-
Bachelor’s degree in Computer Science or another STEM field.
-
Minimum of 10 years of experience.
Desired Characteristics & Technical Expertise:
-
Hands-on experience creating and executing automation scripts in JMeter using samplers and drivers such as Selenium WebDriver, Playwright, or the Java Sampler.
-
Ability to write code in Core Java or Python.
-
Ability to learn new technologies and implement solutions that meet product goals.
-
Must have experience with one of these tools: AWS FIS, Chaos Mesh, or Chaos Toolkit.
-
Must have experience working with Docker and Kubernetes.
-
Ability to perform in-depth performance analysis using metrics and logs for applications deployed on AWS.
-
AWS certified.
Business Acumen:
-
Strong problem-solving abilities and the ability to articulate specific technical topics or assignments.
-
Expert in breaking down problems and estimating time for development tasks.
Leadership:
-
Demonstrates clarity of thinking to work through limited information and vague problem definitions.
-
Influences through others; builds direct and "behind the scenes" support for ideas.
-
Proactively identifies and removes project obstacles or barriers on behalf of the team.
-
Shares knowledge, power, and credit, establishing trust, credibility, and goodwill.