Associate Platform Reliability Engineer
Jefferies
Job Description
Key Responsibilities
- Collaborate with a high-performing global team to maintain platform stability across middle-office and operations applications.
- Lead incident triage, root cause analysis, and communication, with a strong focus on problem management.
- Partner with regional teams to drive technical and functional initiatives.
- Identify and eliminate manual support tasks through automation; develop tools for deployment, management, and service visibility.
- Design and implement robust monitoring and alerting systems using platforms such as AppDynamics (AppD) and OpenTelemetry.
- Work closely with engineering teams to support system architecture, schema design, and performance tuning.
- Troubleshoot issues across the full technology stack.
Qualifications
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- Minimum 3 years of experience in programming with Python, Go, C/C++, or C#.
- Strong foundation in computing fundamentals: data structures, algorithms, and software design.
- Experience in application design, maintenance, and support.
- Solid understanding of SRE principles and practices.
- Proficiency in Linux/Unix and Windows Server environments.
- Hands-on experience with scripting, databases, and troubleshooting application/data access issues.
- Self-driven with a strong sense of ownership and commitment to quality.
- Excellent communication skills, able to engage both technical and business stakeholders.
- Familiarity with open-source platforms such as Redis, MongoDB, Kafka, and Elasticsearch.
- Experience configuring observability stacks (Grafana, Prometheus, Jaeger, Loki).
- Exposure to DevOps tools and technologies, including Git, Jenkins, and Ansible.