Lead Engineer, Senior
Qualcomm
Job Description
- Develop and run long-duration stability tests (MTBF/soak) and stress/fuzzing frameworks.
- Implement fault-injection and recovery validation for runtime and serving layers.
- Build automation pipelines and workload generators for diverse AI models.
- Instrument metrics and observability tools; perform root-cause analysis (RCA) for complex failures.
- Collaborate with cross-functional teams to improve stability and release readiness.
Qualifications
- Bachelor’s/Master’s in CS/EE or equivalent.
- 4+ years of experience.
- Strong systems background: Linux, memory management, concurrency.
- Hands-on with AI inference frameworks (e.g., vLLM, ONNX Runtime, TensorRT).
- Proficiency in Python/C++ for automation and debugging.
- Experience in stability testing, fuzzing, and fault-injection.
Preferred:
- Familiarity with sanitizers (KASAN/ASAN), Valgrind, CI/CD, Kubernetes.
- Knowledge of quantization, PEFT/LoRA, and multi-tenant serving strategies.