Senior AI Systems Engineer
BambooHR
Job Description
Job Duties and Responsibilities
- Design and build agentic evaluation pipelines: error detection → root cause → hypothesis generation → prompt variant testing → A/B measurement → production promotion, with minimal human intervention.
- Own the accuracy measurement infrastructure: automate error analysis, data quality pipelines, and batch evaluation frameworks across document types and customer configurations.
- Build and evolve internal accuracy tooling from manual utilities into automated improvement platforms: classification and extraction correction loops, NTP rule generation, performance reporting.
- Take prototype methodologies and productionize them into reliable, scalable systems the team can operate independently.
- Build LLM-based extraction and classification pipelines using few-shot and RAG strategies for complex, real-world document types.
- Design and maintain A/B testing infrastructure for prompt and model changes — no untested changes go to production.
- Create live dashboards tracking extraction accuracy, NTP rates, and false positive rates across document types and customer configurations.
- Optimize LLM costs while maintaining quality: prompt compression, output token minimization, model selection and migration strategies.
- Write production-grade data pipelines with error handling, retries, logging, and monitoring.
- Collaborate with platform engineering and applied research functions on architecture and methodology translation.
- Mentor 1–2 junior engineers; build tooling and write documentation they can use independently.