QA Engineer
binary
Job Description
What you'll own
Test infrastructure
- Build and maintain automated test suites across our Python backend services
- Integrate tests into CI as mandatory PR gates
- Own the test pyramid: unit tests for service logic, integration tests against real infrastructure, and E2E tests as a release gate before production promotion
API & backend testing
- Test the core REST API across auth, org and workspace management, data sources, query/retrieval, inference, jobs, and billing surfaces
- Test async workflows: background ingestion jobs, status polling, lifecycle edge cases
- Test RBAC: role hierarchy, org and workspace membership, permission boundaries
AI / retrieval quality testing
- Build an evaluation harness for retrieval quality. The platform's core value is citation-accurate answers over indexed repos and datasheets, which calls for repeatable benchmarks, not manual spot-checking
- Test ingestion output correctness across the artifact formats produced by the pipeline
- Regression-test LLM routing: model selection, per-user budget enforcement, token tracking
Frontend & client testing
- Test the web UI: auth flows, workspace and data source management, query and chat surfaces
- Test the IDE extension and CLI agent: key user flows, error handling, session persistence
- Cover a cross-browser baseline and basic accessibility
Multi-environment & release gates
- Own the preprod QA gate: validate each release candidate before it promotes to production
- Regression-test the on-prem enterprise deployment on each release cycle
- Maintain environment-specific test configs across dev, preprod, prod, and on-prem targets
Required experience
- 3+ years in a QA / SDET role, including time at an early-stage company where you built test infrastructure rather than inherited it
- Python testing: comfortable with a framework like pytest; can read and reason about service code
- REST API testing: auth flows (JWT, OAuth2), async job polling, edge case coverage
- CI integration: writing tests that run reliably in automated pipelines, managing test environments and secrets
- Clear thinking about test strategy: what to automate vs. not, where to draw the unit/integration/E2E boundary
Strong nice-to-haves
- Experience evaluating AI/LLM output quality: retrieval benchmarks, answer correctness scoring, regression detection for non-deterministic systems
- Frontend and E2E testing: browser automation tools (Playwright, Cypress, or similar), component testing
- IDE extension or CLI tool testing
- Database and async queue debugging: tracing a failing job through the system end-to-end
- Multi-environment testing discipline across dev, preprod, prod, and on-prem targets
- Familiarity with embedded systems or safety-critical software domains. Our customers work in those contexts and their quality bar is high