Senior Full Stack Engineer (AI)
Instahyre
Job Description
Responsibilities:
- AI Platform and Pipelines: Design and build production RAG systems, agentic workflows, and LLM orchestration layers. Pick the right models, chain them, ground them.
- Evals and Model Quality: Build rigorous evaluation frameworks. Define metrics that matter, run systematic benchmarks, and continuously improve model accuracy, latency, and cost.
- Scalable Infra: Architect backend services (APIs, queues, vector stores) that handle millions of legal documents without falling over. Own deployment, monitoring, and cost optimisation on cloud (AWS/GCP).
- Product Engineering: Ship user-facing features with React/Next.js. You care about what the attorney actually sees and uses, not just what the model outputs.
- Data and Embeddings: Work with massive structured and unstructured legal datasets. Build and maintain embedding pipelines, chunking strategies, and retrieval infrastructure.
- MLOps and Lifecycle: Own the full model lifecycle, from fine-tuning and deployment through A/B testing, monitoring, and rollback.
Requirements:
- 5+ years shipping production software; strong Python plus at least one of Go, Rust, or Kotlin.
- Hands-on LLM experience: you've built real applications against OpenAI, Anthropic, or open-source models (not just called an API once).
- Deep understanding of RAG architectures, prompt engineering, embeddings, and retrieval systems.
- Solid backend engineering: APIs, microservices, distributed systems, SQL/NoSQL databases.
- Cloud deployment experience (AWS/GCP/Azure): you've scaled services, not just deployed demos.
- Strong CS fundamentals: data structures, algorithms, system design.
- Frontend proficiency: React.js / Next.js or equivalent. You can build an entire AI application, not just a backend.
Bonus points:
- Experience with vector databases (Pinecone, Weaviate, FAISS, Qdrant).
- MLOps tooling: MLflow, Airflow, Kubeflow, or similar.
- Real-time data pipelines or streaming systems.