AI/ML Validation Engineer
quality
Job Description
AI Validation & Testing
Chatbot & RAG Testing • LLM/Prompt Evaluation • Python Validation
- AI Solution Testing: Validates AI-powered solutions including chatbots and generative AI features for accuracy, reliability, and edge cases
- RAG Validation: Tests Retrieval-Augmented Generation pipelines for retrieval relevance, grounding, and hallucination handling
- Agentic AI: Builds and tests agentic solutions, including MCP (Model Context Protocol) integrations and prompt-driven workflows
Test Data & AI Evaluation
Data Quality • AI Eval Metrics • Responsible AI
- Test Data Management: Curates and versions golden/evaluation datasets and validates data pipelines for integrity, completeness, and drift across training and inference
- AI Evaluation: Hands-on with evaluation frameworks and metrics such as RAGAS, DeepEval, and LLM-as-judge, covering groundedness, relevance, and precision/recall
- Responsible AI & Privacy: Performs bias and fairness checks, safety/guardrail testing, and PII/data-leakage validation
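As an illustration of the precision/recall metrics mentioned above, the following is a minimal sketch of scoring the top-k results of a retrieval step against a labelled relevant set. The function name `precision_recall_at_k` and the document IDs are hypothetical and not taken from RAGAS, DeepEval, or any other framework; the sketch assumes each document ID appears at most once in the retrieved list.

```python
from typing import Sequence, Set, Tuple


def precision_recall_at_k(
    retrieved: Sequence[str], relevant: Set[str], k: int
) -> Tuple[float, float]:
    """Score the top-k retrieved document IDs against a labelled relevant set.

    Hypothetical helper for illustration; assumes IDs in `retrieved` are unique.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = list(retrieved[:k])
    # A hit is a retrieved ID that the golden dataset marks as relevant.
    hits = len(set(top_k) & relevant)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up example data: ranked retriever output and golden labels.
retrieved_ids = ["doc3", "doc7", "doc1", "doc9"]
relevant_ids = {"doc1", "doc3", "doc5"}

precision, recall = precision_recall_at_k(retrieved_ids, relevant_ids, k=3)
print(f"precision@3={precision:.2f} recall@3={recall:.2f}")
```

In practice a check like this would typically run over every query in a versioned golden dataset, with thresholds agreed per use case.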
Languages & Frameworks
Python • Pytest
- Core Languages: Proficient in Python for automation
- Test Frameworks: Hands-on with Pytest or an equivalent framework
- Framework Design: Builds, scales, and maintains automation frameworks end-to-end
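To give a sense of the Python and Pytest work described above, here is a minimal sketch of a golden-dataset check written as a Pytest-discoverable test function. The `answer_question` function is a hypothetical stand-in for the chatbot under test, and the cases in `GOLDEN_CASES` are invented for illustration.

```python
# Invented golden cases: each pairs an input with text the reply must contain.
GOLDEN_CASES = [
    {"question": "What is the refund window?", "must_contain": "30 days"},
    {"question": "   ", "must_contain": "please provide a question"},  # edge case
]


def answer_question(question: str) -> str:
    """Hypothetical stand-in for the system under test (e.g. a chatbot client)."""
    if not question.strip():
        return "Please provide a question."
    return "Refunds are accepted within 30 days of purchase."


def check_case(case: dict) -> bool:
    """Return True when the response contains the expected text (case-insensitive)."""
    response = answer_question(case["question"])
    return case["must_contain"].lower() in response.lower()


def test_golden_cases() -> None:
    """Pytest collects this function by its `test_` prefix."""
    failures = [case["question"] for case in GOLDEN_CASES if not check_case(case)]
    assert not failures, f"Golden cases failed for questions: {failures}"
```

Substring matching is only a starting point; for free-form LLM output, checks of this kind are usually combined with semantic or LLM-as-judge scoring.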
Secondary Skills
UI & API Testing
UI Automation • REST/GraphQL APIs • Contract Testing
- UI Automation: End-to-end browser testing across modern web applications; well versed in BDD frameworks such as Cucumber and SpecFlow
- API Testing: Validates REST/GraphQL APIs across microservices for correctness and edge cases
- Cross-layer Coverage: Bridges UI and API layers for integrated test coverage
- Tools: Experience with tools such as Postman, Insomnia, and Newman
Cloud, Infrastructure & CI/CD
AWS • SQL • CI/CD Pipelines
- Cloud Exposure: Working knowledge of AWS/Azure services for test environments
- SQL Proficiency: Uses SQL to query, validate, and debug data confidently
- CI/CD Integration: Integrates test suites into Jenkins, GitHub Actions, GitLab CI, etc.
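As an example of the SQL-based data validation described above, the sketch below runs two integrity checks (missing values and duplicate IDs) against an in-memory SQLite database. The table `eval_samples`, its columns, and the sample rows are hypothetical; a real suite would point the same queries at the project's own data store.

```python
import sqlite3
from typing import List

# In-memory database with a small, invented evaluation table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE eval_samples (id INTEGER, question TEXT, expected_answer TEXT)"
)
conn.executemany(
    "INSERT INTO eval_samples VALUES (?, ?, ?)",
    [
        (1, "What is the refund window?", "30 days"),
        (2, "How do I reset my password?", None),   # missing expected answer
        (2, "How do I change my password?", "Use the reset link"),  # duplicate id
    ],
)


def count_null_answers(connection: sqlite3.Connection) -> int:
    """Completeness check: rows with no expected answer."""
    query = "SELECT COUNT(*) FROM eval_samples WHERE expected_answer IS NULL"
    return connection.execute(query).fetchone()[0]


def duplicate_ids(connection: sqlite3.Connection) -> List[int]:
    """Integrity check: ids that appear more than once."""
    query = (
        "SELECT id FROM eval_samples "
        "GROUP BY id HAVING COUNT(*) > 1 ORDER BY id"
    )
    return [row[0] for row in connection.execute(query).fetchall()]


null_count = count_null_answers(conn)
dupes = duplicate_ids(conn)
print(f"rows missing expected_answer: {null_count}, duplicate ids: {dupes}")
```

Checks like these can be wrapped in test functions and run as a pipeline stage so that data issues fail the build early.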