Staff Machine Learning Engineer

wbd

Bangalore 9 Years Exp Posted 248d ago

You will be part of a team focused on model retraining, model hosting, cost optimization, and managing production workflows at scale.

Build and maintain pipelines for model fine-tuning and retraining, including LoRA-based workflows
Integrate and maintain vector search services and semantic similarity infrastructure
Design scalable model serving solutions for open-source and foundation models
Develop systems for experiment tracking, model versioning, and evaluation
Monitor production models for drift and performance degradation
Manage compute costs and optimize resource usage across distributed training jobs
Integrate Human-in-the-Loop (HITL) workflows and offline labeling into training pipelines
Support model deployment for varied model architectures, including vision-language models, convolutional neural networks, and embedding generation models
Stand up and maintain feature store and data versioning infrastructure
Architect and implement RAG pipelines for video metadata, summarization, and Q&A
Build evaluation frameworks to assess LLM performance, hallucination frequency, and structured response accuracy

What to Bring:

9+ years of experience in machine learning engineering, with end-to-end ML workflow expertise
Strong background in model retraining, fine-tuning, and evaluation techniques
Experience deploying and managing open-source model servers (e.g., Triton, TorchServe, Ray Serve)
Proficiency in managing cost-effective distributed computing environments (e.g., Kubernetes, Ray, SageMaker)
Familiarity with experiment tracking tools (e.g., MLflow, Weights & Biases) and model versioning strategies
Deep understanding of ML domains including NLP, RecSys, and reinforcement learning
Experience with real-time inference systems and streaming data pipelines is a plus
Familiarity with labeling tools, HITL workflows, and offline data curation strategies
Comfort working in Agile development environments and collaborating across global teams