LEAD DATA SCIENTIST - Python
happiestminds
Job Description
Key Responsibilities
Model & Pipeline Development
Build and deploy multimodal ML models across:
- Natural Language Processing (NLP)
- Computer Vision (CV)
- OCR and document understanding
Develop robust pipelines for:
- Text processing, entity extraction, and classification
- Image tagging, moderation, and visual understanding
- Speech-to-text and speaker-level analysis
Implement Retrieval-Augmented Generation (RAG) pipelines with text and multimodal indexing
Optimization & Performance Engineering
Optimize model inference for latency, throughput, and cost efficiency across batch and near real-time workloads
Apply optimization techniques including:
- Batching and asynchronous inference
- Quantization, pruning, or distillation
- GPU and accelerator utilization tuning
Analyze and troubleshoot model performance in production environments
MLOps, LLMOps & Deployment
Build and maintain CI/CD pipelines for ML workloads using:
- GitHub Actions, Azure DevOps, or Jenkins
Deploy models as cloud-native microservices, leveraging:
- Docker, Kubernetes (AKS), and FastAPI
Use Azure Machine Learning for:
- Experiment tracking
- Model registry
- Training pipelines and deployment
Implement monitoring and observability for models and pipelines:
- Metrics, logging, alerts, and drift detection (e.g., Prometheus, Grafana)
Application & Platform Integration
Integrate AI capabilities into enterprise applications such as:
- Search and recommendation systems
- Knowledge, document, or content platforms
- Auto-tagging, summarization, transcription, and moderation workflows
Design and expose inference and retrieval APIs for downstream consumption
Collaborate with backend, data, and platform teams to ensure scalable and secure AI integrations
Collaboration & Mentorship
- Partner with product managers, data scientists, and engineers to translate business requirements into deployable AI solutions
- Review code, promote best practices, and mentor junior engineers
- Contribute to reusable components, documentation, and engineering standards