Applied AI Scientist, GenAI and ML Prototyping

dayforcehcm

Noida, UP, IN | 4 Years Exp | Posted 43d ago

Tech Stack & Technical Requirements

Proficiency in Python as the primary language for data science and ML development (Pandas, NumPy, Scikit-learn)
Familiarity with SQL for data querying and manipulation across modern data warehouses (e.g., BigQuery, Snowflake, PostgreSQL)
(Nice to have) Working knowledge of deep learning frameworks such as PyTorch or TensorFlow for model experimentation

Hands-on experience working with large language model APIs, including providers such as OpenAI (GPT-4o), Anthropic (Claude), or Google (Gemini)
Strong command of prompt engineering techniques, including few-shot prompting, chain-of-thought reasoning, and structured output design
Experience with open-source LLMs (e.g., Mistral, LLaMA) and an understanding of when to apply open vs. proprietary models

Practical experience building RAG (Retrieval-Augmented Generation) pipelines, including chunking strategies, embedding models, and retrieval tuning
Familiarity with agentic orchestration frameworks such as LangChain, LangGraph, LlamaIndex, CrewAI, or AutoGen
Experience integrating vector databases (e.g., pgvector, Pinecone, Weaviate, ChromaDB) into search and retrieval workflows
Understanding of tool/function calling patterns for LLM-driven automation

Ability to define and implement "good enough" metrics and evaluation frameworks for POC validation
Experience with LLM evaluation libraries such as RAGAS, TruLens, or DeepEval
Familiarity with experiment tracking tools such as MLflow or Weights & Biases
Comfort with cost and latency profiling of LLM-based systems to inform feasibility decisions

Comfortable working within cloud environments (AWS, GCP, or Azure) for data access, compute, and API integration
Ability to integrate with REST APIs and third-party data sources during prototyping
Proficiency with standard development tools: Git, Jupyter notebooks, VS Code
Basic familiarity with Docker for packaging and sharing POC environments with engineering teams

4+ years of experience in data science, machine learning, or a closely related field, with a demonstrated track record of delivering end-to-end projects
2+ years of hands-on experience working with large language models or Generative AI solutions in a professional setting
Proven experience taking projects from business problem discovery through to a working prototype or proof of concept
Experience engaging directly with non-technical business stakeholders to gather requirements, set expectations, and communicate results clearly
Strong background in traditional ML approaches (classification, regression, clustering, NLP) alongside modern LLM-based methods

Bachelor's degree in Computer Science, Statistics, Mathematics, Engineering, or a related quantitative field
A Master's or PhD is a plus, though equivalent industry experience is equally valued

Ability to translate complex technical outputs into clear business value — you are as comfortable in a boardroom as you are in a notebook
Strong stakeholder management skills, including the ability to set realistic expectations around LLM capabilities, limitations, and cost trade-offs
Excellent written communication skills fo