Applied AI Scientist, GenAI and ML Prototyping
dayforcehcm
Job Description
Tech Stack & Technical Requirements
Core Languages & Frameworks
- Proficiency in Python as the primary language for data science and ML development (Pandas, NumPy, Scikit-learn)
- Familiarity with SQL for data querying and manipulation across modern data warehouses (e.g., BigQuery, Snowflake, PostgreSQL)
- (Nice to have) Working knowledge of deep learning frameworks such as PyTorch or TensorFlow for model experimentation
LLM & Generative AI Tooling
- Hands-on experience working with large language model APIs, including providers such as OpenAI (GPT-4o), Anthropic (Claude), or Google (Gemini)
- Strong command of prompt engineering techniques, including few-shot prompting, chain-of-thought reasoning, and structured output design
- Experience with open-source LLMs (e.g., Mistral, LLaMA) and an understanding of when to apply open vs. proprietary models
Agentic Orchestration & RAG
- Practical experience building RAG (Retrieval-Augmented Generation) pipelines, including chunking strategies, embedding models, and retrieval tuning
- Familiarity with agentic orchestration frameworks such as LangChain, LangGraph, LlamaIndex, CrewAI, or AutoGen
- Experience integrating vector databases (e.g., pgvector, Pinecone, Weaviate, ChromaDB) into search and retrieval workflows
- Understanding of tool/function calling patterns for LLM-driven automation
Evaluation & Experimentation
- Ability to define and implement "good enough" metrics and evaluation frameworks for POC validation
- Experience with LLM evaluation libraries such as RAGAS, TruLens, or DeepEval
- Familiarity with experiment tracking tools such as MLflow or Weights & Biases
- Comfort with cost and latency profiling of LLM-based systems to inform feasibility decisions
Data & Infrastructure
- Comfortable working within cloud environments (AWS, GCP, or Azure) for data access, compute, and API integration
- Ability to integrate with REST APIs and third-party data sources during prototyping
- Proficiency with standard development tools: Git, Jupyter notebooks, VS Code
- Basic familiarity with Docker for packaging and sharing POC environments with engineering teams
Required Experience
- 4+ years of experience in data science, machine learning, or a closely related field, with a demonstrated track record of delivering end-to-end projects
- 2+ years of hands-on experience working with large language models or Generative AI solutions in a professional setting
- Proven experience taking projects from business problem discovery through to a working prototype or proof of concept
- Experience engaging directly with non-technical business stakeholders to gather requirements, set expectations, and communicate results clearly
- Strong background in traditional ML approaches (classification, regression, clustering, NLP) alongside modern LLM-based methods
Education
- Bachelor's degree in Computer Science, Statistics, Mathematics, Engineering, or a related quantitative field
- A Master's or PhD is a plus, though equivalent industry experience is equally valued
Soft Skills & Ways of Working
- Ability to translate complex technical outputs into clear business value; you are as comfortable in a boardroom as you are in a notebook
- Strong stakeholder management skills, including the ability to set realistic expectations around LLM capabilities, limitations, and cost trade-offs
- Excellent written communication skills fo