Senior Data Scientist

idfcfirst

Mumbai, India | 5 Years Exp | Posted 42d ago

Job Description

Key / Primary Responsibilities:

  • Lead cross-functional teams in the design, development, and deployment of Generative AI solutions, with a strong focus on Large Language Models (LLMs).
  • Architect, train, and fine-tune state-of-the-art LLMs (e.g., GPT, BERT, T5) for various business applications, ensuring alignment with project goals.
  • Deploy and scale LLM-based solutions, integrating them seamlessly into production environments and optimizing for performance and efficiency.
  • Develop and maintain machine learning workflows and pipelines for training, evaluating, and deploying Generative AI models, using Python or R, and leveraging libraries like Hugging Face Transformers, TensorFlow, and PyTorch.
  • Collaborate with product, data, and engineering teams to define and refine use cases for LLM applications such as conversational agents, content generation, and semantic search.
  • Design and implement fine-tuning strategies to adapt pre-trained models to domain-specific tasks, ensuring high relevance and accuracy.
  • Evaluate and optimize LLM performance, addressing challenges such as prompt design, inference latency, and model bias.
  • Manage and process large, unstructured datasets using SQL and NoSQL databases, ensuring smooth integration with AI models.
  • Build and deploy AI-driven APIs and services, providing scalable access to LLM-based solutions.
  • Use data visualization tools (e.g., Matplotlib, Seaborn, Tableau) to communicate AI model performance, insights, and results to non-technical stakeholders.

Secondary Responsibilities: 

  • Contribute to data analysis projects, with a strong emphasis on text analytics, natural language understanding, and Generative AI applications.
  • Build, validate, and deploy predictive models specifically tailored to text data, including models for text generation, classification, and entity recognition.
  • Handle large, unstructured text datasets, performing essential preprocessing and data cleaning steps, such as tokenization, lemmatization, and noise removal, for machine learning and NLP tasks.
  • Apply current text-processing techniques to ensure high-quality input for training and fine-tuning Large Language Models (LLMs).
  • Collaborate with cross-functional teams to develop and deploy scalable AI-powered solutions that process and analyze textual data at scale.
