Senior Data Scientist
idfcfirst
Job Description
Key / Primary Responsibilities:
- Lead cross-functional teams in the design, development, and deployment of Generative AI solutions, with a strong focus on Large Language Models (LLMs).
- Architect, train, and fine-tune state-of-the-art LLMs (e.g., GPT, BERT, T5) for various business applications, ensuring alignment with project goals.
- Deploy and scale LLM-based solutions, integrating them seamlessly into production environments and optimizing for performance and efficiency.
- Develop and maintain machine learning workflows and pipelines for training, evaluating, and deploying Generative AI models, using Python or R, and leveraging libraries like Hugging Face Transformers, TensorFlow, and PyTorch.
- Collaborate with product, data, and engineering teams to define and refine use cases for LLM applications such as conversational agents, content generation, and semantic search.
- Design and implement fine-tuning strategies to adapt pre-trained models to domain-specific tasks, ensuring high relevance and accuracy.
- Evaluate and optimize LLM performance, addressing challenges such as prompt design, inference latency, and model bias.
- Manage and process large, unstructured datasets using SQL and NoSQL databases, ensuring smooth integration with AI models.
- Build and deploy AI-driven APIs and services, providing scalable access to LLM-based solutions.
- Use data visualization tools (e.g., Matplotlib, Seaborn, Tableau) to communicate AI model performance, insights, and results to non-technical stakeholders.
Secondary Responsibilities:
- Contribute to data analysis projects, with a strong emphasis on text analytics, natural language understanding, and Generative AI applications.
- Build, validate, and deploy predictive models specifically tailored to text data, including models for text generation, classification, and entity recognition.
- Handle large, unstructured text datasets, performing essential preprocessing and data cleaning steps, such as tokenization, lemmatization, and noise removal, for machine learning and NLP tasks.
- Apply current text-processing techniques to ensure high-quality input for training and fine-tuning LLMs.
- Collaborate with cross-functional teams to develop and deploy AI-powered solutions that process and analyze textual data at scale.