Data Scientist, I
zebra
Job Description
Responsibilities:
- Integrates state-of-the-art machine learning algorithms and develops new methods
- Develops tools to support analysis and visualization of large datasets
- Develops and codes software programs; implements industry-standard AutoML models (speech, computer vision, text data, LLM), statistical models, relevant ML models (device/machine-acquired data), and AI models and algorithms
- Identifies meaningful foresights based on predictive ML models from large data and metadata sources; interprets and communicates foresights, insights and findings from experiments to product managers, service managers, business partners and business managers
- Uses rapid development tools (business intelligence tools, graphics libraries, data modelling tools) to communicate research findings to relevant stakeholders through visual graphics, data models, machine learning model features, and feature engineering/transformations
- Analyzes, reviews and tracks trends and tools in the Data Science, Machine Learning, Artificial Intelligence and IoT space
- Interacts with cross-functional teams to identify questions and issues for data engineering and machine learning model feature engineering
- Evaluates data collection mechanisms and recommends changes to data capture that improve the efficacy of machine learning model predictions
- Meets with customers, partners, product managers and business leaders to present findings, predictions and foresights; gathers customer-specific requirements for business problems/processes; identifies data collection constraints and alternatives for implementation of models
- Working knowledge of MLOps, LLMs and Agentic AI/Workflows
- Programming Skills: Proficiency in Python and experience with ML frameworks like TensorFlow, PyTorch
- LLM Expertise: Hands-on experience in training, fine-tuning, and deploying LLMs
- Foundational Model Knowledge: Strong understanding of open-weight LLM architectures, including training methodologies, fine-tuning techniques, hyperparameter optimization, and model distillation.
- Data Pipeline Development: Strong understanding of data engineering concepts, feature engineering, and workflow automation using Airflow or Kubeflow.
- Cloud & MLOps: Experience deploying ML models in cloud environments like AWS, GCP (Google Vertex AI), or Azure using Docker and Kubernetes.
- Designs and implements predictive and optimisation models incorporating diverse data types
- Strong in Python, PySpark, SQL
Qualifications:
- Bachelor's, Master's or PhD degree in statistics, mathematics, computer science or related discipline preferred
- 0-2 years of experience
- Statistical modeling and algorithms
- Machine learning experience including deep learning and neural networks, genetic algorithms, etc.
- Working knowledge of big data technologies: Hadoop, Cassandra, Spark R. Hands-on experience preferred
- Data Mining
- Data Visualization and visualization analysis tools including R
- Work/project experience in the sensors, IoT or mobile industry highly preferred
- Excellent written and verbal communication
- Comfortable presenting to senior management and CxO-level executives
- Self-motivated self-starter with a strong work ethic