ML Engineer
Job Description
Data Engineering & Feature Development: Build and maintain robust ETL/ELT pipelines using Databricks and PySpark. Design and implement feature engineering workflows; leverage Databricks Feature Store for reusability.
Model Development & Experimentation: Develop ML models (regression, classification, clustering, time-series, NLP) using scikit-learn, XGBoost, LightGBM, PyTorch/TensorFlow. Conduct hyperparameter tuning and model selection using MLflow for experiment tracking and reproducibility.
MLOps & Production Deployment: Deploy models to production using Databricks Model Serving, AWS SageMaker (if applicable), or custom FastAPI/Flask endpoints. Implement CI/CD pipelines for automated testing, validation, and deployment (GitHub Actions, Jenkins, GitLab CI).
Monitoring & Maintenance: Set up model monitoring dashboards to track performance, data drift, and concept drift. Implement alerting systems for anomalies, errors, and SLA breaches.
Collaboration & Documentation: Partner with data scientists, analysts, product managers, and software engineers to translate business problems into ML solutions. Document architecture, workflows, model decisions, and trade-offs (accuracy vs. latency vs. cost).
Core Technical Skills:
Databricks: PySpark, Delta Lake, Databricks SQL, Feature Store, MLflow integration, cluster optimization.
MLflow: Experiment tracking, model registry, model versioning, deployment workflows.
SQL: Complex queries, window functions, CTEs, query optimization, database design.
Python: Expert-level proficiency in Pandas, NumPy, Polars, scikit-learn, XGBoost, LightGBM, PyTorch/TensorFlow.
Version Control: Git, GitHub/GitLab workflows, branching strategies, pull requests.
Qualifications
Required Qualifications:
Education: Bachelor's or Master's in Computer Science, Data Science, Statistics, Engineering, or a related field.
Experience: 3–5 years in ML engineering, data engineering, or software engineering with an ML focus.
Desired Qualifications:
Certifications: Databricks Certified Machine Learning Professional, AWS Certified Machine Learning – Specialty (optional).
Domain Knowledge: Experience in agriculture, life sciences, or related industries (for Syngenta context).
Advanced Skills: Distributed training, Kubernetes, real-time streaming ML, LLMs/Transformers (Hugging Face).