Principal Machine Learning Engineer
Equinix
Job Description
- Design, develop, and deploy machine learning and Large Language Model (LLM)–based solutions for production use cases
- Collaborate with Generative AI Center of Excellence leaders and business stakeholders to evaluate buy vs. build decisions for generative AI applications
- Build and integrate agent-based workflows using platforms such as Google Agentspace, Microsoft Copilot, and Salesforce Agentforce
- Develop end-to-end ML pipelines, covering data ingestion, feature engineering, model training, evaluation, deployment, and monitoring
- Architect and implement LLM-powered systems that integrate agents and services across multiple cloud platforms into a unified solution
- Optimize ML workflows for performance, scalability, reliability, and cost efficiency in cloud environments (GCP, Azure, AWS)
- Implement and maintain MLOps best practices, including CI/CD, model versioning, experiment tracking, and automated retraining
- Work extensively with deep learning frameworks such as PyTorch and TensorFlow
- Containerize ML services and deploy them using Docker, Kubernetes, App Engine, or virtual machines
- Apply strong knowledge of NLP fundamentals, including transformers, attention mechanisms, embeddings, and text preprocessing
- Deploy and manage models in production, conduct A/B testing, and measure performance improvements using statistical methods
- Develop features, run experiments, analyze results, and translate insights into actionable improvements
- (Good to have) Build and deploy classical ML models (regression, classification, clustering), NLP applications (sentiment analysis, summarization, Q&A, chatbots, information retrieval), and computer vision solutions (image classification, object detection, and segmentation, using models such as YOLOv7, DDRNet, RFTM with datasets like COCO and Cityscapes)
Qualifications
- PhD with 5+ years, Master’s with 6+ years, or Bachelor’s with 7+ years of experience in Machine Learning, Computer Science, Data Science, or a related field
- Strong proficiency in Python for machine learning and production systems
- Solid understanding of software engineering fundamentals, system design, and design patterns
- Hands-on experience with at least one major cloud platform (GCP, Azure, or AWS)
- Experience building and deploying production-grade ML systems
- Strong communication skills with the ability to explain technical concepts and results to both technical and non-technical stakeholders
- Excellent time management, collaboration, and organizational skills