Principal Cloud Engineer - Data & AI
Equinix
Job Description
Responsibilities
Cloud Architecture & Engineering
- Deep expertise in designing, implementing, and managing architectures across multiple cloud platforms (e.g., AWS, Azure, GCP)
- Proven experience in architecting hybrid and multi-cloud solutions, including interconnectivity, security, workload placement, and DR strategies
- Strong knowledge of cloud-native services (e.g., serverless, containers, managed databases, storage, networking)
- Experience with enterprise-grade IAM, security controls, and compliance frameworks across cloud environments
AI & GenAI Platform Integration
- Integrate LLM APIs (OpenAI, Gemini, Claude, etc.) into platform workflows for intelligent automation and enhanced user experience
- Build and orchestrate multi-agent systems using frameworks like CrewAI, LangGraph, or AutoGen for use cases such as pipeline debugging, code generation, and MLOps
- Experience developing and integrating GenAI applications using MCP and orchestrating LLM-powered workflows (e.g., summarization, document Q&A, chatbot assistants, and intelligent data exploration)
- Hands-on expertise building and optimizing vector search and RAG pipelines using tools like Weaviate, Pinecone, or FAISS to support embedding-based retrieval and real-time semantic search across structured and unstructured datasets
Engineering Enablement
- Create extensible CLIs, SDKs, and blueprints to simplify onboarding, accelerate development, and standardize best practices
- Streamline onboarding, documentation, and platform implementation and support using GenAI and conversational interfaces
- Collaborate across teams to enforce cost, reliability, and security standards within platform blueprints
- Work with engineering teams to introduce platform enhancements, observability, and cost optimization techniques
- Foster a culture of ownership, continuous learning, and innovation
Automation, IaC, CI/CD
- Mastery of Infrastructure as Code (IaC) tools, especially Terraform, Terragrunt, and CloudFormation / ARM / Deployment Manager
- Experience building and managing cloud automation frameworks (e.g., using Python, Go, or Bash for orchestration and tooling)
- Hands-on experience with CI/CD pipelines (e.g., GitHub Actions) for cloud resource deployments
- Expertise in implementing policy-as-code and compliance-as-code (e.g., Open Policy Agent, Sentinel)
Security, Governance & Cost
- Strong background in implementing cloud security best practices (network segmentation, encryption, secrets management, key management, etc.)
- Experience with multi-account / multi-subscription / multi-project governance models, including landing zones, service control policies, and organizational structures
- Ability to design for cost optimization, tagging strategies, and usage monitoring across cloud providers
Monitoring & Operations
- Familiarity with cloud monitoring, logging, and observability tools (e.g., CloudWatch, Azure Monitor, GCP Operations Suite, Datadog, Prometheus)
- Experience with incident management and building self-healing cloud architectures
Platform & Cloud Engineering
- Develop and maintain real-time and batch data pipelines using tools like Airflow, dbt, Dataform, and Dataflow/Spark
- Design and develop event-driven architectures using Apache Kafka, Google Pub/Sub, or equivalent messaging systems
- Build and expose high-performance data APIs and microservices to support downstream applications, ML workflows, and GenAI agents
- Architect and manage multi-cloud and hybrid cloud platforms (e.g., GCP, AWS, Azure) optimized for AI, ML, and real-time data processing workloads
- Build reusable frameworks and infrastructure-as-code (IaC) using Terraform, Kubernetes, and CI/CD to drive self-service and automation
- Ensure platform scalability, resilience, and cost efficiency through modern practices like GitOps, observability, and chaos engineering
Leadership & Collaboration
- Experience leading cloud architecture reviews, defining standards, and mentoring engineering te