AI Platform Administrator
arm
Job Description
Responsibilities:
- Administer and support AI platforms, including Databricks Mosaic AI, cloud AI services, and COTS AI products.
- Run infrastructure provisioning, scaling, and monitoring for AI workloads.
- Implement and maintain CI/CD pipelines for AI applications using Azure DevOps or equivalent.
- Ensure platform security complies with ARM and industry standards.
- Support AI Developers and ML Engineers with environment setup, configuration, and troubleshooting.
- Run API gateways, MCP servers, and integrations for custom AI solutions.
- Automate operational tasks (monitoring, logging).
- Maintain platform documentation, runbooks, and incident response processes.
- Collaborate with cloud, data, and security teams to align AI platform operations with enterprise standards.
Required Skills and Experience:
- Demonstrable experience (2 to 4 years) in system administration and/or DevOps engineering.
- Hands-on experience with Azure DevOps (repos, pipelines, releases).
- Familiarity with cloud platforms (Azure, AWS, GCP) and their AI/ML services.
- Experience working with Databricks or similar data/AI platforms.
- Proficiency with infrastructure as code (Terraform, Bicep, Azure Resource Manager, or CloudFormation).
- Experience with monitoring and observability tools (Dynatrace or cloud-native equivalents).
- Familiarity with identity and access management (IAM) and platform security practices.
- Solid understanding of APIs, integrations, and service administration.
“Nice To Have” Skills and Experience:
- Exposure to MCP servers and multi-agent orchestration frameworks.
- Experience with logging/alerting in AI contexts (e.g., ML observability tools).
- Familiarity with cost optimization for AI/ML workloads in the cloud.
- Knowledge of Microsoft 365.
- Knowledge of COTS AI systems (e.g., OpenAI Enterprise, Anthropic, Microsoft Copilot, or equivalent).
- Background in supporting hybrid platforms (mix of vendor solutions and custom deployments).