DevOps Administrator
Job Description
What You'll Do
- Administer and maintain cloud infrastructure on Azure (VMs, Container Apps, AKS, Storage, Networking, Key Vault)
- Own CI/CD pipelines (GitHub Actions, Azure DevOps) — build, test, deploy, rollback
- Manage Docker image builds, registries (ACR), and platform-targeted deployments (e.g. linux/amd64)
- Configure and harden NGINX, SSL/TLS certs (Let's Encrypt / Azure-managed), reverse proxies, and load balancers
- Monitor systems with Prometheus, Grafana, Azure Monitor, and Log Analytics; set up alerts and maintain on-call runbooks
- Automate provisioning with Terraform / Bicep / Ansible
- Manage secrets (Azure Key Vault), environment variables, and config across dev/UAT/prod
- Administer Linux servers (Ubuntu) — patching, user access, firewall (ufw/NSG), Tailscale/VPN
- Handle database ops (PostgreSQL, Supabase) — backups, restores, migrations, performance tuning
- Respond to incidents, perform RCA, and drive post-mortems
- Enforce security best practices — IAM, RBAC, least privilege, vulnerability scanning, dependency audits
- Support developers with deployment issues, env setup, and tooling
What We're Looking For
- 4+ years as a DevOps / SRE / Cloud / Systems Administrator
- Strong Azure experience (VMs, Container Apps, AKS, Networking, Key Vault, Monitor)
- Proficient with Docker, container registries, and image lifecycle management
- Hands-on with CI/CD (GitHub Actions or Azure DevOps Pipelines)
- Solid Linux administration (Ubuntu preferred) and shell scripting (Bash)
- Experience with NGINX, SSL/TLS certificate management, and DNS
- Infrastructure-as-Code experience with Terraform or Bicep
- Scripting in Python or Go for automation
- Working knowledge of PostgreSQL administration
- Git fluency and multi-account/multi-provider credential handling
- Strong troubleshooting skills under production pressure
Nice to Have
- Kubernetes (AKS) operational experience
- Experience with Supabase, FastAPI, or Next.js application stacks
- Familiarity with Tailscale or zero-trust networking
- Certifications: AZ-104, AZ-400, CKA
- Exposure to LLM/AI application deployments (Anthropic, OpenAI API infra)
- Experience running cost optimization exercises on Azure
What Success Looks Like (First 90 Days)
- Own and document the deployment pipeline for all active production services
- Achieve zero-downtime deploys across UAT and prod environments
- Reduce deployment failures by establishing pre-flight checks (disk, env vars, platform flags)
- Stand up unified monitoring and alerting across all VMs and Container Apps
- Create runbooks for the top 10 recurring operational issues