TechOps-DE-CloudOps-Azure
EY
Job Description
Your key responsibilities
- Act as a senior escalation point for Azure infrastructure-related incidents, ensuring timely resolution and operational stability.
- Lead incident response and coordination for Azure infrastructure issues.
- Manage and troubleshoot Azure components including VMs, VNets, Load Balancers, and AKS.
- Conduct log analysis and diagnostics using Azure Monitor and OpenTelemetry.
- Own and execute SOPs and runbooks to manage infrastructure-related requests, issues, and remediation activities.
- Ensure proper access management, including IAM role validation, RBAC, and security configurations.
- Monitor and support containerized environments (AKS, Docker, Helm).
- Create, implement, and manage SSH and PGP/GPG encryption keys within the organization's security framework.
- Generate and manage keys via automation tools and store them securely in Azure Key Vault or Secrets Manager.
- Support key lifecycle operations across Linux environments (e.g., RHEL 7).
- Ensure integration of key management with access administration infrastructure.
- Contribute to the design and deployment of IAM capabilities aligned with enterprise security standards.
- Monitor industry trends and assess their impact on key management policies and governance.
- Provide advanced support for key-related issues across platforms and environments.
- Collaborate with engineering and product teams to identify recurring issues and drive SOP/process standardization.
- Provide mentorship and training for junior engineers.
- Participate in shift handovers and governance meetings to ensure knowledge transfer and continuity.
- Apply proficiency in scripting, identity administration tasks, and the vi editor.
Skills and attributes for success
- Demonstrated expertise in handling complex troubleshooting and escalation scenarios across Azure infrastructure and IAM key management domains.
- Proficiency in Azure infrastructure components: VMs, VNets, Load Balancers, AKS.
- Experience with Azure Monitor, OpenTelemetry, and other observability tools.
- Familiarity with scripting (Python, PowerShell, Bash) for automation and diagnostics.
- Strong understanding of Azure IAM, RBAC, and cloud security best practices.
- Experience with ITSM tools like ServiceNow for incident and change management.
- Ability to create and refine SOPs, runbooks, and technical documentation.
- Collaborative mindset with strong communication and mentoring skills.
- Deep understanding of cryptographic key management and IAM protocols.
To qualify for the role, you must have
- 5+ years of experience in Azure infrastructure operations, IAM, or cloud support.
- Hands-on experience with Azure services including VMs, VNets, AKS, and Azure Active Directory (AAD).
- Proven experience in managing cryptographic keys and IAM solutions.
- Experience in a 24x7 rotational support model.
- Undergraduate degree in a related field or equivalent combination of training and experience.
- Excellent problem-solving, documentation, and communication skills.