Senior Engineer, Cloud Infrastructure

Cvent

Gurugram 3 Years Exp Posted 23d ago

Job Description

Cloud Infrastructure Engineering (AWS)

  • Design, implement, and operate highly available, secure, and scalable AWS infrastructure (e.g., VPC, Transit Gateway, EC2, Load Balancing, S3, EBS/EFS/FSx, Route 53, IAM, KMS, Backup).
  • Build and maintain infrastructure-as-code using tools such as AWS CDK / CloudFormation, enforcing standards, guardrails, and reusable patterns.
  • Develop automation and tooling (primarily in Python/TypeScript) to remove repetitive operational work (provisioning, patching, configuration, cleanup, compliance checks, reporting).
  • Contribute to and sometimes lead design reviews, architecture discussions, and RFCs for new or evolving infrastructure services.
  • Partner with Security and Compliance to meet security, audit, and regulatory requirements across accounts and regions.

AI Agents, Orchestration & Multi‑Agent Systems

  • Identify high‑value Cloud Infra workflows (e.g., incident triage, change impact analysis, runbook execution, capacity/cost recommendations) that can be automated using AI agents.
  • Design and implement agentic workflows (single and multi‑agent) using modern AI orchestration patterns and frameworks (e.g., tool‑calling, planners, evaluators, guardrails).
  • Integrate agents with existing cloud APIs, observability tools, ticketing systems, and runbooks to provide end‑to‑end, human-in-the-loop automation.
  • Define and enforce safety, security, and approval guardrails for AI‑driven actions (RBAC, policy checks, dry‑runs, explicit approvals, audit logging).
  • Measure and communicate impact of AI automation (MTTR reduction, hours saved, error reduction, cost optimization, improved engineer experience).

Reliability, Operations & On‑Call

  • Own the reliability and performance of services you build – from design through deployment and production operations.
  • Implement and tune monitoring, logging, alerting, and SLO/SLA dashboards for Cloud Infra services (Datadog/Splunk/CloudWatch or similar).
  • Participate in the on‑call rotation, lead troubleshooting for complex AWS infrastructure incidents, and drive post‑incident reviews and preventative improvements.
  • Proactively identify technical debt and reliability risks in infrastructure and drive remediation plans.

Collaboration, Mentoring & Best Practices

  • Act as a technical mentor to Engineer I/II teammates on AWS fundamentals, automation patterns, and AI‑driven operations.
  • Help define and evolve "paved road" standards for AWS infrastructure, automation, and AI agent usage across Cloud Infrastructure.
  • Contribute to runbooks, design docs, knowledge base articles, and internal training sessions, including AI and automation best practices.


Here's What You Need:
 

Required Qualifications

  • 3–6 years of hands-on experience in Cloud / Infrastructure Engineering or similar roles, with a strong focus on AWS.
  • Deep understanding of core AWS services: VPC & networking (subnets, routing, TGW, VPN/Direct Connect, security groups, NACLs), EC2, Auto Scaling, Load Balancing, S3, EBS/EFS/FSx, Route 53, IAM, KMS, CloudWatch/CloudTrail, and Backup.
  • Strong experience with Infrastructure-as-Code (AWS CDK, CloudFormation, or Terraform) and Git-based workflows (branching, PR reviews, CI/CD).
  • Solid programming skills in at least one language commonly used for infrastructure automation, such as Python or TypeScript/Node.js.
  • Proven track record designing and operating production-grade, multi‑account/multi‑region AWS environments with a focus on security, reliability, and cost.
  • Experience implementing observability for infrastructure services (metrics, logs, traces, alerting, dashboards).
  • Demonstrated ability to own complex projects end-to-end: requirements, design, implementation, rollout, and post‑launch
