AWS-Cloud Platform Ops Engineer
EY
Job Description
Your key responsibilities
- AWS Cloud Platform Support and Administration
- Provide day-to-day support and administration for AWS cloud environments, ensuring high availability and performance of applications and services.
- Troubleshoot and resolve issues related to AWS-hosted applications and infrastructure, minimizing downtime and impact on users.
- Act as a technical escalation point for application developers and support teams encountering complex AWS-related problems.
- Monitor AWS services for health, performance, and security compliance.
- AWS Services Deployment and Management
- Design, build, and deploy scalable and secure AWS services based on business requirements and cloud best practices.
- Manage AWS infrastructure components including EC2 instances, S3 buckets, RDS databases, Lambda functions, VPCs, and networking configurations.
- Automate deployment and management processes using Infrastructure as Code (IaC) techniques to improve consistency and reduce manual effort.
- Infrastructure as Code and Automation
- Develop, maintain, and optimize AWS CloudFormation templates to provision and update cloud infrastructure reliably.
- Write and maintain AWS Lambda functions to automate operational tasks, integrate services, and enhance system capabilities.
- Collaborate with development, operations, and security teams to integrate automation tools within the CI/CD pipeline and overall AWS ecosystem.
- Incident and Maintenance Management
- Monitor and respond promptly to maintenance notifications and service advisories from AWS to prepare for and mitigate potential disruptions.
- Coordinate with AWS support and internal teams to address escalations quickly and effectively.
- Document and communicate operational processes, incidents, and resolutions to relevant stakeholders for continuous improvement.
- Problem Management
- Collaborate closely with application development teams to identify, diagnose, and resolve operational problems impacting the platform or applications.
- Act as a bridge between platform operations and development teams to ensure timely root cause analysis and implement permanent fixes.
- Collaboration and Continuous Improvement
- Work closely with application developers, DevOps, and IT teams to provide guidance, support, and best practices for AWS usage.
- Participate in cloud architecture reviews and recommend improvements to optimize cost, performance, and security.
- Stay updated with the latest AWS services, features, and industry trends to continuously enhance cloud platform capabilities.
Skills and attributes for success
- AWS Cloud Practitioner Certification (mandatory); AWS Solutions Architect Certification preferred.
- Minimum of 3 years of hands-on experience administering and supporting AWS cloud platforms; 5 years preferred.
- Strong experience with AWS services including EC2, S3, RDS, Lambda, CloudFormation, CloudWatch, and IAM.
- Proficiency in scripting and automation using AWS Lambda (Python, Node.js) and CloudFormation.
- Exposure to DevSecOps CI/CD pipeline implementation using GitHub Actions.
- Excellent problem-solving skills, with the ability to collaborate with application teams, handle escalations, and troubleshoot complex cloud issues.
- Experience in incident management, response to AWS maintenance notifications, and stakeholder communication.
- Proficiency in observability tools, SRE practices, and automation to enhance system reliability and reduce manual toil.
- Strong collaboration and communication skills to work effectively across teams.