CloudOps Engineer

tieto

Bangalore 5 Years Exp Posted 1h ago

Job Description

  • Be a champion for department initiatives and values by ensuring all actions promote the department’s mission statement
  • Work closely with the AWS Professional Services team on the AWS migration project to migrate the Client’s products from other hosting platforms into the AWS Cloud.
  • Work closely with the DevOps engineers to set up controlled environments for the Client’s products built on the AWS Cloud using Infrastructure as Code (IaC).
  • Participate in product release cycles by working closely with Engineering Managers, Architects, and Developers during the AWS migration.
  • Create and maintain standard operating procedures (SOPs) for performing maintenance tasks, applying configuration changes, and remediating problems in the environment.
  • Implement monitoring, alerting, notification, and metrics collection for:
    • Infrastructure and application performance
    • System uptime
    • Error rate
  • Monitor and continually improve the capacity and reliability of our production environment infrastructure.
  • Investigate and fix performance and scalability bottlenecks, proactively identify issues and create work items to improve stability and performance.
  • Identify single points of failure and other high-risk architecture issues, and propose resilient solutions that mitigate the risk and improve system reliability.
  • Identify opportunities for automation to reduce the operational workload; build scripts and introduce new tools and practices as needed.
  • Work with other Cloud Infrastructure engineers and developers to ensure maximum performance, reliability and automation of our deployments and infrastructure.
  • Communicate with stakeholders and handle deployment, maintenance, and support efficiently.
  • Ticket Handling and Support
    • Tickets should be handled with clear communication and with the correct stakeholders involved.
    • Tickets should be completed within the SLA; any delays or improperly raised tickets should be clearly communicated and documented.
    • Tickets should be closed with proper comments, including resolution steps and screenshots.
    • Repetitive tickets should be discussed in the standup call for brainstorming and, where necessary, eventually resolved through automation.

 

Skills Required:

  • 5+ years of experience with a public cloud provider such as Amazon Web Services (AWS), and with on-prem servers
  • Hands-on experience in migrating workloads from on-prem or other cloud hosting platforms into the AWS cloud.
  • Solid understanding of standard TCP/IP networking, Load Balancing and common protocols like DNS, HTTPS
  • Good knowledge of CI/CD tools such as Azure DevOps (ADO), GitHub Actions, Jenkins, etc.
  • Monitoring and Logging: Experience with any application monitoring and logging tools (e.g. Datadog, New Relic, AppDynamics, Application Insights, ELK, Prometheus).
  • Incident management experience (using tools such as PagerDuty)
  • Good understanding of web servers and databases
  • Good understanding of Docker and Kubernetes.
  • Good scripting knowledge and understanding of software life cycle models.
  • Good understanding of DevOps practices.
  • Should have worked on high-traffic, highly scalable systems in the past
  • Knowledge of the fundamental aspects of release automation (packaging, dependencies, promotion, deployment, compliance)
  • A passion for collecting, evaluating, and improving performance metrics.
  • Excellent time management, resource organization, and prioritization skills, and the ability to multi-task in a fast-paced environment
  • Ability to work quickly and efficiently with minimal supervision
  • Excellent written and verbal communication skills

 

Qualifications:

  • 5+ years of Systems Engineering experience in the following areas:
    • Cloud platforms (Azure, AWS) and On-Prem Servers
    • Windows and Linux Servers
    • Application Monitoring Tools (Datadog, New Relic, AppDynamics, Application Insights)
    • Log Aggregation Tools (Datadog, ELK, etc)
    • PowerShell, Bash, or Python scripting
    • CI/CD tools (Azure Pipelines, GitHub Actions, Jenkins, Octopus, etc.)
    • Infrastructure management tools (Terraform, Ansible, etc.)
    • Application Hosting (IIS,