Development and Operations Engineer Lead
Trimble
Job Description
Key Responsibilities
-
Provision and maintain infrastructure in Azure/AWS cloud environments.
-
Maintain and improve the current CI/CD pipeline (GitHub workflows and Jenkins)
-
Strong Python knowledge.
-
Proficiency in using Terraform for infrastructure as code to manage AWS/Azure resources
-
Shell/PowerShell scripting
-
Application deployment automation using Ansible, Terraform, or similar tools
-
Review and maintain IaC repositories in Bitbucket
-
Remediate Azure/AWS non-compliance with security controls and best practices wherever possible
-
Strong emphasis on DevOps as an engineering discipline with a focus on automation
-
Handle escalations from internal stakeholders and manage critical issues to resolution
-
Manage and provide guidance to a high-performing global team of Site Reliability Engineers.
-
Teach others how to adopt reliability engineering practices such as error budgets, blameless retrospectives, and chaos engineering.
-
Identify problems and opportunities for improvements that are common across many teams and services.
-
Develop services to handle automatic recovery from incidents and disasters.
-
Participate in troubleshooting, capacity analysis and planning, and performance analysis
-
Design cost controls and roll out the cost optimization strategy
-
Respond to incidents while on call and resolve them quickly and effectively
-
Fix compliance issues and address requirements raised by SecOps tools
Required Skills and Experience
-
Minimum 7 years of experience in technical and people management.
-
History of supporting applications and infrastructure in production
-
Experience in capacity planning and cost optimization
-
Deep understanding of Linux/Unix operating systems
-
Experience using a high-level scripting language (Python preferred) and IaC tools (Terraform, CloudFormation)
-
Infrastructure as code (IaC) and system administration skills
-
Software development and continuous integration skills
-
Experience with Azure/AWS cloud services
-
Ability to troubleshoot and resolve infrastructure issues
-
Excellent problem-solving and analytical skills
-
Bachelor's degree in Computer Science, Engineering, or related field
Desirable Skills and Experience
-
Azure/AWS certification (or equivalent in another public cloud)
-
Experience with microservice architecture
-
Above-average skills in Python or another high-level programming language
-
Experience with SaaS monitoring toolsets (Datadog, Sumo Logic, PagerDuty, ELK, Grafana)
-
Experience with Atlassian tools (Bitbucket, Jira, and Confluence) and GitHub
-
Experience using SQL and NoSQL databases
-
Experience with Jenkins/Bamboo for CI/CD
-
Experience with Kubernetes is an added advantage
-
Extensive experience with Azure App Service, Azure Functions, and other Azure services.
-
Proven experience with Azure Front Door and designing multi-region architectures for high availability and disaster recovery.
-
Strong knowledge of GitHub workflows for CI/CD pipeline implementation and management.