Problem Management (Linux Engineer with Cloud)
SAP
Job Description
- Linux Expertise:
  - Design, implement, and maintain complex Linux-based systems.
  - Troubleshoot and resolve critical system issues with a focus on rapid recovery.
  - Optimize system performance and ensure high availability.
  - Implement and maintain security best practices, including OS hardening and compliance.
  - Automate routine tasks through scripting and configuration management tools.
- Cloud Administration:
  - Manage and maintain cloud-based Linux infrastructure (AWS, Azure, GCP).
  - Deploy and configure cloud resources like virtual machines, storage, and networking components.
  - Optimize cloud costs and implement resource scaling strategies.
  - Ensure the security and compliance of cloud-based systems.
- Root Cause Analysis:
  - Conduct thorough Root Cause Analysis (RCA) to identify the underlying causes of system failures or performance issues.
  - Collaborate with cross-functional teams to gather data, replicate issues, and implement permanent solutions.
  - Create comprehensive RCA reports, system documentation, and knowledge base articles to prevent future occurrences.
- Mentorship and Collaboration:
  - Provide guidance and support to junior team members.
  - Actively participate in knowledge sharing and contribute to team documentation.
  - Stay current with industry trends, technologies, and best practices.
What you'll bring
- Experience: 10+ years of professional experience in Linux system administration with a demonstrated ability to perform Root Cause Analysis.