Site Reliability Engineer (SRE)
IBM
Job Description
- Manage and maintain Linux-based systems across multiple environments.
- Automate provisioning, configuration, and deployment tasks using tools such as Ansible and Jenkins.
- Design, implement, and manage deployments of containerized applications using Kubernetes and Docker.
- Monitor and troubleshoot system performance, network issues, and applications to ensure optimal uptime and efficiency.
- Harden servers from scratch using baseboard management controllers (BMCs).
- Implement and maintain security best practices, ensuring compliance with company policies.
- Proactively identify potential improvements to processes and systems.
- Analyze and fix network and DNS issues in the environment.
- Upgrade Kubernetes worker nodes and packages without disrupting workloads running on the cluster.
- Maintain benchmarking standards on systems to ensure continuous compliance.
- Participate in the on-call rotation to resolve critical infrastructure issues.