Mainframe - SRE
FIS
Job Description
What You Will Be Doing
- Design and maintain monitoring and alerting solutions for infrastructure, application performance, and user experience.
- Implement automation tools and processes for routine tasks, scalable infrastructure, and seamless deployments.
- Ensure reliability, availability, and performance of applications and services, minimizing downtime and optimizing response times.
- Lead incident response, including identification, triage, resolution, and post-incident analysis.
- Conduct capacity planning, performance tuning, and resource optimization in collaboration with development and operations.
- Collaborate with security teams to implement best practices, perform vulnerability assessments, and ensure compliance.
- Manage deployment pipelines, release processes, and configuration management for consistent, reliable deployments.
- Identify and drive improvements in reliability, performance, and efficiency through data and root cause analysis.
- Create and maintain documentation, runbooks, and knowledge base articles, promoting knowledge sharing.
- Develop and test disaster recovery plans, backup strategies, and failover mechanisms.
- Collaborate with development, QA, DevOps, and product teams to align on reliability goals and incident response.
- Participate in on-call rotations, providing 24/7 support for critical incidents and coordinating resolution and follow-up.
What You Bring
- 4+ years of hands-on experience in software development roles using Unisys COBOL, SQL, Embedded SQL, COBOL, JCL, CL, DDS, DDL, BMS, CICS, JES, and HTML.
- 3+ years working on large-scale, client-facing, enterprise production software.
- Strong English communication and collaboration skills.
- Proficiency in modern development architectures (web, API), cloud platforms (AWS, Azure, Google Cloud), and infrastructure as code (Terraform, Ansible).
- Experience with monitoring and logging tools (Prometheus, Grafana, DataDog, New Relic, Splunk, SumoLogic, ELK Stack), including dashboards and alerts.
- Skilled in incident management (response, triage, RCA, post-mortem) and troubleshooting complex technical issues.
- Proficiency in scripting languages (Python, Bash) and automation tools.
- Experience with CI/CD pipelines (Jenkins, GitLab CI/CD, Azure DevOps).
- Familiarity with Application Performance Monitoring (APM) and Real User Monitoring (RUM) tools.
- Commitment to continuous learning, adaptability, and operational excellence.
- Proficiency in multiple programming languages: C#, Java, Python, JavaScript, Visual Basic, SQL+, and Oracle PL/SQL.
- Experience building FinTech, payment, or banking systems, including API design and third-party integration.
- Familiarity with Agile environments, especially with bi-monthly production releases.
- Knowledge of FIS products/services and the broader Financial Services Industry.
- Experience with development tools: V7.4, Eclipse, Visual Studio, Azure DevOps, MDCMS, Git, Microsoft Office (including Visio), RDi, X-Analysis, Hawkeye, and Checkmarx.
- Understanding of OS/400 and Windows 11 operating systems.
- Demonstrates judgment, flexibility, and a solutions-oriented mindset.
- Takes ownership of engineering and product outcomes.
- Action-oriented self-starter with strong execution skills.
- Excellent interpersonal, negotiation, and influencing skills.
- Penchant for excellence, intellectual curiosity, and continuous improvement.
- Quickly establishes credibility with colleagues and partners.
- Embodies and delivers the firm's values: Win as one team, Lead with integrity, Be the change.