Staff DevOps Engineer
Procore
Job Description
What you’ll do:
- Foster a strong DevOps culture emphasizing collaboration, automation, and continuous improvement. Create an environment that supports continuous learning and knowledge sharing within the team
- Provide strategic technical leadership to identify opportunities, remove obstacles, and mentor team members. Ensure commitments are clear, reasonable, and proactively monitored
- Lead the design, implementation, and maintenance of self-service cloud infrastructure to enhance development efficiency and streamline operations
- Drive continuous improvements in cloud practices, including automation, monitoring, and incident management
- Stay updated with the latest cloud technologies and industry trends
- Collaborate with cross-functional teams to design, create, and review software application architectures for streaming use cases, ensuring fault tolerance, scalability, and low-latency processing
- Optimize streaming application performance by fine-tuning configurations, monitoring resource utilization, and identifying bottlenecks. Implement best practices for data serialization, compression, and network communication
- Evaluate and recommend tools and frameworks to enhance the performance and reliability of our streaming systems
- Partner with engineering teams to ensure solutions meet their needs. Advocate for DevOps processes and tools across all environments
- Provide guidance and mentorship to junior team members
- Collaborate cross-functionally to identify opportunities and provide support as needed
- Oversee cloud budgets and optimize expenditures to ensure cost-effective operations
What we're looking for:
- Bachelor's Degree in Computer Science or equivalent experience
- Professional experience with Java, Spring Boot, React, and/or Ruby
- 8+ years of hands-on experience in software engineering fundamentals, DevOps and cloud architecture, test-driven development, and design principles
- Extensive experience with Kubernetes, including cluster setup, management, and troubleshooting (EKS, AKS, kOps, OpenShift).
- Demonstrated ability to design and implement scalable cloud solutions.
- Deep understanding of cloud security principles and best practices.
- Proficiency with infrastructure as code (IaC) tools such as Terraform, CloudFormation, or Pulumi.
- Strong scripting and automation skills using languages such as Python or Go.
- Familiarity with CI/CD tools such as Argo CD, GitHub Actions, and CircleCI (CircleCI preferred).
- Experience with SQL and NoSQL databases.
- Deep understanding of and commitment to software engineering principles and processes (e.g., Lean, Agile, DevOps) and continuous improvement through measurement
- Hands-on experience with observability standards and tools (New Relic, Datadog, OpenTelemetry, Prometheus/Grafana preferred)
- Polyglot experience with other SRE tools – we integrate with more tools every day
- Experience in building modern Continuous Integration and Continuous Delivery systems at scale