Senior Site Reliability Engineer, Security
Okta
Job Description
You will work on:
- Building, running, and monitoring Okta's production infrastructure
- Evangelizing security best practices and leading initiatives and projects to strengthen the security posture of our critical infrastructure
- Responding to production incidents and determining how we can prevent them in the future
- Triaging and troubleshooting complex production issues to ensure reliability and performance
- Identifying and automating manual processes
- Continuously evolving our monitoring tools and platform
- Promoting and applying best practices for building scalable and reliable services across engineering
- Developing and maintaining technical documentation, runbooks, and procedures
- Supporting a 24x7 online environment as part of an on-call rotation
You are an ideal candidate if you:
- Are always willing to go the extra mile: see a problem, fix the problem.
- Have experience automating, securing, and running large-scale production IAM and containerized services in AWS (EC2, ECS, KMS, Kinesis, RDS), GCP (GKE, GCE) or other cloud providers.
- Have knowledge of CI/CD principles, Linux fundamentals, OS hardening, networking concepts, and IP protocols.
- Have an understanding of and familiarity with configuration management tools like Chef and Terraform.
- Have experience in operational tooling languages such as Ruby, Python, Go and shell, and use of source control.
- Have experience with industry-standard security tools such as Nessus, Qualys, OSQuery, and Splunk.
- Have experience with Public Key Infrastructure (PKI) and secrets management.
Bonus points for:
- Experience conducting threat assessments and assessing vulnerabilities in a high-availability setting.
- Understanding of MySQL, including replication and clustering strategies, and familiarity with data stores such as DynamoDB, Redis, and Elasticsearch.
Minimum Required Knowledge, Skills, Abilities, and Qualities:
- 3+ years of experience architecting and running complex AWS or other cloud networking infrastructure resources
- 3+ years of experience with Chef and Terraform
- Unflappable troubleshooting skills
- Strong Linux understanding and experience
- Security background and knowledge
- BS in computer science (or equivalent experience)