Lead Data Engineer
sproutsai
Job Description
We're looking for a seasoned Lead Data Engineer to own and drive our cloud-native
data platform development end-to-end. This is a high-impact, hands-on leadership role
where you'll architect scalable data and database systems, ship production-grade
pipelines, and guide a growing team — all while keeping a sharp eye on business
outcomes.
You'll tackle engineering challenges across distributed systems, large-scale databases,
and multi-cloud data infrastructure. If you thrive at the intersection of deep systems-level
work and cross-functional collaboration, this role is for you.
What You'll Do
• Architect & Build: Design, implement, and maintain scalable, production-grade
data platforms across multi-cloud, multi-tenant environments (AWS, Azure,
GCP). Build database and storage solutions that work seamlessly across cloud
providers and diverse deployment models.
• Scale Database Systems: Own the design and operation of database
infrastructure supporting a large number of tables, high-throughput operations,
and complex query workloads — scaling through 100x+ growth while maintaining
reliability and performance.
• Lead Delivery: Own project timelines, priorities, and stakeholder communication.
Drive data engineering initiatives from ideation through production with a bias for
outcomes over activity.
• Set Technical Direction: Define data architecture standards, tooling choices,
and engineering best practices. Make critical build vs. buy decisions for data and
database technologies.
• Mentor & Grow the Team: Provide technical mentorship, conduct code reviews,
and help shape a high-performing data engineering culture.
• Collaborate Cross-Functionally: Partner closely with product, analytics, ML/AI,
platform, and infrastructure teams to ensure data systems power real business
value.
• Operate with Ownership: Monitor data quality, pipeline reliability, and platform
health. Own what you build from design through decommissioning. Treat production
like a product.
What You Bring
Required
• 6+ years of hands-on experience in cloud-native data engineering, spanning
ingestion, transformation, orchestration, storage, governance, and observability.
• Deep expertise in modern distributed systems — you understand consensus,
partitioning, replication, and fault tolerance, and have built or operated distributed
data infrastructure at scale.
• Scalable database architecture — proven experience designing and managing
database systems with a large number of tables, high-volume OLTP/OLAP
workloads, and complex operational patterns. You've scaled databases through
massive growth at high-growth companies.
• 1+ years of project management experience — you've owned roadmaps,
managed delivery timelines, coordinated across teams, and are comfortable with
tools like Jira.
• Deep expertise in scalable, multi-cloud, multi-tenant data architecture — you
understand the trade-offs and have built systems that serve diverse workloads
across GCP, AWS, and Azure, in both first-party and third-party deployment models.
• Strong proficiency in modern data stack technologies such as Spark, Kafka,
Airflow/Dagster, dbt, Snowflake, Databricks, Delta Lake/Iceberg, or
equivalent.
• Deep experience with distributed database systems — PostgreSQL, MySQL,
DynamoDB, or similar — including performance tuning, schema design at scale,
and operational reliability.
• Proficiency in Python, SQL, and Java/Scala, plus at least one
infrastructure-as-code framework (Terraform, Pulumi, etc.).
• Experience with data quality, data profiling, data integration, and data
governance — you can engineer solutions that ensure secure and consistent
data consumption across platforms.
• A production-first, outcome-oriented mindset — you measure success by
what's running reliably in production, not by what's in a slide deck. Customer
value over story-point velocity.
• Excellent communication skills — you can translate complex technical concepts
for both engineering peers and business stakeholders.
Preferred
• 1+ years of tech/data team management experience — you've directly
managed engineers, run standups, h