Senior Data Platform Engineer
Job Description
- Big Data Platform & Infrastructure
  - Design, build, and operate large-scale data processing infrastructure using Spark on Databricks, ensuring reliability, performance, and cost efficiency at scale.
  - Architect and maintain lakehouse solutions (Delta Lake, Iceberg), including partitioning strategies, Z-ordering, and compaction jobs.
  - Own cluster management, autoscaling policies, and resource governance across Databricks workspaces.
  - Drive platform-level improvements: query optimisation, caching strategies, compute–storage separation, and shuffle tuning.
- ETL / ELT Pipeline Engineering
  - Design and build robust, idempotent, and testable data pipelines handling batch and near-real-time workloads.
  - Manage and extend our Airflow-based orchestration layer: DAG authoring standards, dependency management, alerting, and SLA enforcement.
  - Implement and maintain CDC pipelines (Debezium, Kafka Connect, or native DB replication), ensuring low-latency, high-fidelity data propagation.
  - Define data pipeline contracts (schemas, SLAs, quality assertions) and enforce them via automated data quality frameworks.
- Analytical Storage & Computation
  - Model and manage analytical data stores: dimensional models, one-big-table (OBT) patterns, and aggregation layers optimised for BI and self-serve analytics.
  - Own the evolution of our analytical warehouse/lakehouse stack: performance benchmarking, cost modelling, and technology selection.
  - Build and maintain efficient data serving layers for dashboards, ML feature stores, and reverse ETL use cases.
  - Implement data retention, archival, and lifecycle management policies across hot/warm/cold storage tiers.
- Platform Engineering & Developer Experience
  - Define and enforce data platform engineering best practices: code standards, CI/CD for pipelines, automated testing, and observability.
  - Build internal tooling and libraries that make data engineers faster: reusable Spark utilities, pipeline templates, local dev environments.
  - Champion data reliability engineering: lineage tracking, incident response playbooks, pipeline SLO monitoring, and root cause analysis.