Lead Data Engineer
instahyre
Job Description
- Lead the Squad: Manage, mentor, and conduct code reviews for a team of 2-3 data engineers. Drive sprint planning, estimation, and task delegation to ensure successful delivery.
- Integration Architecture: Design scalable, fault-tolerant ETL/ELT frameworks to ingest complex data from diverse sources (REST APIs, streaming logs, CRM/ERP systems) into our central repository.
- Implementation Ownership: Take full accountability for the "Implementation" phase of the software lifecycle. Ensure that architectural designs are translated into functioning, production-grade code by the team.
Engineering and Optimisation:
- Advanced Pipeline Development: Handle the most complex transformations and architectural challenges. Move beyond simple ingestion to building self-healing and idempotent pipelines.
- Performance Tuning: Write and optimise complex SQL queries and Python scripts. Identify bottlenecks in the data warehouse/lake and implement indexing, partitioning, or schema changes to improve performance.
- Code Quality Standards: Enforce version control best practices and CI/CD workflows, and run data validations within the team.
- AI/ML: Collaborate directly with data scientists and ML engineers to understand their feature requirements and build high-quality, production-ready pipelines. Engineer and manage the data infrastructure required for model training datasets, including versioning, lineage tracking, and compliance.
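To illustrate what "idempotent" means for the pipelines described above, here is a minimal sketch, not a description of our actual stack. It assumes a hypothetical `orders` table keyed by `id` and uses an in-memory SQLite database purely for demonstration; the function names are made up for this example. The idea is that a retried or replayed batch leaves the target in the same state as a single successful run.

```python
import sqlite3


def load_batch(conn, rows):
    """Write rows keyed by primary key so that replaying a batch changes nothing."""
    # Hypothetical target table; a real warehouse table would differ.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY, amount REAL)"
    )
    # INSERT OR REPLACE overwrites any existing row with the same primary key,
    # so loading the same batch twice never creates duplicates.
    conn.executemany(
        "INSERT OR REPLACE INTO orders (id, amount) VALUES (?, ?)",
        rows,
    )
    conn.commit()


def table_state(conn):
    """Return the full table contents in a deterministic order."""
    return conn.execute("SELECT id, amount FROM orders ORDER BY id").fetchall()


conn = sqlite3.connect(":memory:")
batch = [(1, 10.0), (2, 25.5)]

load_batch(conn, batch)
state_after_first_run = table_state(conn)

load_batch(conn, batch)  # simulated retry of the same batch
state_after_retry = table_state(conn)
```

A plain `INSERT` in the same position would fail or duplicate rows on retry, which is the failure mode an idempotent design avoids.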
Reliability and Stakeholder Management:
- SLA Management and RCA: Lead the resolution of critical incidents (P0/P1). Move beyond "debugging" to performing root cause analysis (RCA) to prevent recurrence and ensure customer SLAs are met.
- Data Quality Governance: Define the strategy for monitoring and alerting. Ensure the team implements automated checks for data accuracy, freshness, and completeness.
- Collaboration: Act as the technical point of contact for product managers and architects. Translate high-level business requirements into technical tickets for your team.
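As a rough illustration of the automated freshness and completeness checks mentioned above, the sketch below uses only the Python standard library. The function names, record shape, and one-hour threshold are assumptions made for this example, not a description of our tooling.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(latest_loaded_at, now, max_lag):
    """Freshness: the newest record must be no older than max_lag."""
    return now - latest_loaded_at <= max_lag


def completeness_ratio(rows, required_fields):
    """Completeness: fraction of rows in which every required field is non-null."""
    if not rows:
        return 0.0
    complete = sum(
        all(row.get(field) is not None for field in required_fields)
        for row in rows
    )
    return complete / len(rows)


# Hypothetical sample data for demonstration only.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
latest_loaded_at = now - timedelta(minutes=30)
rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
]

fresh = is_fresh(latest_loaded_at, now, max_lag=timedelta(hours=1))
ratio = completeness_ratio(rows, required_fields=["id", "email"])
```

In practice, checks like these would run on a schedule and raise an alert when a threshold is breached rather than simply returning values.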
Requirements:
- 5+ years of professional experience in data engineering.
- Minimum 2 years of experience leading, mentoring, or managing a small team (formal or informal).
- Must be willing to work extended hours to overlap with US time zones. You will be the primary technical lead during these hours, ensuring unblocked development and rapid incident response.
- Still an active coder: You have built, orchestrated, and shipped production-grade pipelines yourself in the last 6 months, not just managed people who did.
- Strong communication skills: You can explain complex technical decisions to non-technical stakeholders clearly.
Technical Competencies:
- Database Mastery: Expert-level proficiency in SQL (PostgreSQL, ClickHouse, and MySQL) and solid experience with data warehouse modelling (star/snowflake schemas and slowly changing dimensions, SCDs).
- Code Proficiency: Good programming skills in Python (Pandas, PySpark, async libraries).
- Orchestration and Integration: Hands-on experience with modern data stack tools (e.g., Airflow, NiFi) is mandatory.
- Cloud Native: Proven experience implementing pipelines on hyperscalers (AWS, Azure, or GCP) using services like S3, Lambda/Functions, EMR, or Redshift/Synapse.
Soft Skills:
- Delivery Focused: A mindset geared towards "getting things done in the right and optimal manner". Your focus will be on shipping working code and enabling the team to deliver without compromising on quality.
- Communication: Ability to explain complex technical issues to non-technical stakeholders during US business hours.
Nice to Have:
- Experience with Infrastructure as Code (Terraform, CloudFormation).
- Experience implementing data quality tools.
- Knowledge of containerisation (Docker, Kubernetes) for deploying data apps.
Why Explore a Career at Terrantic
At Terrantic, we believe diverse teams build better products. We're committed to creating an inclusive environment and are proud to be an equal opportunity employer. We value diverse perspectives and believe the best teams are built by people with different backgrounds and experiences.
We are a remote-first company and our employees love it. As an early-stage start-up, the remote-first approach allows our employees to do their best work from a place of their choice.