Principal Data Engineer

Bangalore 10 Years Exp Posted 85d ago

Data Architecture & Engineering

Architect, develop, and optimize complex data pipelines, data models, and ELT/ETL workflows across Synapse, Databricks, SQL Server, and cloud-native services.

Lead large-scale data integration and modernization projects, including the team's work on analyzing requirements, creating specifications, and managing delivery schedules.

Define and enforce best practices for data engineering, performance optimization, data quality, reliability, and observability.

Cross-Functional Collaboration

Advise senior leaders and product owners on data strategy, platform capabilities, and architectural tradeoffs.

Partner with Security, Infrastructure, and business teams to enable high‑quality, trusted, and governed data products.

Advanced Analytics & BI Enablement

Design and optimize highly complex, high‑volume, and low‑latency data pipelines supporting analytics, operational workloads, and ML/AI use cases.

Oversee implementation of robust data quality, lineage, cataloging, and metadata capabilities in partnership with Data Governance.

Cloud, Platform & Tooling Expertise

Serve as the technical authority on Azure Synapse, Databricks, Spark, SQL Server, and cloud-native data services.

Lead platform enhancements for performance, cost optimization, reliability, and global scale.

Mentorship & Technical Stewardship

Coach and mentor Senior and Mid‑Level Data Engineers, elevating engineering capability across the organization.
Lead technical design reviews, provide architectural guidance, and cultivate a culture of engineering excellence.
Champion innovation through research, POCs, and evaluation of emerging technologies.

Who you are

10+ years of data engineering, data architecture, or analytics engineering experience.
Proven experience as a lead or principal-level engineer designing large-scale, distributed data systems.
Expert-level proficiency in:
- SQL, performance tuning, query optimization
- Azure Synapse, Databricks, Spark/Delta Lake
- ETL/ELT pipelines, orchestration frameworks
- Data modeling, OLTP/OLAP, dimensional and semantic modeling
- Cloud architecture patterns and distributed computing