Senior AI Data Engineer
HPE
Job Description
Technical Leadership & Data Architecture
- Serve as the data engineering SME: the escalation point for complex platform, pipeline, and governance decisions across the team and organization.
- Architect end-to-end data solutions: design reusable data components, Lakehouse structures, data vault patterns, semantic layers, and integration architectures across AI, Analytics, and Automation platforms.
- Define and enforce data engineering standards, pipeline design patterns, naming conventions, and coding best practices; conduct architecture and code reviews.
- Lead technical discovery for new data initiatives: assess feasibility, design solution approaches, and produce architecture documentation for stakeholder alignment.
- Mentor and upskill the Technical Data Engineer through structured knowledge transfer, pair programming, and design reviews.
Advanced Data Transformation & Pipeline Engineering
- Design and deliver complex, production-grade ELT/ETL pipelines using Databricks (Delta Live Tables, PySpark, Unity Catalog) and Microsoft Fabric (Dataflows Gen2, Notebooks, Data Factory).
- Architect reusable, parameterized pipeline frameworks that the wider team can adopt, reducing one-off scripting and increasing delivery velocity.
- Define and implement advanced transformation patterns: multi-hop Delta Lake pipelines, SCD Type 2/6, event-driven streaming ingestion, and late-arriving data handling.
- Optimize pipeline performance at scale: partitioning strategy, Z-ordering, liquid clustering, broadcast joins, and cost-based query planning in Spark.
Data Quality Strategy & Governance Leadership
- Define the data quality (DQ) framework for AI products: establish DQ dimensions, thresholds, escalation paths, and remediation SLAs across all critical datasets.
- Drive business glossary completeness, lineage documentation, data stewardship workflow design, and policy management.
- Implement automated data quality validation at scale and integrate DQ gates into CI/CD pipeline deployments.
- Act as the data quality escalation authority: triage complex DQ incidents, perform root-cause analysis, and drive permanent fixes rather than tactical workarounds.
Power BI & Advanced Analytics Reporting
- Lead the Power BI governance model: certified dataset strategy, endorsement policies, deployment pipelines, and report lifecycle management.
- Design and review complex semantic models for performance, correctness, and scalability; define DAX patterns and measure libraries for team-wide reuse.
- Translate complex analytical requirements from senior stakeholders into governed, self-service-ready data products.
- Drive adoption of Copilot-assisted authoring in Power BI and Fabric; evaluate AI-assisted analytics features and recommend adoption roadmaps.
AI Data Enablement & Cross-functional Partnership
- Partner with the Senior AI & ML Engineer to architect data foundations for LLM and ML use cases, defining feature stores, embedding pipelines, and vector-ready data products.
- Ensure Gold layer datasets are structured, documented, and versioned in ways that make them immediately consumable by agentic AI and RAG pipelines.
- Participate in cross-functional AI and data initiatives at a strategic level, representing the data engineering perspective in architecture forums and leadership reviews.
- Define and lead master data management (MDM) initiatives to establish single sources of truth for key business entities.
Innovation, Standards & Thought Leadership
- Maintain a technology radar for the data engineering practice; evaluate emerging tools and run experiments and pilots to assess their capabilities.
- Present data architecture proposals and governance roadmaps to senior stakeholders with clarity and business context.
- Foster a culture of data quality ownership across the organization: run knowledge sessions, author internal engineering guides, and build data literacy.