Data Engineer
Job Description
Key Responsibilities
Data Integration & Consolidation
- Migrate and unify on-premises and cloud databases into Azure (e.g., Azure SQL, Azure Storage, Synapse/Microsoft Fabric OneLake).
- Design landing, curated, and serving layers aligned to business domains.
Pipeline Engineering (ETL/ELT)
- Build and manage pipelines using Azure Data Factory and Microsoft Fabric Data Pipelines/Notebooks; use Synapse/Spark for ingestion, transformation, and orchestration.
- Implement CI/CD for data workflows; monitor performance, reliability, and cost.
Data Modeling & Architecture
- Develop logical and physical data models (star/snowflake; data vault where appropriate).
- Establish standards for schema design, partitioning, and optimization.
Secure Data Exposure
- Create well-defined interfaces (views, APIs, lakehouse tables, Semantic Models) for downstream BI/analytics and operational systems.
- Implement data quality checks, lineage, and observability.
Governance & Operations
- Apply RBAC governance across Azure and Fabric resources to enforce least-privilege access.
- Manage metadata/catalog (e.g., Microsoft Purview), tagging, and policies.
- Provide operational support during early US hours in addition to Indian business hours.
Requirements
Required Skills & Experience (for complex tasks): 2.5 to 5 years overall
- Microsoft Azure Data Services (Azure Data Factory, Azure SQL, Synapse/Fabric, Storage)
- Microsoft Fabric (Data Pipelines, Lakehouse, Notebooks, OneLake, Semantic Models)
- ETL/ELT & Orchestration (ADF pipelines, Fabric pipelines, notebooks, scheduling)
- SQL Engineering & Performance Tuning (T-SQL, query optimization, indexing)
- Data Modeling & Architecture (dimensional, 3NF, data vault; schema design)
- Python or Scala for Data Engineering (Spark, notebooks)
- DevOps for Data (Git, CI/CD for pipelines/notebooks; IaC with Bicep/Terraform)
- Familiarity with RBAC Governance (Azure RBAC, Fabric workspace/item permissions, least privilege)
- Data Quality & Observability (DQ checks, monitoring, alerting)
Nice to Have
- Microsoft Purview for catalog, lineage, and policy management
- Power BI (Semantic Models in Fabric, Direct Lake, gateway configuration)
- Event-driven ingestion (Kafka/Event Hub/IoT)
- Cost optimization for Fabric/Synapse/Spark workloads