Data Platform & Streaming Engineer

Fluor

Vadodara, IN-GJ, India | 5 Years Exp | Posted 71d ago

Job Description

Key Responsibilities

Design and implement scalable data platform components (lake/lakehouse, data marts, event streams) to support AI/ML and analytics use cases.
Build and maintain real-time and near-real-time streaming pipelines using tools such as Kafka / Azure Event Hubs, Spark Structured Streaming / Flink, and stream processing patterns.
Develop robust batch ingestion and transformation pipelines (ETL/ELT) from SAP, engineering systems, SuccessFactors, and other enterprise systems, using Spark, SQL, and orchestration frameworks.
Implement data modeling standards (dimensional, Data Vault, medallion architecture) suitable for analytics and ML feature readiness.
Ensure end-to-end data quality through validation rules, anomaly checks, schema evolution strategies, and automated testing.
Operationalize pipelines with CI/CD, infrastructure-as-code, version control, and environment promotion standards.
Establish observability (logging, metrics, tracing), SLOs, and incident response playbooks for data/streaming services.
Apply data governance controls: lineage, cataloging, retention, access policies, encryption, and privacy-by-design.
Optimize performance and cost across compute/storage by tuning jobs, partitioning strategies, caching, and streaming backpressure handling.
Collaborate with AI/ML engineers to enable feature stores, training data pipelines, and online/offline consistency patterns.
Interface with business/domain stakeholders (e.g., project controls, engineering, supply chain) to translate requirements into data products.
Document architectures, runbooks, and standards; mentor junior engineers and promote engineering excellence.

Basic Job Requirements

5+ years of experience in data engineering, including streaming and distributed processing.
Strong hands-on experience with streaming platforms (e.g., Kafka, Azure Event Hubs, Confluent, Pulsar) and patterns (event-driven architecture, CDC, exactly-once/at-least-once delivery semantics).
Proficiency in Spark (PySpark/Scala) and SQL; experience with Spark Structured Streaming or equivalent.
Experience building data platforms in the cloud (preferably Azure): ADLS, Databricks, Synapse, Data Factory, Event Hubs, Functions, and AKS.
Strong software engineering fundamentals: Python/Scala/Java, APIs, data structures, reliability patterns.
Familiarity with data lakehouse concepts, table formats (Delta/Iceberg/Hudi), file formats (Parquet), and schema management.
Experience with CI/CD (Azure DevOps/GitHub Actions), Git, and IaC (Terraform/Bicep/ARM).
Understanding of security fundamentals: IAM/RBAC, secrets management, encryption, and compliance-aware data handling.

Other Job Requirements

Preferred Qualifications

Experience implementing CDC using Debezium, Kafka Connect, or cloud CDC services.
Knowledge of ML data enablement: feature engineering pipelines, feature stores, training/serving data consistency.
Experience with data governance tooling: Purview, Data Catalog, lineage/metadata management.
Exposure to containerization/orchestration (Docker, Kubernetes/AKS) for data services.
Experience with time-series/IoT or industrial data streams (e.g., sensors, telemetry), or EPC domain datasets.
Familiarity with test automation for data pipelines (Great Expectations, Deequ, custom frameworks) and data contract testing.
Certifications (optional): Azure Data Engineer Associate, Databricks certifications, Kafka/Confluent certifications.
Proven experience supporting real-time streaming workloads and platform reliability in enterprise environments.
