Data Engineer – Specialist

Carrier

Hyderabad | 6 Years Exp | Posted 79d ago

Key Responsibilities:

1) Data Pipeline & Lakehouse Engineering

Design and implement robust, reusable data pipelines for batch and streaming use cases using AWS-native services (e.g., S3, Glue, Kinesis) and orchestration tools (e.g., Airflow) where applicable.
Build and standardize medallion-layered ingestion and transformation patterns (raw/bronze → silver → gold) as a repeatable engineering approach.
Develop and optimize Iceberg (or similar open table format) datasets with strong practices for schema evolution, partitioning, and performance for multi-engine consumption.

2) Open Standards, Interoperability & Cloud-Agnostic Delivery

Apply open standards to reduce lock-in by designing storage and metadata layers that work across engines (e.g., Athena/Trino/EMR/Redshift/Snowflake/Databricks depending on enterprise choices).
Contribute to enterprise adoption of Apache Iceberg as the open table standard for interoperability and portability across environments.
Implement standardized interfaces for pipelines and data products (e.g., config-driven patterns) to support portability and consistent operations.

3) Data Quality, Governance, Metadata & Lineage

Embed automated quality checks, data validation, and pipeline test coverage, ensuring trusted datasets for analytics and AI/ML.
Emit lineage/metadata signals by instrumenting pipelines to produce OpenLineage events (or equivalent enterprise lineage standards) and register assets in the enterprise catalog as required.
Ensure consistent ownership, documentation, and discoverability for produced datasets/data products.

4) Operational Excellence (DataOps/DevOps)

Champion CI/CD for data pipelines and infrastructure changes, including automated checks and safe promotion across environments.
Implement observability (metrics, logs, alerts) and contribute to incident triage and reliability improvements for production pipelines.
Partner with Security/Platform teams on IAM least privilege, access controls, and governed data access patterns.

5) Technical Leadership & Collaboration

Work closely with platform engineers, data product owners, governance teams, and downstream consumers to deliver curated datasets and reusable platform capabilities.
Mentor junior engineers and help define internal standards, frameworks, and best practices for lakehouse engineering.

Required Qualifications

6 to 10 years of experience in data engineering or related roles.
Proficiency in Python and SQL.
Strong understanding of batch and streaming data processing.
Experience delivering production‑ready data products.