Data Engineer

ennvee

Bengaluru, India | 4 Years Exp | Posted 31d ago

Job Description

Essential Responsibilities

  • Build, enhance, and maintain production data pipelines and datasets on a modern cloud data platform (Databricks or Snowflake), with an emphasis on stability, reliability, and continuous improvement.
  • Develop efficient ingestion, transformation, and curation workflows using industry-standard patterns such as the medallion architecture (bronze / silver / gold) or an equivalent layered design.
  • Design and implement dimensional and analytical data models (Kimball star schema, Data Vault, or equivalent) that support reporting, self-service analytics, and downstream AI/ML workloads.
  • Troubleshoot and resolve data pipeline, data quality, and platform issues promptly, with clear root-cause analysis and durable fixes.
  • Partner with stakeholders across the organization to understand data needs, translate requirements into technical designs, and set clear expectations on scope and delivery.
  • Contribute to data security and governance — including access controls, PII handling, row-level security, masking, and usage logging — using tools such as Unity Catalog, Snowflake Horizon, or equivalent.
  • Implement data quality checks and observability (expectations, tests, monitoring, alerting) to ensure trustworthy datasets for downstream consumers.
  • Support analysts and report builders with dataset design, documentation, and best practices for modern BI tools (Power BI, Tableau, Looker, or similar).
  • Participate in code reviews, CI/CD deployments, and change management; own the quality of your releases to production.
  • Stay current with platform features and recommend adoption of new capabilities where they drive measurable value.

Required Qualifications

  • 3–5 years of hands-on experience in a data engineering or closely related technical role.
  • Production experience delivering solutions on a modern cloud data platform — Databricks or Snowflake (Databricks strongly preferred).
  • Strong proficiency in SQL and Python, including writing performant, well-tested, production-grade code.
  • Hands-on experience building ETL/ELT pipelines — ingestion, transformation, cleansing, and curation — against large, complex datasets.
  • Working knowledge of data modeling techniques (Kimball / dimensional modeling, Data Vault, or medallion architecture) and when to apply each.
  • Experience with workflow orchestration and transformation tools such as Apache Airflow, Azure Data Factory, Databricks Workflows, dbt, or equivalent.
  • Experience integrating with enterprise source systems — ERPs (e.g., SAP, Oracle, Dynamics, Workday), CRMs, APIs, and relational databases.
  • Hands-on experience with at least one major cloud provider (Azure, AWS, or GCP); Azure preferred.
  • Experience with Git-based version control and CI/CD for data pipelines (Azure DevOps, GitHub Actions, GitLab CI, or similar).
  • Exposure to data quality and observability practices — test frameworks, expectations, lineage, monitoring, and alerting (Great Expectations, dbt tests, Monte Carlo, or similar).
  • Familiarity with Agile/Scrum delivery and collaborative development environments.
  • Bachelor’s degree in Computer Science, Engineering, or another STEM field, or equivalent practical experience.

Preferred Qualifications

  • Production experience with Databricks Unity Catalog, Delta Lake, and Delta Live Tables; or Snowflake equivalents (Horizon, Dynamic Tables, Streams & Tasks).
  • Experience with streaming / real-time data pipelines (Kafka, Event Hubs, Kinesis, Structured Streaming, Snowpipe Streaming) and/or IoT data patterns.
  • Working knowledge of Machine Learning (ML), Large Language Models (LLMs), and common AI/ML data enablement patterns (feature stores, vector stores, RAG).
  • Experience managing platform cost and performance — cluster/warehouse sizing, cost reporting, budgets, and alerting.
  • Experience administering a modern BI platform (Power BI, Tableau, Looker) — workspace governance, certified datasets, and best-practice enforcement.
  • Experience with Infrastructure as Code (Terraform, Bicep).
  • Experience with R, Scala, or other statistical / JVM-based programming languages.

Preferred Certifications

  • Databricks Certified Data Engineer Associate or Professional
  • SnowPro Core / SnowPro Advanced: Data Engineer
  • Microsoft Certified: Azure Data Engineer Associate
  • AWS Certified Data Engineer – Associate, or Google Cloud Professional Data Engineer

Soft Skills & Ways of Working

  • Strong written and verbal communication — able to explain technical concepts clearly to both tech
