Data Lake / ETL Engineer
ismartrecruit
Job Description
Key Responsibilities
- Pipeline Development
  - Build and maintain ETL/ELT pipelines for structured and semi-structured data.
  - Support data ingestion from databases, APIs, streaming platforms, and flat files.
  - Ensure data quality, integrity, and lineage across data flows.
- Data Lake Engineering
  - Assist in the design and development of data lake solutions in the cloud and on-premises.
  - Implement storage and retrieval mechanisms optimized for performance.
  - Manage metadata and cataloging for discoverability and governance.
- Performance & Optimization
  - Tune ETL workflows for efficiency and cost-effectiveness.
  - Implement partitioning, indexing, and caching for large-scale data processing.
  - Automate repetitive data preparation tasks.
- Collaboration & Support
  - Work with data scientists and analysts to deliver clean and reliable datasets.
  - Collaborate with senior engineers on best practices for data modeling and pipeline design.
  - Provide L2 support for production pipelines and help troubleshoot failures.
Required Skills & Experience
- 2+ years of experience in data engineering or ETL development.
- Proficiency in SQL and Python (or Scala/Java) for data transformations.
- Hands-on experience with ETL tools (Informatica, Talend, dbt, SSIS, Glue, or similar).
- Exposure to big data technologies (Hadoop, Spark, Hive, Delta Lake).
- Familiarity with cloud data platforms (AWS Glue/Redshift, Azure Data Factory/Synapse, GCP Dataflow/BigQuery).
- Understanding of workflow orchestration (Airflow, Oozie, Prefect, or Temporal).
Preferred Knowledge
- Experience with real-time data pipelines using Kafka, Kinesis, or Pub/Sub.
- Basic understanding of data warehousing and dimensional modeling.
- Exposure to containerization and CI/CD pipelines for data engineering.
- Knowledge of data security practices (masking, encryption, RBAC).
Education & Certifications
- Bachelor’s degree in Computer Science, IT, or a related field.
- Preferred certifications:
  - AWS Data Analytics – Specialty / Azure Data Engineer Associate / GCP Data Engineer.
  - dbt or Informatica/Talend certifications.