Senior Data Engineer

taleo

Hyderabad 5 Years Exp Posted 17d ago

Job Description

  • Build scalable data ingestion pipelines for relational, semi-structured, and unstructured data sources
  • Design, implement, and optimize lakehouse architectures using Apache Iceberg
  • Optimize table design including partitioning, compaction, schema evolution, and performance tuning for Iceberg datasets
  • Implement best practices for versioning, time travel, incremental processing, and ACID compliance
  • Develop and optimize Apache Spark (batch and streaming) jobs for large-scale data processing
  • Work extensively with AWS services such as Glue, EMR, Lambda, Step Functions, and S3 with a focus on cost and performance optimization
  • Build and manage real-time data pipelines using Kafka and Kafka Streams
  • Design and orchestrate workflows using DBT and Airflow
  • Implement automated data quality checks, validation frameworks, and error monitoring mechanisms
  • Establish observability frameworks including monitoring, logging, and alerting for data pipelines
  • Collaborate with analytics/reporting teams to enable data quality dashboards and reporting
  • Analyze existing pipelines to identify improvements and enhance reliability and scalability
  • Leverage AI/LLM-based tools to accelerate ETL/ELT development, validation, and debugging
  • Participate in code reviews and contribute to best practices and engineering standards
Skills

  • Bachelor’s degree (or higher) in Computer Science, Engineering, or a related technical field
  • 5+ years of experience designing, building, and maintaining data pipelines
  • Strong programming skills in SQL, Python, and Apache Spark
  • Hands-on experience with AWS data services (Glue, EMR, S3, Lambda, Step Functions)
  • Deep understanding of lakehouse architectures and Apache Iceberg
  • Experience with DBT and Airflow for data transformation and orchestration
  • Strong experience with Kafka and real-time streaming pipelines
  • Experience working with Snowflake as a cloud data warehouse
  • Strong understanding of data quality frameworks, validation, and monitoring
  • Experience handling structured, semi-structured, and unstructured data at scale
  • Solid understanding of distributed systems and data engineering best practices
  • Experience with CI/CD pipelines and automation (preferred)
  • Strong problem-solving skills and ability to work in a fast-paced environment
  • Excellent communication skills and ability to collaborate with cross-functional teams
