Data Engineer

soothsayeranalytics

Hyderabad | 4 Years Exp | Posted 66d ago

Key Responsibilities

Data Pipeline Development

·Build and maintain scalable ETL/ELT pipelines for structured and unstructured data.

·Ingest data from diverse sources (APIs, streaming, batch systems).

Data Modeling & Warehousing

·Design efficient data models to support analytics and AI workloads.

·Develop and optimize data warehouses/lakes using Redshift, BigQuery, Snowflake, or Delta Lake.

Big Data & Streaming

·Work with distributed systems like Apache Spark, Kafka, or Flink for real-time/large-scale data processing.

·Manage feature stores for ML pipelines.

Collaboration & Best Practices

·Work closely with Data Scientists and ML Engineers to ensure high-quality training data.

·Implement data quality checks, observability, and governance frameworks.

Required Skills & Qualifications

Education: Bachelor’s/Master’s in Computer Science, Data Engineering, or related field.

Experience: 4–6 years in data engineering with expertise in:

·Programming: Python/Scala/Java (Python preferred).

·Big Data & Processing: Apache Spark, Kafka, Hadoop.

·Databases: SQL/NoSQL (Postgres, MongoDB, Cassandra).

·Data Warehousing: Snowflake, Redshift, BigQuery, or similar.

·Orchestration: Airflow, Luigi, or similar.

·Cloud Platforms: AWS, Azure, or GCP (data services).

·Version Control & CI/CD: Git, Jenkins, GitHub Actions.

·MLOps/GenAI pipelines: feature engineering, embeddings, vector DBs.