Senior Software Engineer – Data Engineer

njoyn

Hyderabad | 6 Years Exp | Posted 32d ago

Job Description

Key Responsibilities:
• Design, develop, and maintain scalable big data solutions using the Hadoop ecosystem.
• Develop and optimize data processing pipelines using PySpark and Spark frameworks.
• Work extensively with HDFS for distributed data storage and management.
• Develop complex queries and data transformations using Hive and Impala.
• Monitor and manage workloads using the YARN ResourceManager.
• Build and maintain ETL pipelines to ingest, process, and transform large volumes of structured and unstructured data.
• Optimize Spark jobs and Hive queries for performance and scalability.
• Collaborate with data scientists, analysts, and engineering teams to deliver data-driven solutions.
• Ensure data quality, reliability, and performance across data platforms.
• Troubleshoot and resolve issues related to cluster performance and data processing workflows.

Required Skills:
• Strong experience with the core Hadoop ecosystem (HDFS, YARN, Hive, Impala).
• Proficiency in Python programming.
• Hands-on experience with Apache Spark and PySpark for large-scale data processing.
• Experience in writing optimized Hive queries and data transformations.
• Understanding of distributed computing and big data architecture.
• Experience working with Linux/Unix environments.
• Knowledge of data ingestion frameworks and ETL pipelines.
• Familiarity with performance tuning and troubleshooting Spark/Hadoop jobs.

Good to Have:
• Experience with Kafka, Sqoop, or Flume.
• Exposure to cloud platforms (AWS, Azure, GCP).
• Knowledge of Airflow or other workflow orchestration tools.
• Experience with data warehousing concepts and data lakes.
 
