Data Engineer - Big Data Technologies

hirist

Chennai, India | 10 Years Exp | Posted 22d ago

Job Description

Responsibilities:

- Design, develop, and maintain robust and scalable data pipelines using Apache Spark and Scala on the Databricks platform.
- Implement ETL (Extract, Transform, Load) processes for various data sources, ensuring data quality, integrity, and efficiency.
- Optimize Spark applications for performance and cost-efficiency within the Databricks environment.
- Work with Delta Lake for building reliable data lakes and data warehouses, ensuring ACID transactions and data versioning.
- Collaborate with data scientists, analysts, and other engineering teams to understand data requirements and deliver solutions.
- Implement data governance and security best practices within Databricks.
- Troubleshoot and resolve data-related issues, ensuring data availability and reliability.
- Stay updated with the latest advancements in Spark, Scala, Databricks, and related big data technologies.

Required Skills and Experience:

- Proven experience as a Data Engineer with a strong focus on big data technologies.
- Expertise in the Scala programming language for data processing and Spark application development.
- In-depth knowledge of and hands-on experience with Apache Spark, including Spark SQL, Spark Streaming, and Spark Core.
- Proficiency in using Databricks platform features, including notebooks, jobs, workflows, and Unity Catalog.
- Experience with Delta Lake and its capabilities for building data lakes.
- Strong understanding of data warehousing concepts, data modeling, and relational databases.
- Familiarity with cloud platforms (e.g., AWS, Azure, GCP) and their data services.
- Experience with version control systems such as Git.
- Excellent problem-solving and analytical skills.
- Ability to work independently and as part of a team.

Preferred Qualifications (Optional):

- Experience with other big data technologies such as Kafka, Flink, or Hadoop ecosystem components.
- Knowledge of data visualization tools.
- Understanding of DevOps principles and CI/CD pipelines for data engineering.
- Relevant certifications in Spark or Databricks.
