Big Data Developer
Grid Dynamics
Job Description
Responsibilities
-
Build and optimize big data solutions on AWS services such as S3, EMR, Glue, Athena, and EKS/Kubernetes.
-
Develop, schedule, and monitor workflow orchestration pipelines using Apache Airflow.
-
Execute and manage Spark jobs on Kubernetes/EKS environments, ensuring performance, scalability, and reliability.
-
Implement and maintain data lake architectures leveraging Apache Iceberg for efficient data management and governance.
-
Collaborate with cross-functional teams including Data Architects, Analysts, and Business stakeholders to understand data requirements and deliver robust solutions.
-
Optimize Spark workloads, query performance, and resource utilization for large-scale datasets.
-
Ensure data quality, security, consistency, and adherence to best practices across data platforms.
-
Troubleshoot production issues, perform root cause analysis, and provide timely resolutions.
-
Contribute to CI/CD implementation, automation, and infrastructure improvements for data engineering platforms.
-
Work with Hadoop ecosystem components such as YARN, HDFS, and Hive for data storage and processing when required.
-
Participate in code reviews, technical discussions, and knowledge-sharing sessions within the team.
Qualifications
Desired data engineering skills:
-
Experience with AWS (S3, EMR, Glue, Athena), including running Spark jobs on K8s or EKS
-
Extensive experience with Apache Spark
-
Hands-on experience in Python
-
Experience with Apache Airflow
-
Experience with Apache Iceberg
-
Hands-on experience with the Hadoop stack (YARN, HDFS, Hive) is a plus
We offer
-
Opportunity to work on bleeding-edge projects
-
Work with a highly motivated and dedicated team
-
Competitive salary
-
Flexible schedule
-
Benefits package: medical insurance, sports
-
Corporate social events
-
Professional development opportunities
-
Well-equipped office