Data Engineer

BNP Paribas

Chennai, India 4 Years Exp Posted 2h ago

Direct Responsibilities:

Migrate the existing Hadoop infrastructure to cloud infrastructure on Kubernetes Engine, COS, Spark as a service, and Airflow as a service.
Implement data transformation and data quality checks to ensure data consistency and accuracy. Use programming languages such as Scala and SQL, and tools like Spark, for data transformation and enrichment operations.
Set up CI/CD pipelines to automate deployments, unit testing, and development management.
Write and run unit and validation tests to ensure the accuracy and integrity of the code developed.
Automate data pipelines and streamline data ingestion by implementing different orchestrators and scheduling processes (mainly Airflow as a Service).
Write technical documentation (specifications, operational documents) to ensure knowledge capitalization.

Contributing Responsibilities:

Foster a culture of continuous learning and improvement within the team.
Collaborate with cross-functional teams to understand data requirements and deliver solutions.

Technical & Behavioral Competencies

Technical Skills

At least 5 years of working experience in data engineering
Working experience with Spark on Scala/Python/Java (any of these languages)
Knowledge of Apache Airflow, Oozie, or any other similar scheduler tools
Strong knowledge of SQL and NoSQL databases
Good exposure to CI/CD tools (GitLab, Jenkins…)
Knowledge of Kubernetes containerization
Integration experience with S3 storage/COS and the Parquet (and ORC) formats
Hands-on knowledge of Unix shell scripting
Design effective prompts to leverage Gen AI tools across IT domains (e.g., development, testing, data generation, documentation) during the development stage.
Nice to have: exposure to any data virtualization tool, such as Dremio
Nice to have: exposure to Kafka, Elasticsearch, Kibana, Hvault
Nice to have: working knowledge of HDFS, Hadoop, and Hive