Data Engineer AI

thermofisher

Bangalore NM Years Exp Posted 263d ago

Responsibilities:

Perform data analysis, exploration, and preparation to support business use cases.
Build and maintain data pipelines in Python/Databricks and integrate them with cloud data platforms.
Apply Generative AI techniques to build prototypes, generate insights, and improve solutions.
Work with cross-functional teams to understand requirements and translate them into data-driven and AI-driven solutions.
Conduct unit testing, validation, and data quality checks for datasets, pipelines, and AI outputs.
Build and develop RESTful APIs to enable communication between different software components.
Follow agile and DevOps practices for continuous development, testing, and deployment.
Contribute to data governance, documentation, and best practices for responsible data and AI solutions.

Requirements:

6+ years of experience with strong proficiency in Python, covering both data engineering (pandas, PySpark, Databricks) and AI (transformers, LLM frameworks).
Strong knowledge of SQL and the ability to work with relational and non-relational data.
Familiarity with cloud platforms (AWS, Azure, or GCP) and data lake/warehouse concepts.
Exposure to Generative AI tools and frameworks (e.g., LangGraph, LangChain, OpenAI APIs).
Awareness of prompt engineering, fine-tuning, embeddings, and LLM evaluation.
Familiarity with BI/visualization tools (Power BI, Tableau, or similar).
Understanding of modern practices such as DataOps, MLOps, and AI/GenAI Ops, or equivalent experience.
Strong analytical and problem-solving skills with the ability to handle ambiguity.
Excellent written, verbal, and interpersonal communication skills.
Ability to collaborate effectively with international teams across regions and time zones.
Curiosity to explore and implement evolving data engineering and Generative AI technologies.