Senior Data Engineer

MSD

Hyderabad

Job Description

  • Based in Hyderabad, join a global healthcare biopharma company and be part of a 130-year legacy of success backed by ethical integrity, forward momentum, and an inspiring mission to achieve new milestones in global healthcare.
  • Be part of an organisation driven by digital technology and data-backed approaches that support a diversified portfolio of prescription medicines, vaccines, and animal health products.
  • Drive innovation and execution excellence. Be part of a team that has a passion for using data, analytics, and insights to drive decision-making, and that creates custom software to tackle some of the world's greatest health threats.

Our Technology Centers focus on creating a space where teams can come together to deliver business solutions that save and improve lives. An integral part of our company's IT operating model, Tech Centers are globally distributed locations where each IT division has employees to enable our digital transformation journey and drive business outcomes. These locations, in addition to the other sites, are essential to supporting our business and strategy.

A focused group of leaders in each Tech Center helps to ensure we can manage and improve each location, from investing in the growth, success, and well-being of our people, to making sure colleagues from each IT division feel a sense of belonging, to managing critical emergencies. Together, we must leverage the strength of our team to collaborate globally, optimize connections, and share best practices across the Tech Centers.

Role Overview

We are looking for a highly motivated Data Engineer to build and maintain scalable, high-performance data pipelines. The ideal candidate will have strong expertise in AWS, SQL, Python, Apache Spark, and Apache Airflow, along with hands-on experience in Databricks as a core data processing platform and exposure to Agentic AI systems and AI-driven data workflows.

What will you do in this role

  • Design, develop, and maintain ETL/ELT pipelines using SQL, Python, and Spark
  • Build and manage data workflows using Apache Airflow for orchestration and scheduling
  • Develop scalable and optimized solutions using AWS services (S3, Glue, Redshift, EMR, Lambda, etc.)
  • Implement and manage data processing pipelines in Databricks (Delta Lake, notebooks, workflows, Unity Catalog)
  • Ensure data quality, reliability, and performance across pipelines
  • Collaborate with analytics, product, and business teams to deliver data solutions
  • Monitor, troubleshoot, and optimize production pipelines

What should you have

  • Strong proficiency in SQL and Python
  • Hands-on experience with Apache Spark (PySpark preferred)
  • Experience working with Apache Airflow for workflow orchestration
  • Solid experience with the AWS cloud platform, including Redshift performance optimization skills
  • Hands-on experience in Databricks
  • Understanding of data warehousing, data modeling, and ETL design

Good to Have

  • Experience with CI/CD pipelines and GitHub Actions
  • Knowledge of the Pharmaceutical / Life Sciences domain
  • Familiarity with data governance and quality frameworks
  • Exposure to Docker, Kubernetes, or similar technologies
