Senior Data Engineer

Amgen

Hyderabad, 4 Years Exp, Posted 1h ago

Design, develop, and maintain scalable ETL/ELT pipelines to support structured, semi-structured, and unstructured data processing across the enterprise, applying functional knowledge of Biotech or Pharma R&D.
Implement real-time and batch data processing solutions, integrating data from multiple sources into a unified, governed data fabric architecture.
Optimize big data processing frameworks using Apache Spark, Hadoop, or similar distributed computing technologies to ensure high availability and cost efficiency.
Work with metadata management and data lineage tracking tools to enable enterprise-wide data discovery and governance.
Ensure data security, compliance, and role-based access control (RBAC) across data environments.
Optimize query performance, indexing strategies, partitioning, and caching for large-scale data sets.
Develop CI/CD pipelines for automated data pipeline deployments, version control, and monitoring.
Implement data virtualization techniques to provide seamless access to data across multiple storage systems.
Collaborate with cross-functional teams, including data architects, business analysts, and DevOps teams, to align data engineering strategies with enterprise goals.
Stay up to date with emerging data technologies and best practices, ensuring continuous improvement of Enterprise Data Fabric architectures.

Must-Have Skills:

Hands-on experience with data engineering technologies such as Databricks, PySpark, Spark SQL, Apache Spark, AWS, Python, and SQL, and with Scaled Agile methodologies.
Proficiency in workflow orchestration and performance tuning of big data processing.
Strong understanding of AWS services
Experience with Data Fabric, Data Mesh, or similar enterprise-wide data architectures.
Ability to quickly learn, adapt and apply new technologies
Strong problem-solving and analytical skills
Excellent communication and teamwork skills
Experience with Scaled Agile Framework (SAFe), Agile delivery practices, and DevOps practices.

Good-to-Have Skills:

Deep expertise in the Biotech and Pharma industries
Experience writing APIs to make data available to consumers
Experience with SQL/NoSQL databases and vector databases for large language models
Experienced with data modeling and performance tuning for both OLAP and OLTP databases
Experienced with software engineering best practices, including but not limited to version control (Git, Subversion, etc.), CI/CD (Jenkins, Maven, etc.), automated unit testing, and DevOps

Education and Professional Certifications

Master’s degree and 3 to 4+ years of experience in Computer Science, IT, or a related field
OR
Bachelor’s degree and 5 to 8+ years of experience in Computer Science, IT, or a related field
AWS Certified Data Engineer preferred
Databricks certification preferred
Scaled Agile SAFe certification preferred