Data Engineer
Amgen
Job Description
Roles & Responsibilities
- Design, develop, and maintain complex ETL/ELT data pipelines in Databricks using PySpark, Scala, and SQL to process large-scale datasets.
- Build highly efficient data pipelines to migrate and deploy complex data across systems, applying an understanding of biotech, pharma, manufacturing, or related domains.
- Design and implement solutions to enable unified data access, governance, and interoperability across hybrid cloud environments.
- Ingest and transform structured and unstructured data from databases (PostgreSQL, MySQL, SQL Server, MongoDB, etc.), APIs, logs, event streams, images, PDFs, and third-party platforms.
- Ensure data integrity, accuracy, and consistency through rigorous quality checks and monitoring.
- Explore and implement new tools and technologies to make data processing more efficient.
- Proactively identify opportunities to automate tasks, and develop reusable frameworks.
- Work in an Agile and Scaled Agile (SAFe) environment, collaborating with cross-functional teams, product owners, and Scrum Masters to deliver incremental value.
- Use Jira, Confluence, and Agile DevOps tools to manage sprints, backlogs, and user stories.
- Support continuous improvement, test automation, and DevOps practices in the data engineering lifecycle.
- Collaborate and communicate effectively with product teams and cross-functional teams to understand business requirements and translate them into technical solutions.
Must-Have Skills
- Hands-on experience with data engineering technologies such as Databricks, Apache Spark, PySpark, Spark SQL, AWS, Python, FastAPI, Neo4j, and SQL, along with Scaled Agile methodologies.
- Experience developing RESTful APIs and microservices using FastAPI.
- Proficiency in workflow orchestration and performance tuning for big data processing.
- Strong understanding of AWS services.
- Ability to quickly learn, adapt, and apply new technologies.
- Strong problem-solving and analytical skills.
- Excellent communication and teamwork skills.
- Experience with the Scaled Agile Framework (SAFe), Agile delivery, and DevOps practices.
- Experience with streaming technologies such as Apache Kafka, Debezium, or similar platforms for real-time data processing and integration.