Data Engineer

Amgen

Hyderabad 4 Years Exp Posted 8d ago

Job Description

  • Design, develop, and maintain data solutions for data generation, collection, and processing   

  • Serve as a key team member who assists in the design and development of data pipelines

  • Create data pipelines and ensure data quality by implementing ETL processes to migrate and deploy data across systems  

  • Contribute to the design, development, and implementation of data pipelines, ETL/ELT processes, and data integration solutions  

  • Take ownership of data pipeline projects from inception to deployment, managing scope, timelines, and risks

  • Collaborate with cross-functional teams to understand data requirements and design solutions that meet business needs  

  • Develop and maintain data models, data dictionaries, and other documentation to ensure data accuracy and consistency  

  • Implement data security and privacy measures to protect sensitive data  

  • Leverage cloud platforms (AWS preferred) to build scalable and efficient data solutions  

  • Collaborate and communicate effectively with product teams  

  • Collaborate with Data Architects, Business SMEs, and Data Scientists to design and develop end-to-end data pipelines to meet fast-paced business needs across geographic regions  

  • Identify and resolve complex data-related challenges  

  • Adhere to best practices for coding, testing, and designing reusable code and components

  • Explore new tools and technologies that improve ETL platform performance

  • Participate in sprint planning meetings and provide estimates for technical implementation

  • Design and develop data pipelines leveraging Databricks, PySpark, and SQL to ingest, transform, and process large-scale datasets

  • Engineer solutions for both structured and unstructured data to enable advanced analytics and insights

  • Implement automated workflows for data ingestion, transformation, and deployment using Databricks Jobs and notebooks, with ongoing monitoring and scheduling

  • Apply performance optimization techniques, including Spark job tuning, caching, partitioning, and indexing, to improve scalability and efficiency

  • Build integrations with multiple data sources, such as SQL databases, APIs, and cloud storage platforms, ensuring seamless connectivity and reliability

  • Collaborate effectively with global teams across time zones to maintain alignment, resolve issues, and deliver on shared objectives
