Data Engineer

Amgen

Hyderabad | 4 Years Exp | Posted 62d ago

Job Description

Roles & Responsibilities:

Design, develop, and maintain data solutions for data generation, collection, and processing  
Serve as a key team member who assists in the design and development of data pipelines
Create data pipelines and ensure data quality by implementing ETL processes to migrate and deploy data across systems 
Contribute to the design, development, and implementation of data pipelines, ETL/ELT processes, and data integration solutions 
Take ownership of data pipeline projects from inception to deployment, managing scope, timelines, and risks
Collaborate with cross-functional teams to understand data requirements and design solutions that meet business needs 
Develop and maintain data models, data dictionaries, and other documentation to ensure data accuracy and consistency 
Implement data security and privacy measures to protect sensitive data 
Leverage cloud platforms (AWS preferred) to build scalable and efficient data solutions 
Collaborate and communicate effectively with product teams 
Collaborate with Data Architects, Business SMEs, and Data Scientists to design and develop end-to-end data pipelines to meet fast-paced business needs across geographic regions 
Identify and resolve complex data-related challenges 
Adhere to best practices for coding, testing, and designing reusable code and components
Explore new tools and technologies that improve ETL platform performance
Participate in sprint planning meetings and provide estimations on technical implementation 
Design and develop data pipelines leveraging Databricks, PySpark, and SQL to ingest, transform, and process large-scale datasets.
Engineer solutions for both structured and unstructured data to enable advanced analytics and insights.
Implement automated workflows for data ingestion, transformation, and deployment using Databricks Jobs and notebooks, with ongoing monitoring and scheduling.
Apply performance optimization techniques, including Spark job tuning, caching, partitioning, and indexing, to improve scalability and efficiency.
Build integrations with multiple data sources, such as SQL databases, APIs, and cloud storage platforms, ensuring seamless connectivity and reliability.
Collaborate effectively with global teams across time zones to maintain alignment, resolve issues, and deliver on shared objectives.

Basic Qualifications and Experience:

Bachelor’s or Master’s degree and 4 to 8 years of experience in Computer Science, IT, or a related field

Functional Skills:

Must-Have Skills:

Hands-on experience with big data technologies and platforms, such as Databricks and Apache Spark (PySpark, SparkSQL), as well as workflow orchestration and performance tuning for big data processing
Proficiency in data analysis tools (e.g., SQL) and experience with data visualization tools
Excellent problem-solving skills and the ability to work with large, complex datasets
Strong understanding of data governance frameworks, tools, and best practices.

Good-to-Have Skills:

Knowledge of data protection regulations and compliance requirements (e.g., GDPR, CCPA)
Experience with ETL tools such as Apache Spark, and with Python packages for data processing and machine learning model development
Strong understanding of data modeling, data warehousing, and data integration concepts 
Knowledge of Python/R, Databricks, SageMaker, and cloud data platforms
Experience implementing automated orchestration and monitoring of data pipelines using Databricks Jobs, Apache Airflow, or similar workflow tools.
Familiarity with performance optimization techniques for big data processing, such as Spark job tuning, caching, partitioning, and indexing.
Exposure to multi-source integration involving APIs, SQL databases, and cloud storage platforms.
Demonstrated ability to collaborate across global teams and time zones.
