Data Engineer
Carrier
Job Description
Key Responsibilities:
- Work closely with data scientists, analysts, and business stakeholders to understand data requirements and objectives for machine learning and analytics projects.
- Design, develop, and maintain scalable and efficient data pipelines for transforming and cleansing raw data into ML-ready datasets.
- Implement data transformation logic and algorithms using tools such as PySpark, Apache Spark, or similar frameworks to pre-process and clean data.
- Utilize cloud-based data warehouse solutions such as Amazon Redshift to store and manage large volumes of structured and unstructured data.
- Collaborate with data architects and database administrators to optimize data models, schema designs, and query performance for analytics and reporting purposes.
- Ensure data quality and integrity by implementing data validation checks, error handling mechanisms, and monitoring processes throughout the data pipeline.
- Work with cross-functional teams to identify and address data integration and interoperability challenges, including data synchronization, data consistency, and data governance.
- Stay up to date with the latest advancements in data engineering, big data technologies, and machine learning techniques, and proactively apply new methodologies and best practices to improve data processing workflows.
Requirements:
- Bachelor's degree or higher in Computer Science, Engineering, Mathematics, Statistics, or a related field. Advanced degree preferred.
- 7+ years of experience as a back-end data engineer.
- Strong proficiency in Python and PySpark for data manipulation, analysis, and modelling. Experience with libraries such as Pandas, NumPy, TensorFlow, and PyTorch is highly desirable.
- Familiarity with techniques for anomaly detection, forecasting, and revenue optimization.
- Experience working with large-scale datasets and distributed computing frameworks for processing and analysing big data, such as Hadoop, Spark, and Databricks.
- Excellent communication skills and ability to effectively collaborate with cross-functional teams, including business stakeholders, data engineers, software developers, and product managers.
- Strong analytical and problem-solving skills, with a passion for using data-driven approaches to solve complex business problems and drive innovation.
- Strong experience in SQL.
- Experience with Docker and OpenShift.
- Hands-on experience with REST concepts.
- Required skills: Python, PySpark, SQL, Snowflake, Kafka, Amazon Web Services (AWS), REST APIs, Docker, OpenShift, Jira, Bitbucket, troubleshooting, communication, and an analytical, outcomes-based, insights-driven approach.