Data Engineer
reverencedata
Job Description
Responsibilities
- Design, develop, and maintain data pipelines to process and integrate large volumes of structured and unstructured data from various sources.
- Implement ETL/ELT processes and data integration workflows to ensure smooth data flow across systems and into data warehouses, lakes, and other storage solutions.
- Optimize and monitor data pipelines to ensure data quality, integrity, and performance, resolving data-related issues as they arise.
- Collaborate with cross-functional teams (e.g., data scientists, analysts, product teams) to understand data needs, define data requirements, and deliver high-quality data solutions.
- Develop and maintain documentation for data pipelines, data sources, and data integration processes.
- Maintain and enhance data infrastructure to support data ingestion, transformation, and loading, ensuring system reliability and scalability.
- Participate in database and data warehouse design to ensure optimal performance, maintainability, and scalability.
- Perform data profiling and data quality checks to identify and resolve issues within the data pipeline.
- Stay up to date with industry best practices and new technologies to improve current data engineering practices.
Job Requirements
- Bachelor’s degree in Computer Science, Engineering, Data Science, or a related field.
- 3-5 years of experience in data engineering or a similar role, with a proven track record of designing, building, and managing data pipelines and workflows.
- Strong programming skills in Python and SQL, plus experience with data pipeline and ETL tools (e.g., Apache Spark, Airflow, Talend, or similar).
- Proficiency in working with relational and non-relational databases (e.g., PostgreSQL, MySQL, MongoDB).
- Experience with cloud platforms (e.g., AWS, Azure, Google Cloud Platform) and familiarity with cloud-based data storage and processing services (e.g., S3, Redshift, BigQuery).
- Hands-on experience with data warehousing solutions and data modeling best practices.
- Familiarity with Big Data tools and frameworks, such as Hadoop, Hive, and Kafka, is a plus.
- Strong problem-solving skills and the ability to troubleshoot data-related issues efficiently.
- Excellent communication skills, with the ability to translate technical requirements for non-technical stakeholders.
- Detail-oriented, with a focus on data quality, testing, and validation.