Data Engineer III
Job Description
Data Pipeline Development
- Design and build ETL/ELT pipelines to ingest, transform, and load data.
- Use AWS services such as Glue, Lambda, and Step Functions.
- Schedule and monitor workflows using Step Functions or AWS Glue Workflows.
- Create CI/CD pipelines using CloudFormation, Terraform, or AWS CDK.
Data Storage and Management
- Manage structured and unstructured data using services such as:
  - Amazon S3
  - AWS Glue Data Catalog
  - Amazon DynamoDB
  - Amazon RDS or Aurora
  - Amazon Redshift
- Ensure data partitioning, indexing, and lifecycle management.
Data Integration
- Integrate data from multiple sources (APIs, on-prem databases, third-party tools).
- Use AWS Glue, Kinesis, or Kafka for batch and real-time data ingestion.
Performance Optimization
- Tune ETL jobs and query performance on Athena or Databricks.
- Optimize storage formats (e.g., Parquet, ORC) for cost and speed.
Security & Compliance
- Implement data encryption, IAM roles, and access policies.
- Ensure compliance with data governance and privacy policies (e.g., GDPR).
- Use AWS Lake Formation and IAM for access control and auditing.
Monitoring & Maintenance
- Monitor pipeline health and performance using CloudWatch, CloudTrail, and custom dashboards.
- Set up alerts and logging for failures and anomalies.
Collaboration
- Work with data scientists, analysts, and BI teams to understand data needs.
- Participate in Agile processes, sprint planning, and retrospectives.