Data Engineering Lead
Job Description
- Design and implement the AWS-based enterprise data lake architecture.
- Build scalable frameworks to handle structured, semi-structured, and unstructured datasets.
- Define standards for data ingestion, transformation, storage, and access.
- Ensure seamless integration with the Databricks analytics platform.
Data Ingestion & Integration :
Design and develop real-time and batch data ingestion pipelines for sources such as :
Internal systems :
- Trading & order management systems.
- Portfolio management platforms.
- Client onboarding / KYC systems.
- CRM platforms.
- ERP / accounting systems.
External sources :
- Market data vendors.
- Research and news feeds.
- Documents and reports.
- Audio or surveillance data.
Technologies :
- Amazon AppFlow, AWS Lambda, AWS Glue, Amazon S3, Amazon Athena.
Real-Time Data Processing :
- Develop event-driven data pipelines to support near-real-time data ingestion.
- Enable real-time use cases such as : trading analytics, operational monitoring, compliance and surveillance analytics.
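As an illustration of the event-driven ingestion described above, here is a minimal Python sketch of an AWS Lambda function triggered by S3 object-created notifications. It assumes the standard S3 event notification payload (a `Records` list with `s3.bucket.name` and `s3.object.key`). The function names `extract_s3_objects` and `handler` are hypothetical and chosen for illustration only. It is a sketch under those assumptions, not a prescribed design for this role.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return (bucket, key) pairs found in an S3 event notification payload."""
    objects = []
    for record in event.get("Records", []):
        s3_info = record.get("s3")
        if s3_info is None:
            continue  # not an S3 record, so skip it
        bucket = s3_info["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications ("+" means space).
        key = unquote_plus(s3_info["object"]["key"])
        objects.append((bucket, key))
    return objects


def handler(event, context):
    """Hypothetical Lambda entry point: report which objects triggered the call."""
    objects = extract_s3_objects(event)
    # A real pipeline would hand each object to a downstream step here
    # (for example a Glue job or a queue); that part is left out on purpose.
    return {"received": len(objects), "objects": objects}
```

Keeping the parsing in a plain function, separate from the Lambda entry point, lets it be unit-tested without any AWS dependency.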
Security & Governance :
Ensure platform security and compliance through :
- AWS Key Management Service (KMS) for encryption.
- AWS Secrets Manager for credential management.
- AWS Security Hub for security monitoring.
- AWS Config for configuration governance.
- AWS CloudTrail for audit trails.
Monitoring & Observability :
- Implement monitoring frameworks using : Amazon CloudWatch, Grafana dashboards.
- Monitor : pipeline performance, infrastructure health, data freshness, ingestion failures.
DevOps & Platform Automation :
- Implement CI/CD pipelines using GitLab.
- Automate deployment and testing of data pipelines.
- Establish standards for version control, code quality, and automated deployments.
Data Quality & Metadata :
- Implement frameworks for data validation, reconciliation, and monitoring.
- Manage metadata and data lineage using AWS Glue Data Catalog.
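To make the reconciliation responsibility above concrete, here is a small Python sketch that compares a source extract with its loaded target by row count and by a control total. The function name `reconcile` and the `amount` field are assumptions made for illustration. A production framework would more likely run equivalent checks in PySpark or SQL against the actual tables.

```python
from decimal import Decimal


def reconcile(source_rows, target_rows, amount_field="amount"):
    """Compare two row sets by count and by the sum of one numeric field."""

    def control_total(rows):
        # Decimal avoids float rounding noise when summing monetary values.
        return sum((Decimal(str(r[amount_field])) for r in rows), Decimal("0"))

    source_total = control_total(source_rows)
    target_total = control_total(target_rows)
    return {
        "row_count_match": len(source_rows) == len(target_rows),
        "row_count_diff": len(target_rows) - len(source_rows),
        "total_match": source_total == target_total,
        "total_diff": target_total - source_total,
    }
```

Matching row counts and matching totals are independent signals, so a check like this reports both rather than a single pass/fail flag.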
Integration with Databricks :
- Deliver curated and optimized datasets for analytics on Databricks.
- Collaborate with analytics teams to enable BI, advanced analytics, and ML workloads.
Hands-on Technical Leadership :
- Actively participate in pipeline development, architecture design, and technical problem solving.
- Provide technical guidance and code reviews to the data engineering team.
- Drive adoption of engineering best practices and reusable data frameworks.
Agile Delivery & Collaboration :
- Operate in an Agile / iterative development environment.
- Work closely with analytics teams, business stakeholders, and platform engineers.
- Deliver incremental data products and platform capabilities with rapid turnaround.
Key Technology Stack :
Cloud Platform : AWS.
Data Platform : Amazon S3, AWS Glue, Amazon Athena.
Data Ingestion : Amazon AppFlow, AWS Lambda.
Security & Governance : AWS KMS, AWS Secrets Manager, AWS Security Hub, AWS Config, AWS CloudTrail.
Monitoring : Amazon CloudWatch, Grafana.
DevOps : GitLab.
Analytics Platform : Databricks.
Programming : Python, PySpark, SQL.