Senior Data Engineer
thinkproject
Job Description
Data Integration & Pipeline Development
- Design, implement, and optimise scalable data integration workflows supporting inference and data synchronisation across GCP services (Cloud Run, Pub/Sub, Cloud Storage, Cloud Spanner, Vertex AI)
- Build and maintain event-driven pipelines and ETL/ELT workflows that deliver clean, reliable data to the AI Search Platform
- Automate deployment, testing, and pipeline orchestration using Cloud Run, Pub/Sub triggers, and Terraform
API Development for AI Integration
- Build and maintain APIs that expose data integration and AI inference capabilities to internal and external systems
- Ensure secure, reliable, and performant access to the AI Search Platform, with correct authentication, rate limiting, and error handling by default
Permissions & Compliance Layer
- Integrate and enforce API and IAM policies for compliant access control across all AI Search Platform components
- Own and evolve the permissions API layer to meet growing scalability and security requirements
Data Quality & Reliability
- Ensure data integrity through monitoring, validation, and alerting across all integrated systems and services
- Continuously monitor workflows for latency, reliability, and cost efficiency, and implement improvements without waiting to be asked
Documentation & Standards
- Maintain architecture documentation and runbooks
- Contribute to best practices for data integration, reproducibility, scalability, and security