AI/ML Data Engineer | Unstructured Data, PySpark, Python, Vector Search, RAG Architectures, Cloud (GCP/AWS)
Job Description
- Build and maintain scalable, robust data pipelines for unstructured content, ensuring high data quality and performance efficiency
- Develop algorithms for document classification, cleansing, and enrichment to feed AI/ML systems
- Integrate data workflows with LLM pipelines supporting RAG architectures for semantic search and Question-Answering (QA) systems
- Engineer and optimize vector embeddings, document chunking, and metadata tagging for AI applications
- Collaborate closely with AI architects, data scientists, and platform teams to design end-to-end AI solutions
- Implement automation, monitoring, and security best practices to ensure system reliability and compliance
- Support project lifecycle activities, including proof-of-concept, testing, deployment, and ongoing monitoring
- Share domain expertise, conduct knowledge sharing, and mentor team members