Engineer, Software Engineering
spglobal
Job Description
Key Responsibilities
- Design and develop robust data extraction pipelines for PDFs, HTML, and semi-structured/unstructured data sources
- Implement and optimize ML/NLP models for information extraction and document understanding
- Work with Large Language Models (LLMs) on tasks such as classification, summarization, and structured extraction
- Build scalable backend services using Python
- Develop and maintain APIs and microservices
- Collaborate with frontend engineers to integrate features (React-based UIs)
- Work with databases and search systems for indexing and retrieval
- Ensure high performance, reliability, and scalability of systems
- Debug and improve existing pipelines and workflows
Required Skills
- Strong programming skills in Python
- Experience working with data processing pipelines
- Hands-on experience with NLP / ML techniques
- Familiarity with LLMs and prompt engineering / fine-tuning concepts
- Experience parsing or extracting data from PDFs / HTML / web sources
- Knowledge of REST APIs and backend development
- Experience with databases (SQL/NoSQL)
- Good understanding of software engineering fundamentals (data structures, system design, testing)
Nice to Have
- Experience with document AI / OCR tools
- Familiarity with frameworks like FastAPI
- Experience with React or frontend development
- Knowledge of distributed systems / streaming pipelines
- Exposure to vector databases / embeddings / semantic search
- Experience deploying ML models to production
- Understanding of cloud platforms (AWS / GCP / Azure)
What We Offer
- Opportunity to work on cutting-edge problems involving LLMs and document intelligence
- High ownership and impact on core product systems
- Collaborative and fast-paced engineering culture
- Flexible work environment