Senior Software Engineer

Caterpillar

Bangalore | 8 Years Exp | Posted 2h ago

Job Description

  • Design, develop, and maintain scalable data pipelines on AWS using services such as S3, Glue, Lambda, Redshift, and EMR.
  • Build and optimize data warehousing solutions using Snowflake, including performance tuning and data modelling.
  • Write efficient and reusable code in Python and SQL for data transformation and processing.
  • Collaborate with cross-functional teams, including data scientists, analysts, and business stakeholders, to understand data requirements.
  • Develop and optimize solutions using graph databases (e.g., Neo4j, Amazon Neptune), including query design and performance tuning.
  • Design, build, and operate vector database solutions (e.g., Milvus, Amazon OpenSearch) to support semantic search, recommendations, RAG, and AI-driven use cases.
  • Integrate vector databases with LLM-based applications and AI workflows.
  • Monitor, troubleshoot, and improve pipeline performance and reliability.
  • Ensure data quality, integrity, and security across all stages of the pipeline.
  • Participate in code reviews, architecture discussions, and continuous improvement initiatives.

Required Qualifications

  • 8+ years of experience in data engineering or related roles.
  • Strong hands-on experience with AWS cloud services, including data and AI workloads.
  • Deep understanding of Snowflake architecture, performance tuning, and best practices.
  • Advanced proficiency in Python and SQL for data pipelines, transformations, and services.
  • Strong understanding of graph and vector data modelling concepts and their practical applications.
  • Hands-on experience with graph databases (e.g., Neo4j, Neptune) and vector databases (e.g., Milvus, Amazon OpenSearch).
  • Experience with version control systems (e.g., Git) and Git workflows.
  • Experience working with Azure DevOps (AzDO) boards for backlog management in Agile environments.
  • Excellent analytical and problem-solving skills.
  • Strong communication and collaboration abilities.
  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.

Nice-to-Have Skills

  • Knowledge of the NVIDIA ecosystem and its applications in data and AI.

Preferred Qualifications

  • Experience with orchestration tools such as AWS Step Functions.
  • Familiarity with data governance and compliance practices.
  • Exposure to real-time data processing frameworks (e.g., Kafka, Spark Streaming).

More Detail on Knowledge Base

  • Experience designing and deploying data ingestion pipelines for unstructured sources such as PDFs, Word documents, and HTML files, including text extraction, chunking strategies, and embedding generation at scale.
  • Hands-on expertise with vector databases, specifically Milvus, covering schema design, indexing, and optimizing write performance for large-scale embedding ingestion pipelines.
  • Proficiency in building knowledge graph ingestion pipelines using graph databases, including entity extraction, relationship modelling, and populating nodes and attributes.
  • Strong pipeline engineering skills in Python and in frameworks for orchestrating multi-stage document processing workflows, with experience deploying and monitoring these pipelines in production environments.
  • Bonus: Exposure to RAPIDS libraries (cuDF, cuML, cuGraph) or CUDA-based tooling for GPU-accelerated data processing, enabling faster transformation and optimization during large-scale ingestion workflows.
