Senior Data Engineer

weareroku

Bengaluru | 10 Years Exp | Posted 7h ago

Job Description

Big Data Engineering: 

  • Design, develop, and maintain data pipelines and ETL workflows using Apache Spark and Apache Airflow.
  • Optimise data storage, retrieval, and processing systems to ensure reliability, scalability, and performance. 
  • Develop and fine-tune complex queries and data processing jobs for large-scale datasets. 
  • Monitor, troubleshoot, and improve data systems to minimise downtime and maximise efficiency.

Software Development: 

  • Write clean, maintainable, and efficient code, ensuring adherence to best practices through code reviews. 

Collaboration & Mentorship: 

  • Partner with data scientists, software engineers, and other teams to deliver integrated, high-quality solutions. 
  • Provide technical guidance and mentorship to junior engineers, promoting best practices in data engineering. 

AI-augmented engineering & intelligent data interfaces: 

  • Apply modern AI-assisted development practices responsibly (for example, AI-assisted code review, test generation, and documentation) while maintaining production quality, security, and compliance standards.
  • Design and evolve semantic search and retrieval over internal metadata (datasets, lineage, dashboards, runbooks): embeddings, indexing, and guardrailed query interfaces where they improve engineer and analyst productivity. 
  • Stay current on responsible AI expectations relevant to advertising data: privacy, PII handling, access control, auditability, and human-in-the-loop review for high-risk automation. 

We’re excited if you have:

  • Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
  • 10+ years of experience in software and/or data engineering, with expertise in big data technologies such as Apache Spark, Apache Airflow, and Trino.
  • Strong understanding of SOLID principles and distributed systems architecture. 
  • Proven experience in distributed data processing, data warehousing, and real-time data pipelines. 
  • Advanced SQL skills, with expertise in query optimisation for large datasets. 
  • Exceptional problem-solving abilities and the capacity to work independently or collaboratively. 
  • Excellent verbal and written communication skills. 
  • Experience with cloud platforms such as AWS, GCP, or Azure, and containerisation tools like Docker and Kubernetes. (preferred) 
  • Familiarity with additional big data technologies, including Hadoop, Kafka, and Trino. (preferred) 
  • Strong programming skills in Python, Java, or Scala. (preferred) 
  • Knowledge of CI/CD pipelines, DevOps practices, and infrastructure-as-code tools (e.g., Terraform). (preferred) 
  • Expertise in data modelling, schema design, and data visualisation tools. (preferred) 
