Data Engineer

Celestica

Chennai, India · 10 Years Exp · Posted 24d ago

Job Description

Detailed Description

  • Requirement Management: Facilitate deep‑dive sessions with business and technology leaders to gather and manage technical requirements, ensuring data initiatives align with strategic organizational goals.
  • Data Ingestion & Integration: Design and implement connectors and ingestion jobs that pull data from multiple tools (e.g., Azure DevOps, Jira, test systems, logs) into the central data lake.
  • Data Curation, Metadata Management & Governance: Lead efforts to clean and curate complex datasets. Develop and maintain data catalogues, metadata, and data mapping documentation to ensure transparency, lineage, and trust across the ecosystem.
  • End‑to‑End Pipeline & ETL Development: Architect, implement, and optimize scalable ETL/ELT processes, moving data from diverse sources into enterprise‑grade data lake and warehouse environments.
  • Automation & Python Scripting: Leverage Python to automate data validation, monitoring, and repetitive workflows, improving the efficiency and reliability of the data ecosystem.
  • Advanced Analytics & Visualization: Design and implement dashboards and reports using platforms like Power BI or Tableau to provide actionable insights to stakeholders at all levels.
  • Data Integrity & Security: Implement rigorous data quality checks, monitoring, and security controls to maintain accuracy, compliance, and “gold‑standard” data health.
  • Cross‑Functional Collaboration: Partner with data scientists and other teams to understand their access patterns, ensure datasets are documented and queryable, and streamline export of curated tables for downstream ML and analytics.

Knowledge/Skills/Competencies

Mandatory Skills and Qualifications

  • Experience: 10–15 years of hands‑on experience in data engineering or closely related roles.
  • Data Engineering & Solutions: Proven experience building high‑performance data pipelines and implementing end‑to‑end data solutions in a production environment.
  • Data Lake & Data Catalogue Experience: Hands‑on experience designing or operating a centralized data lake and data catalogue, including dataset onboarding, partitioning, and access patterns for analytics and ML.
  • Python Proficiency: Strong coding skills for data manipulation (e.g., Pandas, NumPy), automation scripts, and API integrations.
  • SQL Mastery: Advanced skills in SQL for complex querying, data modeling, and performance tuning.
  • Data Curation & Metadata Management: Hands‑on experience in data cleansing, mapping, metadata management, and the creation/maintenance of data catalogues.
  • Visualization Mastery: Professional‑level experience with Power BI, Tableau, or similar BI platforms to create compelling, user‑centric data stories.
  • Requirement Management: Demonstrated ability to lead requirement‑gathering sessions and manage the lifecycle of data initiatives with both technical and non‑technical stakeholders.

Preferred Qualifications

  • Cloud & DevOps: Familiarity with major cloud platforms (Azure, GCP) and CI/CD practices for data applications using tools like Git.
  • AI Support: Experience preparing and managing data specifically for machine learning models and AI‑driven analytics (e.g., feature engineering, labeling, dataset versioning).

Physical Demands

  • Duties of this position are performed in a normal office environment.
  • Duties may require extended periods of sitting and sustained visual concentration on a computer monitor or on numbers and other detailed data. 
  • Repetitive manual movements (e.g., data entry, using a computer mouse, using a calculator) are frequently required.
  • Occasional travel may be required.
