Data Architect
BNP Paribas
Job Description
- The data architect contributes to every phase of preparing a “Project Solution Document”:
- Reviewing business & project chapters and writing applicative, technical and infrastructure chapters
- Taking part from investment committee preparation onward, including project defense in local & central IT project committees
- Support business and IT teams in defining the data strategy of the subsidiary, or the specific strategy for a product, by conducting assessments, comparing solutions and presenting them to a broad audience (IT, business, top management, ...)
- Co-build new data components
- Support the setup and the deployment of a new data product within an existing entity squad (Event, Data Preparation, Big Data platform).
- Interact with local/central IT and production teams under the responsibility of the IT tech lead or product manager
- The main responsibilities cover the functional and technical architecture layers, involving data capabilities (ingestion, processing, storage, exchange, management, access)
- Define the vision for the data product and its interactions with the IS
- Technology watch and advising
- Design functional and technical architecture to align with business requirements and value definition
- Support DevOps and IT delivery teams in the implementation
Contributing Responsibilities:
- Team player
- Adhere to the standards and practices followed in the Project
- Foster a culture of continuous learning and improvement within the team.
- Collaborate with cross-functional teams to understand data requirements and deliver solutions.
Technical & Behavioral Competencies
Technical Skills
- 10+ years of hands-on data engineering experience, with a strong focus on scalable, high-performance data systems.
- Expertise in distributed data processing frameworks (e.g., Spark Compute, Apache Airflow) and modern storage formats (Parquet/Iceberg, ORC).
- Proficiency in Spark (Scala/Python/Java), Kafka, and Elasticsearch for real-time and batch data pipelines.
- Deep knowledge of relational (RDBMS) and NoSQL databases, with optimization experience for large-scale workloads.
- Hands-on experience with object storage (AWS S3, IBM COS) and data virtualization tools (e.g., Dremio).
- Strong Airflow orchestration experience, including DAG design, scheduling, and monitoring.
- Familiarity with CI/CD pipelines (GitLab, Jenkins, or similar) for data workflow automation.
- Experience integrating Parquet/ORC formats with cloud/on-prem storage systems.
- Nice to have: Working knowledge of HDFS, Hadoop and Hive.