Software Engineer
CoinDCX
Job Description
We are hiring an SDE-1 Data Engineer (Individual Contributor) to execute high-quality data engineering work across ingestion pipelines, data quality, monitoring, curated datasets, and heavy third-party/vendor integrations.
This role is pure execution: writing code, fixing issues, adding validations, and ensuring reliable, timely data delivery.
You will work hands-on with Spark, Databricks, Python, Kafka, AWS (S3/EC2/Lambda) and internal CDC + ingestion frameworks.
What You’ll Do
1. Build & Enhance Data Pipelines (Internal + External Ingestion)
- Develop ingestion pipelines for internal data (CDC, service DBs).
- Build and maintain ingestion from external vendors and third parties, including custody providers, trading partners (TPE), banking partners, and external REST APIs.
- Handle pagination, rate limits, incremental loads, retries, and backoffs.
- Implement Spark-based transformations on Databricks.
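For candidates who want a concrete picture of the pagination, retry, and backoff handling mentioned above, here is a minimal Python sketch. It is an illustration only, not the team's ingestion framework: `TransientError`, `call_with_backoff`, `iter_records`, and the `fetch_page` callable (which is assumed to return a list of records plus the next cursor, or `None` on the last page) are all hypothetical names.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for retryable failures (e.g. HTTP 429 or 5xx)."""


def call_with_backoff(fn, max_attempts=5, base_delay=0.5, max_delay=30.0,
                      sleep=time.sleep):
    """Call fn(), retrying TransientError with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts:
                raise  # retries exhausted: surface the failure to the caller
            # Cap doubles each attempt: base, 2*base, 4*base, ... up to max_delay.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            # "Full jitter": sleep a random time in [0, cap] to spread out retries.
            sleep(random.uniform(0, cap))


def iter_records(fetch_page, start_cursor=None, **retry_kwargs):
    """Yield records from a cursor-paginated source, one page at a time.

    fetch_page(cursor) is an assumed callable returning (records, next_cursor);
    next_cursor is None when there are no more pages.
    """
    cursor = start_cursor
    while True:
        records, next_cursor = call_with_backoff(
            lambda: fetch_page(cursor), **retry_kwargs
        )
        yield from records
        if next_cursor is None:
            return
        cursor = next_cursor
```

Passing the last successfully processed cursor as `start_cursor` is one simple way to resume an incremental load; real vendor APIs may use offsets, timestamps, or page tokens instead.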
2. Implement Data Quality Checks
- Add schema validations, field-level checks, null/boundary checks.
- Maintain ≥99% data quality for assigned datasets.
- Quickly identify & fix data mismatches caused by source/vendor changes.
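As a rough illustration of the schema, null, and boundary checks listed above, the sketch below validates plain dictionary records and computes a pass rate that could be compared against a data quality target. The `FieldRule` class, the helper names, and the sample field names are assumptions made for the example; production checks would more likely run inside Spark or a dedicated data quality tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldRule:
    """Hypothetical per-field rule: expected types, nullability, and bounds."""
    name: str
    types: tuple
    nullable: bool = False
    min_value: Optional[float] = None
    max_value: Optional[float] = None


def validate_record(record, rules):
    """Return a list of human-readable violations (empty if the record is clean)."""
    errors = []
    for rule in rules:
        if rule.name not in record:
            errors.append(f"{rule.name}: missing field")
            continue
        value = record[rule.name]
        if value is None:
            if not rule.nullable:
                errors.append(f"{rule.name}: null not allowed")
            continue
        if not isinstance(value, rule.types):
            errors.append(f"{rule.name}: unexpected type {type(value).__name__}")
            continue
        if rule.min_value is not None and value < rule.min_value:
            errors.append(f"{rule.name}: {value} below minimum {rule.min_value}")
        if rule.max_value is not None and value > rule.max_value:
            errors.append(f"{rule.name}: {value} above maximum {rule.max_value}")
    return errors


def pass_rate(records, rules):
    """Fraction of records with no violations (1.0 for an empty batch)."""
    if not records:
        return 1.0
    clean = sum(1 for record in records if not validate_record(record, rules))
    return clean / len(records)


# Example rules for an imagined transactions feed.
RULES = [
    FieldRule("txn_id", (str,)),
    FieldRule("amount", (int, float), min_value=0),
    FieldRule("memo", (str,), nullable=True),
]
```

A batch whose `pass_rate` falls below the agreed threshold could then be quarantined or trigger an alert instead of being loaded downstream.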
3. Monitoring, Alerts & Observability
- Configure alerts for freshness, latency, data quality, and pipeline failures.
- Add logs and metrics to improve troubleshooting.
- Ensure MTTR < 4 hours for failures.
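A freshness alert, for instance, can be reduced to comparing the newest event timestamp in a dataset against the current time. The sketch below shows that idea with hypothetical helper names; the 15-minute default is only an illustrative threshold, and how the latest timestamp is obtained (a table query, a metrics store) is left out.

```python
from datetime import datetime, timedelta, timezone


def freshness_lag(latest_event_time, now=None):
    """Return how far the newest record lags behind 'now' (both timezone-aware)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - latest_event_time


def is_stale(latest_event_time, max_lag=timedelta(minutes=15), now=None):
    """True when the dataset's lag exceeds the allowed threshold."""
    return freshness_lag(latest_event_time, now) > max_lag
```

A scheduler could call `is_stale` every few minutes per dataset and page the owner when it returns `True`.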
4. Vendor & Third-Party Data Reliability
- Monitor vendor API health, schema changes, and data drift.
- Add defensive coding, retries, and fallback logic for unstable third-party feeds.
- Ensure <15 min data lag for assigned vendor connectors (where applicable).
- Keep documentation updated for all vendor integrations.
5. Curated Datasets Execution
- Build/modify curated datasets under guidance.
- Ensure 0 metric mismatches and correct business logic.
- Maintain documentation for dataset transformations.
6. Engineering Discipline & On-Time Delivery
- Maintain >80% test coverage and 0 PR hygiene rejections.
- Execute tasks with 95%+ on-time delivery.
- Provide crisp updates with minimal follow-ups.
You’ll Excel in This Role If You Have
Must-Have
- Python + SQL proficiency
- Basic to intermediate Spark/PySpark knowledge
- Experience with ETL / ingestion pipelines
- Understanding of APIs (GET/POST, tokens, pagination)
- Strong debugging skills
- Fast execution and high ownership
Good-to-Have
- Databricks experience
- Kafka basics
- Financial/transaction data exposure
- AWS fundamentals
- Experience integrating with third-party APIs
You’ll Know You’re Winning When
- 3–5 pipelines owned independently with ≥99.9% uptime
- External vendor integrations functioning reliably (<15 min freshness)
- ≥99% DQ on all assigned datasets
- Full alerting coverage on all owned pipelines
- MTTR consistently <4 hours
- 1–2 curated datasets delivered without metric errors
- 95%+ of tasks delivered on time, with <10% rollovers
- Ability to independently debug Spark jobs, API integrations, schema mismatches, and ingestion failures
Why This Role Matters & What’s In It For You
- Very hands-on; you're writing production code daily
- Exposure to real financial system data flows
- Opportunity to learn vendor integrations, ingestion frameworks, Spark optimization
- Direct impact on company-wide data reliability
- Fast skill growth with clear path to SDE-2
Hiring Process
Here’s what your journey with us looks like:
- Application Review – We assess for skills, alignment, and intent
- Recruiter Connect – A short conversation to understand you better
- Functional Round(s) – Deep dive into your approach, craft, and problem-solving
- Culture & Values Discussion – A conversation to understand our ways of working and how you thrive best