Software Engineering Professional
BT
Job Description
What you’ll be doing
Design and Development
• Design and implement RESTful and gRPC service APIs for Agent Registry, Workspace Manager, Policy Manager, and Cost Manager following domain-driven design principles
• Build Cedar policy integration layer — translate business policy rules into Cedar ABAC/RBAC expressions, implement dry-run simulator endpoint, manage policy versioning and rollback
• Implement SCIM 2.0 protocol endpoints for Azure AD user and group provisioning with idempotent upsert semantics and reconciliation jobs
• Develop event-sourced audit log producers — every state change emits a Kafka event with SHA-256 hash chain continuation for tamper-evident logging
• Build agent lifecycle FSM enforcement in Agent Registry — Cedar-guarded state transitions (Draft → Validated → Staged → Active → Deprecated → Archived) with full transition history
• Implement cost attribution pipeline consumers — read Kafka cost events, compute micropence-precision attribution across the org/workspace/team/agent/invocation hierarchy, persist to ClickHouse
• Design and implement NATS JetStream consumers for real-time policy invalidation — Cedar cache flush must propagate platform-wide within 5 seconds of a policy change
• Write unit tests (≥80% coverage), integration tests with Testcontainers, and contract tests for all API surfaces
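For candidates unfamiliar with the tamper-evident logging mentioned above, the following is a minimal sketch of a SHA-256 hash chain, not the platform's actual implementation. The class name AuditHashChain, the AuditEntry record, the "|" separator, and the all-zero genesis hash are assumptions made for illustration. Kafka publishing is left out, so entries are held in memory only.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HexFormat;
import java.util.List;

// Illustrative only: each entry's hash covers the previous entry's hash,
// so altering any earlier entry invalidates every later one.
public class AuditHashChain {

    // Hypothetical entry shape; a real audit event would carry more fields.
    record AuditEntry(String payload, String prevHash, String hash) {}

    // Assumed starting value for the first entry in the chain.
    static final String GENESIS = "0".repeat(64);

    private final List<AuditEntry> entries = new ArrayList<>();

    static String sha256Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(digest);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    // Appends an entry whose hash continues the chain from the previous entry.
    AuditEntry append(String payload) {
        String prev = entries.isEmpty()
                ? GENESIS
                : entries.get(entries.size() - 1).hash();
        AuditEntry entry = new AuditEntry(payload, prev, sha256Hex(prev + "|" + payload));
        entries.add(entry);
        return entry;
    }

    // Recomputes every hash from the genesis value and checks the links.
    static boolean verify(List<AuditEntry> chain) {
        String prev = GENESIS;
        for (AuditEntry e : chain) {
            if (!e.prevHash().equals(prev)) return false;
            if (!e.hash().equals(sha256Hex(prev + "|" + e.payload()))) return false;
            prev = e.hash();
        }
        return true;
    }

    List<AuditEntry> entries() {
        return List.copyOf(entries);
    }

    public static void main(String[] args) {
        AuditHashChain chain = new AuditHashChain();
        chain.append("agent-1: Draft -> Validated");
        chain.append("agent-1: Validated -> Staged");
        System.out.println("chain valid: " + verify(chain.entries()));
    }
}
```

In this sketch, a verifier that holds only the latest hash can detect edits to any earlier entry, because changing a payload changes its hash and breaks the link stored in the next entry.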
Data and Persistence
• Design PostgreSQL schemas with row-level security for multi-tenant isolation — all entities scoped by org_id and workspace_id
• Write CockroachDB-compatible SQL for strongly consistent global metadata — agent manifests, Cedar policies, IAM records
• Implement Redis-backed distributed locking and caching patterns for budget enforcement counters (atomic INCR operations) and prompt cache management
• Write ClickHouse analytical queries for cost attribution rollup, RAGAS evaluation trending, and audit log search
Integration and Security
• Integrate Spring Security with JWT validation (Keycloak-issued tokens) and Cedar policy evaluation on every protected endpoint
• Implement Azure AD SCIM 2.0 webhook receiver with signature validation, idempotency, and retry handling
• Build Vault dynamic secret client — request tool credentials at runtime, handle lease renewal and rotation without pod restart
• Implement data residency enforcement: workspace region tag propagates to all downstream LLM routing and storage decisions via Cedar conditions
Essential Skills / Experience
Core (Java / Spring Boot)
• Java 17+ - records, sealed classes, virtual threads (Project Loom), structured concurrency
• Spring Boot 3.x - Spring Data JPA, Spring Security, Spring AMQP, Spring Batch for data pipeline jobs
• Spring Security - JWT token validation, method-level security, custom filter chains for Cedar integration
• JPA / Hibernate - multi-tenancy patterns, discriminator columns, schema-per-tenant, entity graphs, query optimisation
• Maven / Gradle - multi-module project structure, dependency management, reproducible builds
• Testcontainers - integration testing with real PostgreSQL, CockroachDB, Redis, Kafka instances
• Micrometer - custom metrics, histogram percentiles, Dynatrace OTLP export
Domain and Architecture
• Domain-Driven Design - bounded contexts, aggregates, repositories, domain events, anti-corruption layers
• Event sourcing - event store design, event replay, snapshot strategy, eventual consistency handling
• CQRS - command and query responsibility segregation, read model projections from Kafka event streams
• API design - RESTful resource modelling, OpenAPI 3.x spec-first development, backward compatibility, versioning
• gRPC - Protobuf schema design, server and client streaming, interceptors, error propagation
• Distributed transactions - Saga pattern with compensating actions, outbox pattern for reliable event publishing
Messaging and Streaming
• Apache Kafka - producer/consumer patterns, exactly-once semantics, topic partitioning strategy, consumer group manage