Dose Management

Radiopharmaceutical Platform · Eli Lilly · Feb 2024 - Aug 2024
[Dashboard mockup: KUBED · Prod · Schedule · Manufacturing · Decay-adjusted]
Orders: Today · 47 doses; Tomorrow · 52; Backlog · 9
Compliance: RAM license; Audit log; Chain of custody
Upcoming deliveries: F-18 FDG · MCI hospital · T-3h; Ga-68 PSMA · Memorial · T-5h; Lu-177 · Mt Sinai · T-9h; Tc-99m · Children's · T-2h; I-131 · Cleveland · T-7h

Project Overview

Radiopharmaceutical drugs are time bombs. Lu-177 has a half-life of 6.6 days; F-18 FDG, only 110 minutes. By the time a dose travels from factory to patient, half of it may already be gone. The dose-management system orchestrates manufacturing, decay calculations, and delivery scheduling so that the right activity arrives at the right patient at the right time, every time, with an FDA paper trail.

Spring Boot Microservices, React + TypeScript, KUBED (Kubernetes), Istio Service Mesh, Temporal Workflows, Crossplane IaC, Grafana · Loki · CloudWatch, ArgoCD · GitHub Actions

Problem Statement

Three constraints fought each other every day:

  1. Decay never sleeps. A dose that's perfect at 8 AM is sub-therapeutic at 10. Scheduling has to account for the half-life of every isotope and the transit time to each hospital.
  2. Regulatory load. Every dose touches RAM licenses, audit trails, signed chains of custody. A missing log entry can pause an entire production line.
  3. 99.9% is the floor. Patients have appointment slots booked weeks in advance. Three nines on the manufacturing scheduler isn't an SLO target; it's the cost of entry.
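To make the first constraint concrete, here is a minimal sketch of the decay arithmetic, assuming simple exponential decay and the two half-lives quoted in the overview. The function names, the MBq unit, and the table are illustrative assumptions, not the production code.

```python
import math

# Approximate physical half-lives in hours, taken from the figures quoted above.
# Only two isotopes are listed; this table is illustrative, not exhaustive.
HALF_LIFE_HOURS = {
    "F-18": 110 / 60,
    "Lu-177": 6.6 * 24,
}


def remaining_fraction(isotope: str, elapsed_hours: float) -> float:
    """Fraction of the original activity left after elapsed_hours."""
    return 0.5 ** (elapsed_hours / HALF_LIFE_HOURS[isotope])


def activity_to_dispense(isotope: str, prescribed_mbq: float, transit_hours: float) -> float:
    """Activity needed at dispatch so that prescribed_mbq remains on arrival."""
    return prescribed_mbq / remaining_fraction(isotope, transit_hours)


if __name__ == "__main__":
    # An F-18 dose calibrated at 8 AM and checked again at 10 AM.
    print(f"F-18 after 2 h: {remaining_fraction('F-18', 2.0):.1%} remains")
    # Lu-177 over the same two hours barely changes.
    print(f"Lu-177 after 2 h: {remaining_fraction('Lu-177', 2.0):.1%} remains")
```

Under these assumptions an F-18 dose keeps a little under half its activity after two hours, while Lu-177 loses less than one percent over the same interval, which is why each isotope needs its own schedule.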
99.9% system availability achieved, through observability, failover, and six months of war-gaming every failure mode we could think of.
7 isotopes scheduled simultaneously
0 audit findings in 6-month window

My Role

Senior backend engineer on a team of seven. Owned the dose-calculation algorithm, the manufacturing-scheduling service, and most of the observability stack. Co-owned the GitOps deployment pipeline.

Dose Calculation, Manufacturing Scheduler, Audit Logging, Observability, CI/CD, Compliance Review

The approach.

// STEP 01

Make decay a first-class type.

Every dose object carries its isotope, its activity at calibration time, and its calibration timestamp. The scheduler asks dose.activity_at(delivery_time) and never handles raw numbers. Bug class eliminated.
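A minimal sketch of what such a type could look like. Only the activity_at name comes from the text above; the field names, the MBq unit, and the half-life table are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Illustrative half-life table (values from the project overview).
HALF_LIFE = {
    "F-18": timedelta(minutes=110),
    "Lu-177": timedelta(days=6.6),
}


@dataclass(frozen=True)
class Dose:
    """An immutable dose: isotope, activity at calibration, calibration time."""

    isotope: str
    calibration_activity_mbq: float  # hypothetical unit choice
    calibrated_at: datetime          # expected to be timezone-aware

    def activity_at(self, when: datetime) -> float:
        """Decay-adjusted activity at an arbitrary point in time."""
        # Dividing two timedeltas yields a float: the number of elapsed half-lives.
        half_lives = (when - self.calibrated_at) / HALF_LIFE[self.isotope]
        return self.calibration_activity_mbq * 0.5 ** half_lives


if __name__ == "__main__":
    calibrated = datetime(2024, 3, 1, 8, 0, tzinfo=timezone.utc)
    dose = Dose("F-18", 400.0, calibrated)
    delivery_time = calibrated + timedelta(minutes=110)
    print(dose.activity_at(delivery_time))  # one half-life after calibration
```

Because the object is frozen and the only way to read an activity is through a timestamp, callers cannot accidentally compare a calibration-time number against a delivery-time threshold.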

// STEP 02

Temporal for long-running workflows.

A dose's lifecycle spans 4-48 hours: order → manufacture → QC → dispatch → delivery → administration. Temporal lets us model that as a single deterministic workflow that survives pod restarts without lying about state.
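The sketch below is plain Python, not the Temporal SDK. It only illustrates the underlying idea: if the lifecycle is an ordered state machine and every transition is recorded, the current state can be rebuilt deterministically from the event history after a restart. The stage names follow the lifecycle listed above; the Stage enum and replay function are hypothetical.

```python
from enum import Enum


class Stage(Enum):
    """Lifecycle stages in the order given in the text."""

    ORDERED = 0
    MANUFACTURED = 1
    QC_PASSED = 2
    DISPATCHED = 3
    DELIVERED = 4
    ADMINISTERED = 5


def replay(history: list[Stage]) -> Stage:
    """Rebuild the current stage from a recorded history.

    Raises ValueError if the history skips or repeats a stage, so a
    corrupted record fails loudly instead of yielding a wrong state.
    """
    if not history or history[0] is not Stage.ORDERED:
        raise ValueError("history must start with ORDERED")
    current = history[0]
    for event in history[1:]:
        if event.value != current.value + 1:
            raise ValueError(f"illegal transition {current.name} -> {event.name}")
        current = event
    return current


if __name__ == "__main__":
    recorded = [Stage.ORDERED, Stage.MANUFACTURED, Stage.QC_PASSED]
    # After a simulated restart, the same history yields the same stage.
    print(replay(recorded).name)
```

A durable workflow engine adds persistence, timers, and retries on top of this idea; the point here is only that the same history always replays to the same state.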

// STEP 03

Istio for the audit trail.

Every service-to-service call passes through Istio with mTLS + Entra ID identity. The mesh-level logs ARE the audit trail — regulators see the same data the on-call engineer sees, with no app-level instrumentation drift.

The outcomes.

99.9%
Uptime, six months

Single 12-minute incident — recovered via Argo rollback in 6 min.

100%
Audit pass

Zero findings on the first external compliance audit of the new system.

7
Isotopes live

F-18, Ga-68, Lu-177, Tc-99m, I-131, In-111, Y-90 — all on day 1.

"The thing I value most about Sai's work is how loudly his services fail. We've never been surprised at 3 AM."
— Engineering lead, Lilly digital health