Strips 17 categories of patient identifiers locally, before they reach any AI tool — including Medicare, IHI, DVA, and facility names that no US-built tool catches. Every session produces a cryptographic audit record.
Clinicians paste PHI into AI tools every day — not from negligence, from invisibility. CleanRoom intercepts at the paste event, strips identifiers on-device, and emits a tamper-evident audit record. The clinician's workflow is unchanged. The compliance posture is not.
Each entry commits to the hash of the entry before it. Alter or delete a single line and the chain no longer verifies — tamper-evident by construction, not by policy promise.
{
  "session_id": "sess_5f3a9c2e",
  "timestamp": "2026-05-16T09:42:17+10:00",
  "version": "1.2.5",
  "model_destination": "openai.com/v1/chat/completions",
  "entities_detected": {
    "PERSON": 3, "MEDICARE": 1, "DATE_OF_BIRTH": 1,
    "ADDRESS": 1, "FACILITY_NAME": 1, "CLINICAL_DATE": 2
  },
  "sensitivity_classes_flagged": ["suicidality"],
  "phi_bytes_blocked": 247,
  "tokenised_payload_hash": "8f4a...c2d1",
  "chain_valid": true,
  "export_sha256": "b3e9...9a47"
}
No PHI in the audit record itself. The record proves what was blocked, never re-exposes what it was.
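For readers who want to see the mechanics, the chaining can be sketched in a few lines of Python. This is an illustrative sketch, not CleanRoom's implementation: the `prev_hash` field, the all-zero genesis value, the canonical JSON encoding and the function names are all assumptions. It also assumes the hash of the newest entry is retained somewhere the log writer cannot quietly change, so that edits to the final entry are caught as well.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed predecessor value for the very first entry


def entry_hash(entry: dict) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, compact separators)."""
    canonical = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_entry(chain: list, record: dict) -> dict:
    """Append a record that commits to the hash of the entry before it."""
    prev = entry_hash(chain[-1]) if chain else GENESIS
    entry = {**record, "prev_hash": prev}  # "prev_hash" is a hypothetical field name
    chain.append(entry)
    return entry


def head_hash(chain: list) -> str:
    """Hash of the newest entry; assumed to be stored outside the log itself."""
    return entry_hash(chain[-1]) if chain else GENESIS


def verify_chain(chain: list, expected_head: str) -> bool:
    """Walk the chain, recomputing each hash and checking every back-link."""
    running = GENESIS
    for entry in chain:
        if entry.get("prev_hash") != running:
            return False  # a link is broken: something was altered or removed
        running = entry_hash(entry)
    return running == expected_head
```

Under these assumptions, changing any field in any entry, or dropping an entry, makes `verify_chain` return `False` against the retained head hash.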
Identifier coverage is reported as recall and code-verified false positives on a held-out corpus. Quasi-identifier expansion is research-grade work, reported on its own track and never blended into a single headline figure.
Earlier v1.0.x baseline: 100% recall at 99.54% F1 on a 650-instance external holdout. Historical; not re-measured against the current build.
41 categorical false-positive documents are disclosed in scorecard.json. Calibrated precision is deferred to the research track. Layer 3 ONNX NER is omitted from the node harness, so live extension counts are higher.
All numeric claims are traceable to docs/v1.2.3/corpus_run/scorecard.json (the current detection baseline; v1.2.5 detection is identical by construction). Reproduction is available on request.
Australia's privacy regime is principles-based. APP 11's "reasonable steps" test is what your compliance team has to defend. CleanRoom is the technical artefact that defends it.
Every entity in the taxonomy below was added because it appeared in a real clinical note — not because it was on a HIPAA-derived checklist. Writing comprehensive geriatric assessments at 9 PM teaches you what re-identifies a patient: the facility name, the occupation that narrows a cohort to one, the third-party role that breaches a son or daughter's privacy as much as the patient's own. The codebase is not the moat. The clinical writing that shaped it is.
The long tail of Australian healthcare — GP clinics, RACFs, allied health, specialist practices — will never deploy private LLMs. They need the layer that makes commodity AI safe to use under Australian law. The taxonomy below is what that long tail actually has to deal with.
Plus 8 sensitivity classes — suicidality, self-harm, abuse-disclosure, custody, forensic-status, threat-to-third-party, substance-use, involuntary-treatment — detected, flagged in the audit record, and surfaced as a non-blocking warning to the clinician.
This cannot be built anywhere else. By anyone else.
On the device. CleanRoom's processing layer runs locally — in the browser, in the application, or in a sidecar process inside your environment. Identifying patient information is not transmitted to any external service for de-identification. That is the architectural guarantee.
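To make "on the device" concrete, here is a minimal Python sketch of local stripping for one identifier type. It is not CleanRoom's detector. It relies on one public fact, that an Australian Medicare card number starts with a digit from 2 to 6 and carries a check digit in ninth position (the first eight digits weighted 1, 3, 7, 9, 1, 3, 7, 9, summed, modulo 10). Everything else is an assumption made for illustration: the regular expression, the `[MEDICARE_n]` placeholder format, the function names, and the idea that the payload hash is a SHA-256 of the tokenised text. The summary keys simply mirror the sample audit record.

```python
import hashlib
import re

# Assumed surface form: 4 + 5 + 1 digits, with optional single spaces between groups.
MEDICARE_RE = re.compile(r"\b\d{4} ?\d{5} ?\d\b")
WEIGHTS = (1, 3, 7, 9, 1, 3, 7, 9)


def is_medicare_number(candidate: str) -> bool:
    """Check length, leading digit and the ninth-position check digit."""
    digits = [int(c) for c in candidate if c.isdigit()]
    if len(digits) != 10 or not 2 <= digits[0] <= 6:
        return False
    checksum = sum(d * w for d, w in zip(digits[:8], WEIGHTS)) % 10
    return checksum == digits[8]


def strip_medicare(text: str) -> tuple[str, dict]:
    """Replace checksum-valid Medicare numbers with placeholders, entirely in-process."""
    counts = {"MEDICARE": 0}
    blocked = 0

    def replace(match: re.Match) -> str:
        nonlocal blocked
        raw = match.group(0)
        if not is_medicare_number(raw):
            return raw  # looks similar but fails validation: leave untouched
        counts["MEDICARE"] += 1
        blocked += len(raw.encode("utf-8"))
        return f"[MEDICARE_{counts['MEDICARE']}]"  # hypothetical placeholder format

    cleaned = MEDICARE_RE.sub(replace, text)
    summary = {
        "entities_detected": counts,
        "phi_bytes_blocked": blocked,
        # Hash of the tokenised text only; the raw identifier is never recorded.
        "tokenised_payload_hash": hashlib.sha256(cleaned.encode("utf-8")).hexdigest(),
    }
    return cleaned, summary
```

Nothing in the sketch makes a network call: the text goes in, the tokenised text and a count-only summary come out, and only those would ever leave the process.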
Standard application logs are mutable, contextual, and not designed as evidentiary artefacts. The Sentinel Record is a per-session, SHA-256 hash-chained audit trail: entity counts, timestamps, model destinations, and session integrity in a form that can be verified after the fact. It is built to satisfy the "reasonable steps" evidentiary standard, not to satisfy a developer debugging a bug.
Recall is the metric we optimise hardest, because false negatives are the compliance risk. On the current v1.2.3 corpus run (a 50-document AU mental-health corpus, the current detection baseline), patient-alias recall is 100% (236/236), and 24 historical false positives have been cross-validated to 0 in the live extension. An earlier v1.0.x baseline reported 100% recall at 99.54% F1 on a separate 650-instance external holdout; that figure is historical and has not been re-measured. Residual real-world gaps, primarily non-Anglo names in unstructured prose, are treated as a known watch-area and prioritised every release. We will not market a number we cannot reproduce on request.
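For anyone reproducing these figures, the relationships between the metrics are plain arithmetic. The sketch below assumes the quoted F1 is the standard harmonic mean of precision and recall; the function names are illustrative, and nothing here is taken from the scorecard tooling.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Share of real identifiers that were caught."""
    return true_positives / (true_positives + false_negatives)


def f1_score(precision: float, recall_value: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall_value / (precision + recall_value)


def precision_from_f1(f1: float, recall_value: float) -> float:
    """Invert the F1 formula: P = F1 * R / (2R - F1)."""
    return f1 * recall_value / (2 * recall_value - f1)


# 236 of 236 patient aliases caught means zero false negatives.
current_recall = recall(true_positives=236, false_negatives=0)
```

The inversion is only as meaningful as the inputs: it recovers the precision implied by a reported recall and F1 pair, and says nothing about calibrated precision on a different corpus.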
Quasi-identifiers are different from direct identifiers, and we don't claim to neutralise them. Combinations of attributes — age, language, postcode, condition, facility — can identify someone even when names are stripped. A 67-year-old Vietnamese-speaking woman with severe COPD in a 60-bed regional Victorian RACF is identifiable from those attributes alone.
Direct identifiers are stripped on-device. For quasi-identifiers, v1.2.5 surfaces a categorical QI exposure band — low, moderate, elevated, high — over the QI types present, escalating on small-cohort signals (named facility, over-89 age, forensic status). It does not silently rewrite text; it flags residual exposure to the clinician.
This is a heuristic, not a calibrated re-identification probability. A calibrated probability — and any automated generalisation acting on it — is research-grade work, pursued separately. The shift v1.2.5 delivers: "all PHI in the cloud" → "direct identifiers stripped, residual exposure disclosed in the audit record."
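A categorical band of this kind can be sketched as follows. The band names and the three small-cohort signals come from the description above; the thresholds, the one-step-per-signal escalation rule and the QI type labels are hypothetical, chosen only to show the shape of a heuristic that escalates rather than computes a probability.

```python
BANDS = ("low", "moderate", "elevated", "high")

# Signals named in the text; the string labels themselves are illustrative.
SMALL_COHORT_SIGNALS = {"named_facility", "age_over_89", "forensic_status"}


def qi_exposure_band(qi_types: set[str]) -> str:
    """Map the set of quasi-identifier types present in a note to a band."""
    count = len(qi_types)
    # Base level from how many distinct QI types appear (assumed thresholds).
    if count <= 1:
        level = 0
    elif count <= 3:
        level = 1
    else:
        level = 2
    # Escalate one band per small-cohort signal present (assumed rule).
    level += len(qi_types & SMALL_COHORT_SIGNALS)
    return BANDS[min(level, len(BANDS) - 1)]
```

With these assumed rules, the RACF example above (age, language, condition, postcode-level location and a named facility) lands in the top band, while a note carrying a single generic attribute stays low. The function only reports a band; it never edits the text.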
If you're responsible for clinical AI risk in an Australian healthcare organisation, an indemnity insurer, or a digital health vendor — book the call.
The goal in 18 months: every Australian clinician using AI does so with the same default safety layer — invisibly, by architecture. The OAIC has a standard to point at. The MDOs have evidence to underwrite.