Privacy infrastructure for Australian clinical AI

Stop patient data from leaving your system before it reaches AI.

CleanRoom strips identifiable patient information locally — on the device, in Australia — before any AI model sees it. Every session generates a cryptographic audit record your compliance team can actually use.

Designed in Australia · Data stays local · No US cloud dependency · Privacy Act first, not HIPAA
The breach you remember
9.7M records.

The Medibank breach exposed the personal data of about 9.7 million current and former customers, including health claims. The statutory tort for serious invasions of privacy was legislated in December 2024 and has been in force since 10 June 2025. Civil penalties for serious interferences with privacy can now reach A$50 million per contravention. The next paste of patient data into an AI tool is a board-level event.

The deadline you don't
Dec 10, 2026.

On that date, the Automated Decision-Making Framework obligations under the Privacy Act become enforceable. Combined with APPs 1, 3, 8 and 11, every clinical AI workflow will need an evidentiary trail showing what actually happened to patient data, not a policy promising what should.

Your privacy policy doesn't survive contact with an AI scribe.

Clinicians paste patient notes, names, MRNs and Medicare numbers into AI tools every day. Not from negligence — from invisibility. The data leaves the device the moment "send" is clicked, and there is no artefact to show what was sent, where it went, or whether it was retained.

Business associate agreements (BAAs) and contractual assurances solve a paperwork problem. They do not solve an evidentiary one. When the OAIC asks what specific patient information was disclosed in a specific session, "we have a vendor agreement" is not an answer. APP 11 requires reasonable steps. Reasonable steps are now technical, not contractual.

Most healthcare AI products — including the Australian-built ones — transmit identifiable data to a cloud model and rely on contractual de-identification at the other end. That model is structurally incompatible with APP 8 cross-border disclosure rules and the ADM Framework's accountability obligations. It is also the model the regulator is now writing guidance against.

What's missing is not another AI product. What's missing is the layer that makes every other AI product safe to use under Australian law.

One layer between the clinician and the model. That's all it takes.

CleanRoom sits at the moment of disclosure — the paste, the send, the API call. It detects identifying information, strips it locally, and proves what it did.

i.
Intercept at the paste event
CleanRoom recognises when clinical text is about to leave the clinician's environment — through a browser extension, an SDK, or a sidecar proxy. Nothing identifiable transmits.
ii.
Strip 28 PHI categories locally
Names, MRNs, Medicare numbers, addresses, NDIS identifiers, dates, hospital URNs, pathology numbers, next-of-kin, interpreter and witness fields — replaced with structure-preserving tokens. All on-device. Nothing leaves Australia.
iii.
Generate the Sentinel Record
Each session produces a SHA-256 hash-chained audit artefact: timestamps, entity counts, model destination, session integrity. The evidentiary record OAIC and your compliance team actually require.

Identified. De-identified. Sent. Returned. Re-identified. Audited.

The full CleanRoom round-trip in fifteen seconds: clinical text goes in, is tokenised before it leaves the device, is processed by an AI model that never sees PHI, comes back, is re-identified locally for the clinician, and is recorded in the Sentinel audit trail.

cleanroom · local session

1 · Identified / local
Pt: John Smith
DOB: 14/03/1948
Med: 2956 12345 6
RACF: Sunset Manor
Dr: L. Chen
Note: recurrent falls, cognitive change

2 · Tokenised / local
Pt: [PERSON_01]
DOB: [DATE_01]
Med: [MED_01]
RACF: [FAC_01]
Dr: [PERSON_02]
Note: recurrent falls, cognitive change

3 · AI ↑ cloud
→ model.api (tokens only, no PHI sent)
← response: "Suggest review for [PERSON_01]:
  - meds review
  - falls clinic"

4 · Re-identified / local
For John Smith (DOB 14/03/48, Sunset Manor):
- meds review
- falls clinic
- liaise with Dr L. Chen

5 · Sentinel / audit
session: a3f2…c891
out: 6 stripped
in: 6 restored
hash chain: ✓ verified
jurisdiction: AU only

15-second loop · full round-trip · synthetic data only · animated illustration
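
For technical readers, the tokenise and re-identify steps in the illustration can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not CleanRoom's detection engine: the two regular expressions, the `known_names` argument and the function names `tokenise` and `reidentify` are all hypothetical, and a real detector would cover far more cases. The point it shows is that the token-to-value map (the `vault`) never has to leave the device.

```python
import re

# Hypothetical, deliberately simple patterns for illustration only.
PATTERNS = [
    ("MED", re.compile(r"\b\d{4} \d{5} \d\b")),      # Medicare-style 4-5-1 digit layout
    ("DATE", re.compile(r"\b\d{2}/\d{2}/\d{4}\b")),  # dd/mm/yyyy
]


def tokenise(text, known_names=()):
    """Replace identifiers with numbered tokens; return (clean_text, vault)."""
    vault = {}     # token -> original value; stays on the local device
    counters = {}  # per-label running count, e.g. PERSON -> 2

    def make_token(label, value):
        # Reuse the existing token if this exact value was already seen.
        for tok, original in vault.items():
            if original == value and tok.startswith(f"[{label}_"):
                return tok
        counters[label] = counters.get(label, 0) + 1
        tok = f"[{label}_{counters[label]:02d}]"
        vault[tok] = value
        return tok

    for label, pattern in PATTERNS:
        text = pattern.sub(lambda m, l=label: make_token(l, m.group(0)), text)
    # Names are passed in explicitly here; a real system would detect them.
    for name in known_names:
        if name in text:
            text = text.replace(name, make_token("PERSON", name))
    return text, vault


def reidentify(text, vault):
    """Swap tokens in a model response back to the original values, locally."""
    for tok, value in vault.items():
        text = text.replace(tok, value)
    return text


note = "Pt: John Smith DOB: 14/03/1948 Med: 2956 12345 6 Dr: L. Chen"
clean, vault = tokenise(note, known_names=("John Smith", "L. Chen"))
reply = reidentify("Suggest review for [PERSON_01]", vault)
```

Only `clean` would be sent to the model; `vault` and the re-identified `reply` exist only on the clinician's device.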

Tested in the language clinicians and biostatisticians already speak.

CleanRoom is a binary classifier at the token level — every word is either flagged as PHI or not. The right metrics are the same ones you use to evaluate a screening test.

92.5%
v2.1 corpus
Recall (sensitivity)
Of all the PHI tokens actually present in clinical text, the proportion correctly detected. A miss is a false negative: an identifier that escaped the strip. This is the metric that maps directly to compliance risk, because every false negative is a potential APP 11 breach.
TP / (TP + FN) — equivalent to sensitivity
89.7%
v2.1 corpus
Precision (positive predictive value)
Of all tokens flagged as PHI, the proportion that actually were. False positives over-redact non-PHI words — they degrade the model's downstream output but do not breach privacy. We tune the system conservatively because under-redaction is worse than over-redaction.
TP / (TP + FP) — equivalent to PPV
91.1%
v2.1 corpus
F1 score (harmonic mean of P and R)
The single-number summary of detection performance — the harmonic mean of precision and recall. Penalises imbalance: a system with 100% recall but 1% precision scores poorly, as does the reverse. F1 is the closest thing to a comparable benchmark across PHI detectors and the standard metric in NLP-for-healthcare literature.
2 · (P · R) / (P + R)
100%
structured AU IDs
Deterministic layer (rule-based detection)
For structured Australian identifiers (Medicare, IHI and NDIS numbers, hospital URNs, dates), CleanRoom uses validated pattern matching, with checksum verification where the identifier carries a check digit. Sensitivity and specificity are both reported as 100% on well-formed identifiers, because matching is deterministic, not learned. NPV and PPV are also unity in this layer.
FN = 0, FP = 0 by checksum validation
CleanRoom v2.1.0 · 28 entity types · evaluation methodology available on request
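
As an illustration of what checksum verification means for a structured identifier, here is a minimal sketch of the publicly documented Medicare card check-digit rule: the first eight digits are weighted 1, 3, 7, 9, 1, 3, 7, 9, and the sum modulo 10 must equal the ninth digit. The function name `is_valid_medicare` is hypothetical, the sketch assumes the 10-digit card number without the individual reference number, and it is not taken from CleanRoom's code.

```python
import re

# Weights applied to the first eight digits of a Medicare card number.
WEIGHTS = (1, 3, 7, 9, 1, 3, 7, 9)


def is_valid_medicare(candidate: str) -> bool:
    """Return True if the string has Medicare shape and a matching check digit."""
    digits = re.sub(r"\s+", "", candidate)
    # Ten digits in total, and the first digit is in the range 2 to 6.
    if not re.fullmatch(r"[2-6]\d{9}", digits):
        return False
    check = sum(int(d) * w for d, w in zip(digits[:8], WEIGHTS)) % 10
    # The ninth digit is the check digit; the tenth is the card issue number.
    return check == int(digits[8])


print(is_valid_medicare("2123 45670 1"))  # a commonly used test number
print(is_valid_medicare("2123 45671 1"))  # same number with the check digit altered
```

A deterministic rule like this gives the same answer every time for the same input, which is the difference from a learned detector.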

Built around the Privacy Act. Not retrofitted from HIPAA.

Australia's privacy regime is principles-based. The "reasonable steps" tests in APPs 1 and 11 are what your compliance team has to defend. CleanRoom is the technical artefact that defends them.

APP 1.2
Open and transparent management
Reasonable steps to implement practices, procedures and systems for compliance — and to handle privacy inquiries and complaints. CleanRoom is the system; the Sentinel Record is the artefact.
APP 3
Collection minimisation
Only de-identified text reaches the model. Identifying data is never collected by the AI vendor in the first place.
APP 8
Cross-border disclosure
PHI never leaves Australian jurisdiction. The model — wherever it is hosted — receives stripped text only.
APP 11
Reasonable steps (security)
Architectural de-identification is the strongest available technical control. The Sentinel Record proves it was applied to a specific session.
ADM
Dec 10, 2026 framework
Per-session audit artefacts satisfy the accountability and explainability obligations for automated decision support.
Tort
Statutory privacy invasion
Legislated in December 2024 and in force since 10 June 2025. Separately, civil penalties for serious interferences with privacy reach up to A$50M per contravention. Architectural controls are now the cost of operation, not an upgrade.

Built in Australia, for Australian healthcare, by people who actually handle the data.

Most healthcare privacy tools are built by engineers who have never written a clinical note, lodged a referral, taken a verbal handover, processed a claim, or filed a care plan at the end of a long shift. They build for a sanitised idea of how healthcare works — not the Friday-afternoon reality of a busy clinic, ward, pharmacy, or community service.

CleanRoom started inside the system, not outside it. Every entity type, every false-negative case, every workflow assumption was shaped by frontline reality — not abstract privacy theory. The taxonomy includes Medicare and IHI numbers, RACF identifiers, NDIS numbers, ACFI codes, hospital URNs, pathology accession numbers, allied health referral details, interpreter and witness fields, next-of-kin entries — because those are the identifiers that appear in the notes, letters, plans, scripts, claims and care records that Australian health professionals actually produce, every day.

Sovereignty is not a marketing line. CleanRoom processes patient data on the user's device. There is no central server in Australia, and no shadow server in the United States. The architecture makes it impossible for PHI to traverse a foreign jurisdiction during the de-identification step. That is what APP 8 actually requires — and what BAAs cannot deliver.

The market CleanRoom serves is the long tail of Australian healthcare: general practice, specialist clinics, allied health, pharmacy, nursing services, residential aged care, community health, disability services, dental, mental health, private practice — and the back-office functions (practice managers, billing, intake, referral coordinators) that handle PHI without making the headlines. None of these settings will ever deploy a private LLM in their own data centre. They will use commodity AI tools, and they need the layer that makes those tools safe under Australian law.

Founder
Dr Vincent Siaw
MBBS · FRACP · MD (Periop) · GChPOM
Consultant Geriatrician
  • Clinician-builder, dual legitimacy
  • Specialist in geriatric medicine
  • Author, Australian PHI taxonomy v2
  • Designer, Sentinel audit architecture
  • Based in Melbourne, Victoria

The questions compliance teams ask first.

Where does the de-identification happen?

On the device. CleanRoom's processing layer runs locally — in the browser, in the application, or in a sidecar process inside your environment. Identifying patient information is not transmitted to any external service for de-identification. PHI does not leave Australian jurisdiction at any point in the strip. That is the architectural guarantee.

How is the Sentinel Record different from a standard log?

Standard application logs are mutable, contextual, and not designed as evidentiary artefacts. The Sentinel Record is a per-session, hash-chained audit trail using SHA-256: it records entity counts, timestamps, model destinations, and session integrity in a form that can be verified after the fact. It is built to meet the "reasonable steps" evidentiary standard under APPs 1.2 and 11, not to help a developer debug.
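
To make "hash-chained" concrete, the sketch below shows the general technique using Python's standard `hashlib`: each record's SHA-256 hash covers its own content plus the previous record's hash, so editing any earlier record breaks every later link. The field names (`body`, `prev`, `hash`), the sample values and the helper functions are assumptions for illustration, not the Sentinel Record format.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def record_hash(body: dict, prev_hash: str) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the body."""
    payload = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_record(chain: list, body: dict) -> None:
    """Append a record that is linked to the hash of the one before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"body": body, "prev": prev_hash,
                  "hash": record_hash(body, prev_hash)})


def verify_chain(chain: list) -> bool:
    """Recompute every link; any edited or reordered record fails the check."""
    prev_hash = GENESIS
    for rec in chain:
        if rec["prev"] != prev_hash:
            return False
        if rec["hash"] != record_hash(rec["body"], prev_hash):
            return False
        prev_hash = rec["hash"]
    return True


chain = []
append_record(chain, {"ts": "2026-01-05T09:00:00+11:00", "stripped": 6,
                      "destination": "model.api"})
append_record(chain, {"ts": "2026-01-05T09:00:15+11:00", "restored": 6})
intact = verify_chain(chain)

chain[0]["body"]["stripped"] = 0  # simulate after-the-fact tampering
tampered_ok = verify_chain(chain)
```

Here `intact` is `True` and `tampered_ok` is `False`: the altered first record no longer matches its stored hash. Note that the records hold counts and timestamps only, never the identifiers themselves.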

Do you replace the AI vendor we already use?

No. CleanRoom is the layer between your clinician and your AI vendor — whichever vendor that is. Heidi, GPT, Gemini, Claude, ambient scribes, structured data extractors. CleanRoom does not compete with them. It makes them defensible under the Privacy Act.

What about false negatives — names you miss?

Recall (sensitivity) is the metric we optimise hardest because false negatives are the compliance risk. The current evaluation corpus shows 92.5% recall across all 28 entity types and 100% sensitivity on structured Australian identifiers. Known gaps — primarily non-Anglo names in unstructured prose — are tracked publicly in the evaluation summary and prioritised in every release. We will not market a number we cannot reproduce on request.
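
For readers who want to check the arithmetic, the three headline metrics follow directly from token-level counts. The sketch below uses the formulas stated on this page; the function names are illustrative, and the only figures used are the published precision and recall.

```python
def precision(tp: int, fp: int) -> float:
    """TP / (TP + FP), equivalent to positive predictive value."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """TP / (TP + FN), equivalent to sensitivity."""
    return tp / (tp + fn)


def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Cross-check: the published precision (89.7%) and recall (92.5%)
# should reproduce the published F1 when rounded to one decimal place.
published_f1 = round(f1(0.897, 0.925) * 100, 1)
print(published_f1)
```

The published F1 of 91.1% is consistent with the published precision and recall.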

Who is this not for?

Public hospital tertiary networks running fully internal large-language-model deployments behind IRAP-assessed infrastructure. If you have your own private LLM hosted in your own data centre, you have already solved the disclosure problem at the network layer. CleanRoom is built for the long tail (GP clinics, RACFs, specialist practices, allied health, regional services) that will use commodity AI and need the layer that makes commodity AI safe.

Is the OAIC certifying products like this?

The OAIC does not certify products. It issues guidance and enforces principles. CleanRoom's strategy is to make architectural de-identification the recognised reasonable-steps standard for clinical AI — through OAIC consultation submissions, the APP Code mechanism, and engagement with indemnity insurers. The goal is not certification. It is becoming the architecture compliance is written around.

Twenty minutes. One conversation.

If you're responsible for clinical AI risk in an Australian healthcare organisation, an indemnity insurer, or a digital health vendor — book the call. The fastest way to find out whether CleanRoom is relevant is a 20-minute conversation with the founder.

No pitch deck. No spam. Just the conversation.