Privacy infrastructure for clinical AI

Stop patient data from leaving your system before it reaches AI.

Strips PHI on-device before any AI sees it. Generates a cryptographic audit record per session. Built for the Privacy Act.

See it run ↓ Book a 20-minute call →

Built by Dr Vincent Siaw
MBBS, FRACP, MD (Periop), GChPOM · Consultant Geriatrician · Think Parallax Pty Ltd · Melbourne

The breach you remember

Nearly 10M records.

Medibank exposed nearly 10 million Australians' health data. The statutory tort for serious invasions of privacy was legislated in December 2024 and commenced in June 2025. Civil penalties now reach A$50 million per contravention. The next paste of patient data into an AI tool is a board-level event.

The deadline you don't

Dec 10 2026.

The Automated Decision-Making Framework obligations under the Privacy Act become enforceable. Combined with APPs 3, 8 and 11, every clinical AI workflow will need an evidentiary trail proving what happened to patient data, not a policy promising it won't happen.

The gap

Your privacy policy doesn't survive contact with an AI scribe.

Clinicians paste names, MRNs and Medicare numbers into AI tools every day — not from negligence, from invisibility. The data leaves the device the moment "send" is clicked. There is no artefact to show what was sent.

▼ Today's workflow

Paste → cloud → hope.

Clinician opens AI tool

↓

PHI transmitted to cloud model

↓

De-identification at vendor's end (contractual)

↓

No evidentiary record of what left the device

BAAs solve a paperwork problem, not an evidentiary one. "We have a vendor agreement" is not an answer to the OAIC.

▲ With CleanRoom

Strip locally. Prove it.

Clinician opens AI tool

↓

CleanRoom intercepts on-device

↓

28 PHI categories stripped before transmit

↓

Sentinel Record hash-chained per session

APP 11 requires reasonable steps. Architectural de-identification is the strongest available technical control. The Sentinel Record is the artefact.

Live demo

Paste in. Stripped out. Audited.

Synthetic data · No PHI processed · Runs in your browser, no server calls

cleanroom · session simulator

[Interactive demo. Panels: clinician input (contains PHI), what the AI vendor sees, and the Sentinel Record showing session ID, timestamp, model destination, SHA-256 hash, and entity and category counts.]

Synthetic data only · No PHI processed · Demonstration of v2.1 detection layer

How CleanRoom works

One layer between the clinician and the model. That's all it takes.

i.

Intercept at the paste event

Browser extension, SDK, or sidecar proxy. Recognises clinical text the moment it tries to leave the environment. Nothing identifiable transmits.

ii.

Strip 28 PHI categories locally

Names, MRNs, Medicare, addresses, NDIS, dates, hospital URNs, pathology numbers, ACFI codes, NOK, interpreter, witness. Replaced with structure-preserving tokens. All on-device.

iii.

Generate the Sentinel Record

SHA-256 hash-chained per session: timestamps, entity counts, model destination, session integrity. The evidentiary record the OAIC and your compliance team actually require.
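To make step ii concrete, here is a minimal Python sketch of deterministic stripping for a few structured identifiers. It is an illustration under stated assumptions, not CleanRoom's detection layer: the three regex patterns, the `[CATEGORY_n]` token format, and the function names are all hypothetical, and free-text categories such as names need more than pattern matching. The Medicare check uses the publicly documented check-digit scheme (weights 1, 3, 7, 9, 1, 3, 7, 9 over the first eight digits, modulo 10).

```python
from __future__ import annotations

import re
from collections import Counter

# Hypothetical patterns for illustration only; a real detector would cover
# far more categories and formats than these three.
PATTERNS = {
    "MEDICARE": re.compile(r"\b\d{4}[ ]?\d{5}[ ]?\d\b"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    "MRN": re.compile(r"\bMRN[:\s]*\d{6,8}\b"),
}

MEDICARE_WEIGHTS = (1, 3, 7, 9, 1, 3, 7, 9)


def medicare_checksum_ok(candidate: str) -> bool:
    """Return True if a 10-digit candidate passes the Medicare check digit."""
    digits = [int(c) for c in candidate if c.isdigit()]
    if len(digits) != 10 or not 2 <= digits[0] <= 6:
        return False
    total = sum(w * d for w, d in zip(MEDICARE_WEIGHTS, digits[:8]))
    return total % 10 == digits[8]


def strip_phi(text: str) -> tuple[str, Counter]:
    """Replace matched identifiers with numbered tokens and count them."""
    counts: Counter = Counter()

    def replacer(category: str):
        def _sub(match: re.Match) -> str:
            label = category
            # A failed checksum changes the label, not the decision to strip:
            # recall comes first, so the digits are removed either way.
            if category == "MEDICARE" and not medicare_checksum_ok(match.group()):
                label = "NUMERIC_ID"
            counts[label] += 1
            return f"[{label}_{counts[label]}]"
        return _sub

    for category, pattern in PATTERNS.items():
        text = pattern.sub(replacer(category), text)
    return text, counts


note = "DOB 03/02/1957, Medicare 2123 45670 1, MRN: 00123456. Reviewed 14/05/2024."
stripped, entity_counts = strip_phi(note)
print(stripped)
# DOB [DATE_1], Medicare [MEDICARE_1], [MRN_1]. Reviewed [DATE_2].
```

The tokens keep the sentence structure intact, so the model still sees that a date or an identifier was present. The entity counts are the kind of figure an audit record could store without storing the identifiers themselves.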

Validated

Tested against an Australian out-of-distribution corpus. Not a US benchmark.

Held-out synthetic Australian clinical text — geriatric, RACF, primary-care vocabulary. Every release ships the numbers.

Entity types: 28

PHI categories detected, including AU-specific identifiers (Medicare, NDIS, IHI, URN, ACFI).

F1 score

Harmonic mean of precision and recall across all 28 entity types on the v2.1 evaluation corpus.

Recall: 92.5%

Proportion of identifiers caught. Recall is what matters for compliance: false negatives are the risk.

Deterministic layer

Recall and precision on structured Australian identifiers (Medicare, IHI, MRN, NDIS, dates).

CleanRoom v2.1.0 · Methodology available on request
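For readers checking the definitions behind these metrics, the short Python sketch below computes precision, recall and F1 from entity-level counts. The counts in the example are hypothetical and chosen only to show the arithmetic. They are not CleanRoom evaluation results.

```python
from __future__ import annotations


def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts: 90 identifiers caught, 5 false alarms, 10 missed.
p, r, f1 = precision_recall_f1(tp=90, fp=5, fn=10)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
# precision=0.947 recall=0.900 f1=0.923
```

In this example the ten missed identifiers are the false negatives: they lower recall but leave precision untouched, which is why recall is the figure to watch for compliance.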

Regulatory coverage

Built around the Privacy Act. Not retrofitted from HIPAA.

Australia's privacy regime is principles-based. APP 11's "reasonable steps" test is what your compliance team has to defend. CleanRoom is the technical artefact that defends it.

APP 3

Collection minimisation

Only de-identified text reaches the model. Identifying data is never collected by the AI vendor.

APP 8

Cross-border disclosure

PHI never leaves Australian jurisdiction. The model receives stripped text only.

APP 11

Reasonable steps

Architectural de-identification is the strongest available technical control. The Sentinel Record proves it.

ADM

Dec 2026 framework

Per-session audit artefacts satisfy the accountability and explainability obligations for automated decision support.

Sovereignty

Two kinds of sovereignty. Both non-negotiable.

Healthcare AI built for Australia answers to Australian law and to Australian clinicians. CleanRoom is built for both.

Pillar I

Australian data sovereignty

PHI never leaves Australian jurisdiction. Not in flight. Not at rest. Not even briefly in someone else's logs.

Stripped on-device, before any transmit — architectural, not contractual
No data crosses borders, ever. Per APP 8, by design
The Sentinel Record stays with the practice, under their control
Built around APP 3, 8, 11 — not retrofitted from HIPAA
Australian-hosted infrastructure for any optional services

Pillar II

Clinician sovereignty

Built by a clinician, for clinicians. Not engineers guessing at workflows. Not foreign vendors imposing US frameworks on Australian practice.

We share the taxonomy: NOK, ACFI, URN, IHI, MRN, RACF, IRP, MDT
We share the time pressure: the 9 PM ward round, the Friday discharge, the Monday case conference
We share the consequence: a missed identifier is a notifiable breach; a missed diagnosis is a coronial inquiry
Efficiency and effectiveness without compromising quality and safety: preconditions, not trade-offs

"From one clinician to another — built by someone who carries the same pen, the same time pressure, and the same duty of care to the patient."

Why us

Built by a clinician who actually writes the notes being pasted.

Most healthcare privacy tools are built by engineers who have never written a discharge summary, a CGA, or a referral letter at 9 PM after a long ward round. CleanRoom started in geriatric practice. Every entity type, every false-negative case, every workflow assumption was shaped by clinical reality.

Sub-tertiary healthcare — GP clinics, RACFs, allied health, specialist practices — will never deploy private LLMs. They need the layer that makes commodity AI safe to use under Australian law. The taxonomy below is what shows up in the notes Australian clinicians actually write.

Medicare · IHI · NDIS · URN · ACFI · RACF name · PERSON · DOB · ADDRESS · POSTCODE · PHONE · EMAIL · PATHOLOGY ID · RADIOLOGY ID · NOK · INTERPRETER · WITNESS · +11 more

Founder

Dr Vincent Siaw

MBBS · FRACP · MD (Periop) · GChPOM

Consultant Geriatrician

Clinician-builder, dual legitimacy
Specialist in geriatric medicine
Founder, Think Parallax Pty Ltd
Author, Australian PHI taxonomy v2
Based in Melbourne, Victoria

Frequently asked

The questions compliance teams ask first.

Where does the de-identification happen?

On the device. CleanRoom's processing layer runs locally — in the browser, in the application, or in a sidecar process inside your environment. Identifying patient information is not transmitted to any external service for de-identification. That is the architectural guarantee.

How is the Sentinel Record different from a standard log?

Standard application logs are mutable, contextual, and not designed as evidentiary artefacts. The Sentinel Record is a per-session, hash-chained audit trail using SHA-256: it records entity counts, timestamps, model destinations, and session integrity in a form that can be verified after the fact. It is built to satisfy the "reasonable steps" evidentiary standard, not to satisfy a developer debugging a bug.
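A hash chain of this kind can be sketched in a few lines of Python. This is a minimal illustration, not the Sentinel Record format: the field names (`timestamp`, `model_destination`, `entity_counts`), the genesis value, and the function names are assumptions made for the example. Each entry's SHA-256 digest covers the previous entry's hash, so changing any earlier entry breaks every later link.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical starting value for the first entry


def _digest(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain: list, payload: dict) -> dict:
    """Append a payload to the chain, linking it to the previous entry's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": _digest(prev_hash, payload),
    }
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Recompute every link; return False at the first mismatch."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


session = []
append_entry(session, {
    "timestamp": "2026-01-15T09:02:11+11:00",
    "model_destination": "example-vendor",
    "entity_counts": {"DATE": 2, "MEDICARE": 1},
})
append_entry(session, {
    "timestamp": "2026-01-15T09:04:37+11:00",
    "model_destination": "example-vendor",
    "entity_counts": {"PERSON": 1},
})
print(verify_chain(session))  # True

session[0]["payload"]["entity_counts"]["MEDICARE"] = 0  # simulate tampering
print(verify_chain(session))  # False
```

Note the limit of the technique: a chain on its own only detects edits by someone who does not recompute the later hashes. Making the record tamper-evident against whoever holds it also requires anchoring the latest hash somewhere they cannot rewrite it.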

Do you replace the AI vendor we already use?

No. CleanRoom is the layer between your clinician and your AI vendor — whichever vendor that is. Heidi, GPT, Gemini, Claude, ambient scribes, structured data extractors. CleanRoom does not compete with them. It makes them defensible under the Privacy Act.

What about false negatives — names you miss?

Recall is the metric we optimise hardest, because false negatives are the compliance risk. The current evaluation corpus shows 92.5% recall across all 28 entity types and 100% on structured Australian identifiers. Known gaps — primarily non-Anglo names in unstructured prose — are tracked publicly in the evaluation summary and prioritised in every release. We will not market a number we cannot reproduce on request.

What about quasi-identifiers — combinations that re-identify even after PHI is stripped?

Honestly: this is the hardest unsolved problem in clinical de-identification, and we treat it as one. Direct identifiers (name, MRN, Medicare, address) are not the whole story. Combinations of attributes — age, language, postcode, condition, facility — can still uniquely identify an individual in small populations. A 67-year-old Vietnamese-speaking woman with severe COPD in a 60-bed RACF in regional Victoria is identifiable from those attributes alone, even without her name.

CleanRoom approaches this through a Bayesian risk framework rather than a fixed k-anonymity threshold. A prior on re-identification probability is derived from population context — large urban hospital is low-prior, small rural RACF is high-prior. Each observed quasi-identifier (age band, sex, language, postcode, condition class, facility class) updates the posterior. When the posterior crosses a configurable threshold, CleanRoom flags the session for additional generalisation: postcodes collapse to LGAs, ages to bands, languages to language families, conditions to ICD chapters.
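The update described above can be sketched in odds form. The Python below is an illustration under loud assumptions, not CleanRoom's model: the priors, the likelihood ratios, the 0.5 threshold and the function names are hypothetical, and the quasi-identifiers are treated as conditionally independent (a naive Bayes simplification that real populations often violate).

```python
from __future__ import annotations

# Hypothetical priors on re-identification probability by population context.
PRIORS = {"urban_hospital": 0.01, "rural_racf": 0.20}

# Hypothetical likelihood ratios: how much each observed quasi-identifier
# multiplies the odds of re-identification.
LIKELIHOOD_RATIOS = {
    "age_band": 1.5,
    "sex": 1.2,
    "language": 4.0,
    "postcode": 3.0,
    "condition_class": 2.0,
    "facility_class": 2.5,
}


def posterior_risk(prior: float, observed: list[str]) -> float:
    """Update prior odds by each observed quasi-identifier's likelihood ratio."""
    odds = prior / (1.0 - prior)
    for name in observed:
        odds *= LIKELIHOOD_RATIOS[name]
    return odds / (1.0 + odds)


def needs_generalisation(prior: float, observed: list[str], threshold: float = 0.5) -> bool:
    """Flag the session when the posterior reaches the configured threshold."""
    return posterior_risk(prior, observed) >= threshold


def age_band(age: int, width: int = 5) -> str:
    """Generalise an exact age to a band, e.g. 67 becomes '65-69'."""
    low = age // width * width
    return f"{low}-{low + width - 1}"


observed = ["language", "postcode", "condition_class"]
print(round(posterior_risk(PRIORS["rural_racf"], observed), 3))      # 0.857
print(round(posterior_risk(PRIORS["urban_hospital"], observed), 3))  # 0.195
print(needs_generalisation(PRIORS["rural_racf"], observed))          # True
print(age_band(67))                                                  # 65-69
```

With these illustrative numbers, the same three attributes push the small rural facility over the threshold while the large urban hospital stays well under it. The flag then triggers the kind of generalisation shown by `age_band`.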

This is intentionally a probabilistic risk system, not a guarantee. In a 60-bed RACF, perfect de-identification is a research-grade problem and we are working with Australian academic collaborators on it. What CleanRoom does is shift the risk surface from "all PHI in the cloud" to "residual probabilistic risk on stripped output" — the difference between a structural compliance failure and a manageable, auditable clinical-statistical decision.

Who is this not for?

Public hospital tertiary networks running fully internal large-language-model deployments behind IRAP-assessed infrastructure. If you have your own private LLM hosted in your own data centre, you have already solved the disclosure problem at the network layer. CleanRoom is built for the long tail (GP clinics, RACFs, specialist practices, allied health, regional services) that will use commodity AI and need the layer that makes commodity AI safe.

Is the OAIC certifying products like this?

The OAIC does not certify products. It issues guidance and enforces principles. CleanRoom's strategy is to make architectural de-identification the recognised reasonable-steps standard for clinical AI — through OAIC consultation submissions, the APP Code mechanism, and engagement with indemnity insurers. The goal is not certification. It is becoming the architecture compliance is written around.

Next step

Twenty minutes. One conversation.

If you're responsible for clinical AI risk in an Australian healthcare organisation, an indemnity insurer, or a digital health vendor — book the call.

Book a 20-minute call → Request the technical brief →

No pitch deck. No spam. Just the conversation.