HIPAA is not a model selection problem. It is a configuration discipline. Any of the major cloud LLM providers can be deployed in a HIPAA-compliant posture, and any of them can be misconfigured into a violation. The reference architecture below is what we deploy when healthcare PHI is in scope and an auditor will eventually visit.
TL;DR — The Five Controls That Matter
- BAA chain. Signed BAA from every vendor that handles PHI, end to end. No exceptions, no exemptions.
- Encryption. TLS 1.3 in transit, AES-256 at rest, KMS-controlled keys with rotation.
- Access control. Least privilege, MFA-required, role-based, with break-glass procedures.
- Audit logging. Every PHI access, retained six years, tamper-evident, queryable.
- Minimum necessary. AI workflows must request only the PHI fields the task actually needs.
The architecture below operationalizes each of these.
What HIPAA Actually Requires (the Short Version)
HIPAA's relevant rules for AI systems:
- Privacy Rule — limits how PHI is used and disclosed. Drives the "minimum necessary" principle.
- Security Rule — administrative, physical, and technical safeguards. The technical safeguards (access control, audit controls, integrity, transmission security) map directly onto AI system design.
- Breach Notification Rule — disclosure rules if PHI is exposed. Drives logging and monitoring requirements.
A useful framing: HIPAA does not prescribe technology. It prescribes outcomes. Your architecture must produce evidence that PHI is protected; how you produce that evidence is engineering judgment.
The Reference Architecture
┌────────────────────────────────────┐
│ Customer-owned AWS / Azure / GCP │ ← Single tenant, BAA in place
│ │
│ ┌──────────────────────────────┐ │
│ │ Edge / API gateway │ │ ← TLS 1.3, WAF, IP allowlist
│ │ (SSO → MFA → JWT) │ │
│ └──────────┬───────────────────┘ │
│ │ │
│ ┌──────────▼───────────────────┐ │
│ │ PHI Tokenizer / De-id │ │ ← Optional, depends on workflow
│ └──────────┬───────────────────┘ │
│ │ │
│ ┌──────────▼───────────────────┐ │
│ │ Orchestration (private VPC) │ │ ← LangGraph in private subnet
│ └──┬───────────┬───────────────┘ │
│ │ │ │
│ ┌──▼──┐ ┌─────▼─────┐ │
│ │ RAG │ │ LLM │ │
│ │vec │ │ provider │ │ ← BAA-covered: Azure OpenAI,
│ │store│ │ │ │ AWS Bedrock, on-prem
│ └──┬──┘ └─────┬─────┘ │
│ │ │ │
│ ┌──▼───────────▼─────────────┐ │
│ │ Audit Log (immutable) │ │ ← S3 Object Lock / WORM
│ │ 6-year retention │ │
│ └────────────────────────────┘ │
└────────────────────────────────────┘
Everything inside the box is single-tenant infrastructure owned by the customer (or a partner with a BAA). Anything calling out of the box must be BAA-covered.
Control 1 — The BAA Chain
A Business Associate Agreement is the legal instrument that makes a vendor a "business associate" handling PHI on your behalf. Without it, the vendor is not authorized to receive PHI, period.
The chain must be unbroken:
Covered Entity (provider or health plan) → Your Org (BA) → Cloud Provider (BA) → LLM Provider (BA) → Subprocessors (BA)
A missing link is a violation regardless of how well the rest is configured.
What to verify before deploying:
| Vendor | Has BAA for AI workloads? | Notes |
|---|---|---|
| Azure OpenAI | Yes | Standard Microsoft BAA covers it on enterprise contracts |
| AWS Bedrock | Yes | AWS BAA must explicitly include Bedrock |
| Google Vertex AI | Yes | Google Cloud BAA covers it |
| Anthropic (direct) | Yes, enterprise tier | Requires separate BAA, not on standard plans |
| OpenAI (direct) | Yes, enterprise tier | API-only; ChatGPT consumer not BAA-eligible |
| Pinecone | Yes, enterprise tier | Confirm BAA explicitly covers your workload |
| Self-hosted models | N/A | You are the operator; no third-party data sharing |
A common mistake: using a vendor's "HIPAA-eligible" service through the wrong contract type. "Eligible" is not the same as "covered" — you have to sign the BAA explicitly and keep your deployment within the eligible subset of services.
Control 2 — Encryption and Key Management
- In transit: TLS 1.3 between every component. mTLS for service-to-service.
- At rest: AES-256 on every store — vector DB, audit logs, conversation history, embeddings.
- Keys: KMS-managed (AWS KMS, Azure Key Vault, GCP KMS) with documented rotation policy and access audit.
- Customer-managed keys (CMKs) are not strictly required by HIPAA but are increasingly expected in enterprise procurement and dramatically simplify breach scoping.
Embeddings are derivative of PHI. They must be encrypted at rest with the same key class as the source documents.
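As one concrete shape of this, here is a minimal sketch of writing an embedding artifact to S3 under a customer-managed KMS key. The bucket name, object key, and CMK alias are hypothetical; the same pattern applies with Azure Key Vault or GCP KMS:

```python
import boto3

s3 = boto3.client("s3")
embedding_bytes = b"..."  # serialized embedding vector: derivative of PHI

# Hypothetical bucket and CMK alias. SSE-KMS ties the object to an auditable,
# rotatable customer-managed key rather than the provider-default S3 key.
s3.put_object(
    Bucket="phi-embeddings",
    Key="doc-456/chunk-001.npy",
    Body=embedding_bytes,
    ServerSideEncryption="aws:kms",
    SSEKMSKeyId="alias/phi-data-cmk",
)
```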
Control 3 — Access Control
The technical safeguards in the Security Rule map onto five concrete controls:
- Unique user identification. Service accounts forbidden for human-driven access. Every action attributable to a person.
- Automatic logoff. Sessions expire on inactivity. AI workflows must re-authenticate after configurable timeouts.
- Encryption and decryption. Keys held in KMS, not in environment variables.
- MFA at every entry point. SSO + MFA, no exceptions for "convenience" tools.
- Role-based access. A care coordinator can read patient charts; an AI training engineer cannot. The AI orchestrator runs under a service identity with strictly scoped permissions.
Break-glass procedures: for emergencies, a designated role can elevate permissions with mandatory justification, mandatory paired approval, and a logged alert to the security team. Used <0.1% of the time; logged 100% of it.
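A sketch of how the role-to-scope mapping might look in the orchestrator. The role names and scope strings are illustrative, not a fixed schema; real deployments pull roles from the IdP rather than a hardcoded dict:

```python
from dataclasses import dataclass
from functools import wraps

@dataclass
class Actor:
    id: str
    role: str

# Illustrative role → PHI-scope mapping.
ROLE_SCOPES = {
    "care_coordinator":  {"read:chart", "read:encounter"},
    "ai_orchestrator":   {"read:encounter"},  # strictly scoped service identity
    "training_engineer": set(),               # no PHI access
}

class AccessDenied(Exception):
    pass

def require_scope(scope: str):
    def decorator(fn):
        @wraps(fn)
        def wrapper(actor: Actor, *args, **kwargs):
            if scope not in ROLE_SCOPES.get(actor.role, set()):
                # Denials are auditable events too; log before raising.
                raise AccessDenied(f"{actor.role} lacks {scope}")
            return fn(actor, *args, **kwargs)
        return wrapper
    return decorator

@require_scope("read:encounter")
def summarize_visit(actor: Actor, encounter_id: str) -> str:
    ...
```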
Control 4 — Audit Logging
The single most-failed HIPAA control in AI deployments.
Every PHI access — generation, retrieval, embedding, display — must be logged with:
- Timestamp (to the millisecond, UTC)
- Actor identity (user_id, service identity if applicable)
- Action (read / write / generate / retrieve)
- PHI affected (patient_id, document_id, field-level granularity where relevant)
- Outcome (success / failure / denied)
- Source context (request_id, trace_id)
Logs go to immutable storage — S3 Object Lock in compliance mode, Azure Immutable Blob Storage, or write-once tape — and are retained for six years (HIPAA minimum) or seven (a defensible margin).
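With S3 Object Lock, for example, the retention floor can be enforced as a bucket default. A sketch, assuming a bucket created with Object Lock enabled (the bucket name is hypothetical):

```python
import boto3

s3 = boto3.client("s3")

s3.put_object_lock_configuration(
    Bucket="phi-audit-logs",
    ObjectLockConfiguration={
        "ObjectLockEnabled": "Enabled",
        "Rule": {
            "DefaultRetention": {
                "Mode": "COMPLIANCE",  # cannot be shortened or removed, even by root
                "Years": 6,            # HIPAA minimum retention
            }
        },
    },
)
```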
The logging wrapper itself is small. A sketch; `PHIAccessEvent`, `audit_store`, and `SIGNING_KEY` are application-level objects assumed by the snippet:

```python
import hmac, hashlib, json
from datetime import datetime, timezone

def hmac_sign(record: dict, key: bytes) -> str:
    # Canonical JSON so the signature is stable across serializations.
    payload = json.dumps(record, sort_keys=True, default=str).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

async def log_phi_access(event: PHIAccessEvent):
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="milliseconds"),
        "actor": event.actor.id,
        "actor_type": event.actor.type,   # human / service
        "action": event.action,           # read / write / generate / retrieve
        "patient_id": event.patient_id,
        "document_id": event.document_id,
        "fields": event.fields,
        "outcome": event.outcome,         # success / failure / denied
        "trace_id": event.trace_id,
        "ip": event.client_ip,
        "user_agent": event.user_agent,
    }
    record["hash"] = hmac_sign(record, SIGNING_KEY)
    await audit_store.append(record)  # append-only write to the WORM target
```
The hash-chain pattern (each log entry's hash includes the previous entry's hash) gives you tamper-evidence cheaply. An auditor can verify the chain is intact in seconds.
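That verification walk is a few lines. The sketch below reuses `hmac_sign` from the wrapper above and assumes the writer is extended to stamp each record with a `prev_hash` field holding the preceding entry's hash (which the snippet above does not yet do):

```python
def verify_chain(records: list[dict], key: bytes) -> bool:
    # Walk the log in write order; any edit, deletion, or reorder breaks the chain.
    prev_hash = None
    for rec in records:
        body = {k: v for k, v in rec.items() if k != "hash"}
        if rec["hash"] != hmac_sign(body, key):
            return False  # record body was altered after signing
        if prev_hash is not None and body.get("prev_hash") != prev_hash:
            return False  # a record was removed or reordered
        prev_hash = rec["hash"]
    return True
```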
Control 5 — Minimum Necessary in AI Workflows
The Privacy Rule's "minimum necessary" standard requires that PHI use be limited to what is needed for the task.
In AI architecture this turns into concrete patterns:
- Field-level retrieval. A medication-history agent retrieves only medication rows, not full charts.
- De-identification before generation when possible. If an LLM call doesn't strictly need names and dates of birth, mask them before sending.
- Per-tool authz on PHI fields. A "summarize visit" tool authorizes against `read:encounter`, not `read:billing`.
- Eval data is also PHI. If your eval set is built from production traces, it is PHI and inherits all protections.
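Two of these patterns, field-level retrieval and masking before generation, fit in a few lines. A sketch; the field names and placeholder token are illustrative choices, not an EHR schema:

```python
# Illustrative chart record.
chart = {
    "name": "Jane Doe",
    "date_of_birth": "1980-01-01",
    "medications": ["lisinopril 10mg"],
    "billing_codes": ["99213"],
}

# Direct identifiers an LLM call rarely needs; mask before the request
# leaves the trust boundary.
DIRECT_IDENTIFIERS = {"name", "date_of_birth", "ssn", "address", "phone"}

def minimum_necessary(record: dict, needed_fields: set[str]) -> dict:
    """Return only the fields the task needs, masking direct identifiers."""
    out = {}
    for field in needed_fields:
        if field in record:
            out[field] = "[REDACTED]" if field in DIRECT_IDENTIFIERS else record[field]
    return out

# A medication-history agent gets medication rows, not the full chart:
payload = minimum_necessary(chart, {"medications"})
assert "billing_codes" not in payload
```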
What HIPAA-Compliance Does NOT Require
Several myths cost teams unnecessary engineering effort. HIPAA does not require:
- On-premise hosting (cloud + BAA + correct configuration is fine).
- US-only data residency (HIPAA itself is silent on location; specific BAAs may impose it).
- Disabling logging or storage of conversations (on the contrary: PHI access must be logged).
- Avoiding cloud LLM providers categorically (you may use them under BAA).
- A specific encryption algorithm (AES-256 is current best practice but the Security Rule is technology-neutral).
What it does require: documented decisions, evidence of controls, and the ability to produce that evidence on demand.
The Annual Compliance Calendar
Compliance is not a one-time achievement. The cadence we run with healthcare-AI clients:
| Activity | Frequency |
|---|---|
| Risk analysis (Security Rule § 164.308(a)(1)) | Annually + after material changes |
| Access review — who has access to PHI systems | Quarterly |
| Audit log review — sampled and trend analysis | Monthly |
| Penetration test (external) | Annually |
| Workforce HIPAA training | Annually + onboarding |
| Disaster recovery test | Annually |
| BAA inventory review | Annually + before adding any new vendor |
| Incident response tabletop exercise | Annually |
Each of these is documented; the documentation is what survives an audit.
Common Audit Findings (and Fixes)
The findings auditors flag most often in AI systems, with fixes:
| Finding | Root cause | Fix |
|---|---|---|
| Audit logs incomplete or inconsistent | Logging added per-feature, not per-data-access | Centralize logging through a wrapper around every PHI access |
| No documented risk analysis | Compliance treated as engineering-only | Run the 164.308 risk analysis annually with named owners |
| Excessive PHI in vector stores | "We embedded the whole record" | Field-level chunking; de-id before embedding when possible |
| Lapsed BAAs after vendor changes | No process to re-verify | Quarterly BAA inventory; block adds without BAA |
| AI prompts contain PHI in logs | Logs include raw prompts | Redact or hash PHI in app logs; full record only in audit log |
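For the last finding, a sketch of a logging filter that scrubs obvious PHI patterns before application logs are written. The patterns are illustrative and deliberately narrow; real deployments lean on the tokenizer service's dictionary rather than regexes alone, and the full record lives only in the audit log:

```python
import logging
import re

# Illustrative patterns only: SSNs and MRN-style identifiers.
PHI_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),   # SSN
    re.compile(r"\bMRN[:\s]?\d{6,10}\b"),   # medical record number
]

class PHIRedactingFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        msg = record.getMessage()
        for pattern in PHI_PATTERNS:
            msg = pattern.sub("[PHI-REDACTED]", msg)
        record.msg, record.args = msg, ()
        return True

logging.getLogger("app").addFilter(PHIRedactingFilter())
```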
Frequently Asked Questions
Is GPT-4 / Claude / Gemini HIPAA-compliant?
The base APIs are not. Azure OpenAI, AWS Bedrock, and Google Vertex AI offer HIPAA-eligible enterprise tiers with a signed BAA when deployed correctly. Anthropic offers HIPAA eligibility on enterprise plans via direct BAA. Configuration, not the model, determines compliance.
Does HIPAA require on-premise AI?
No. HIPAA requires administrative, physical, and technical safeguards over PHI — not a specific deployment topology. Cloud is acceptable when the cloud provider signs a Business Associate Agreement and the customer configures the workload correctly.
What is the most-failed HIPAA control in AI systems?
Audit logging. Specifically: not logging every PHI access, not retaining logs long enough (six years), and not logging at the granularity that lets an investigator answer "who accessed which patient's record on which date?"
Can RAG systems be HIPAA-compliant?
Yes. The vector store, embeddings, retriever, and generation must all operate inside a BAA-covered environment. Embeddings are derivative of PHI and inherit the same protections. Logging covers retrieval as well as generation.
How long does it take to certify a HIPAA-aligned architecture?
HIPAA has no certification; vendors self-attest. Achieving "audit-ready" posture on a new system typically takes 8-12 weeks of focused work plus a six-month operational track record before the first external assessment.
Key Takeaways
- HIPAA compliance is configuration, not architecture style — cloud and on-prem both work when configured correctly.
- Embeddings and retrieval logs are PHI; treat them with the same controls as the source records.
- Audit logging at request granularity, retained for six years, is the most-failed control in healthcare AI.
- The BAA chain must be unbroken from your customer down to every provider that touches PHI.
- Compliance is a continuous practice, not a one-time achievement — the annual cadence is the work.