HIPAA-Compliant AI Document Processing: BAA Buyer's Guide (2026)

The moment a patient record, claim, or clinical note is processed by an AI document platform, the vendor is typically acting as a HIPAA Business Associate, and a Business Associate Agreement (BAA) should be in place before any PHI is shared. In practice, "we don't store it" or "it only touched memory" doesn't remove HIPAA obligations when a service is creating, receiving, maintaining, or transmitting PHI on your behalf.

This guide is a buyer-focused checklist for HIPAA-oriented AI document processing: what a usable BAA must include, how SaaS vs private cloud vs on-prem changes your risk posture, which subprocessors in the stack need BAAs, and what a credible vendor compliance packet looks like. (This is practical guidance, not legal advice. Have counsel review your BAA and deployment plan.)

When Your AI Vendor Becomes a Business Associate

Under HIPAA (45 CFR 160.103), a Business Associate is any person or entity that creates, receives, maintains, or transmits Protected Health Information on behalf of a Covered Entity. For AI document processing, this means:

  • Uploading a document containing PHI to an AI tool makes the vendor a Business Associate, even if the tool is marketed as "general purpose"
  • Transient processing counts. HHS guidance on cloud services explains that providers can still be Business Associates even when ePHI is encrypted and the vendor lacks decryption keys ("no-view" / "no access" designs). If your workflow sends ePHI to a vendor for processing, that vendor is typically a Business Associate.
  • Embeddings and vectors derived from PHI are still PHI if they can be used to identify an individual or are stored alongside identifiable data
  • Logs and telemetry that capture document contents, query text, or error messages containing PHI also fall under the regulation

There is one narrow exception: a service acting purely as a conduit for ePHI transmission (similar to the postal service or an ISP), with only temporary storage incident to transmission, may not be a Business Associate. However, any AI service that processes, analyzes, or stores PHI beyond mere transmission does not qualify for this exception. In most AI document workflows, the vendor is a BA.

The legal trigger is the PHI, not the contract. HIPAA generally requires a BAA before a covered entity discloses PHI to a Business Associate (45 CFR 164.502(e)). Sharing PHI with a vendor that has not signed a BAA is itself a HIPAA violation, regardless of the vendor's security posture.

Common misconception: "HIPAA certified" is not a real designation. There is no certification body, no government seal, and no audit that confers "HIPAA certified" status. HIPAA compliance is self-attested and demonstrated through documented safeguards, policies, and a willingness to sign a BAA. If a vendor leads with "HIPAA certified," ask what they actually mean. Look for SOC 2 Type II reports, penetration test summaries, and a BAA they are willing to sign on your timeline.

Three Deployment Models for HIPAA-Compliant Document AI

Buyers evaluating AI document tools for PHI typically choose from three deployment models. Each comes with different trade-offs for control, cost, and compliance burden. The right choice depends on your organization's risk tolerance, IT capacity, and regulatory requirements beyond HIPAA (state privacy laws, ITAR, FedRAMP, etc.).

Model 1: SaaS with BAA

How it works: You use the vendor's cloud-hosted platform. PHI is transmitted to the vendor's infrastructure, processed, and results are returned. The vendor signs a BAA covering their handling of PHI.

How PHI flows: PHI leaves your network, transits encrypted (TLS 1.2+) to the vendor's cloud, is processed on shared or dedicated infrastructure, and results are returned. Documents may be stored temporarily or persistently depending on the vendor's architecture.

  • Encryption keys: Typically managed by the vendor. Some vendors offer customer-managed keys (CMK/BYOK) as an upgrade.
  • Breach notification: A Business Associate must notify the Covered Entity without unreasonable delay and no later than 60 days after discovery of a breach of unsecured PHI. Many BAAs specify shorter windows. You, as the Covered Entity, are then responsible for notifying affected individuals and HHS.
  • BAA provisions needed: Standard BAA covering permitted uses, safeguards, breach notification, subprocessor disclosure, and data destruction.
  • Best fit: Organizations that want fast deployment, minimal IT overhead, and are comfortable with a well-documented cloud vendor. Most healthcare organizations start here.

Ask the vendor: "Is PHI processed on shared infrastructure or isolated per tenant? What is the data retention policy? Can I get zero-retention processing where documents are analyzed in memory only?"

Model 2: Private Cloud / Dedicated Environment

How it works: The vendor deploys a dedicated instance of their platform in a private cloud environment, either within the vendor's cloud account (single-tenant) or within your own cloud account (VPC deployment). PHI stays within a defined network boundary.

How PHI flows: PHI stays within your VPC or a dedicated cloud environment. If deployed in your cloud account, PHI never crosses network boundaries. If deployed in the vendor's cloud as a single-tenant instance, PHI is isolated from other customers but still resides on the vendor's infrastructure.

  • Encryption keys: In VPC deployments, you control the keys. In vendor-hosted single-tenant deployments, keys may be vendor-managed or customer-managed depending on the arrangement.
  • Breach notification: Same HIPAA obligations apply. If deployed in your VPC, you may have additional visibility into breach indicators through your own monitoring. The vendor is still a Business Associate if they have any access (e.g., for maintenance, updates).
  • BAA provisions needed: Standard BAA plus provisions for access controls during maintenance, deployment update procedures, and network isolation.
  • Best fit: Mid-market and enterprise organizations with existing cloud infrastructure, IT teams capable of managing VPC deployments, and requirements that go beyond standard SaaS (data residency, network isolation, regulatory overlap with ITAR or FedRAMP).

Ask the vendor: "Is this truly single-tenant, or is it a logical isolation on shared infrastructure? Who has SSH/admin access to the environment? How are updates deployed, and do your engineers need access to production data?"

Model 3: On-Premises / Air-Gapped

How it works: The software runs entirely on your own hardware. No data leaves your physical or virtual perimeter. LLM inference, document parsing, and vector storage all run locally.

How PHI flows: PHI never leaves your premises. All processing, storage, and inference happen on your hardware. In a fully air-gapped deployment, there is no internet connectivity at all.

  • Encryption keys: You control everything: keys, hardware, network, physical access.
  • Breach notification: The vendor's obligations depend on whether it is a Business Associate at all. If the software vendor has no access to PHI (no remote telemetry, no support tunnels, no cloud callbacks), they may not qualify as a Business Associate for the on-prem deployment. However, if the vendor has any access for support or updates, a BAA is still required. Consult your legal counsel on this point.
  • BAA provisions needed: May be minimal or unnecessary if the vendor has zero access to PHI. If the vendor provides support that involves accessing the system, a BAA covering that access is required.
  • Best fit: Defense, intelligence, large health systems with strict data sovereignty requirements, and organizations with existing on-prem infrastructure and the IT staff to maintain it. Cost is highest; control is maximum.

Ask the vendor: "Does the software phone home for licensing, telemetry, or updates? What hardware specs are required to run inference locally? Which LLM models are supported for on-prem deployment, and what is the accuracy trade-off versus cloud models?"

Choosing a Model: Decision Framework

  • Speed to deploy: SaaS (days) > Private Cloud (weeks) > On-Prem (months)
  • Ongoing IT burden: SaaS (minimal) < Private Cloud (moderate) < On-Prem (significant)
  • Data control: On-Prem (maximum) > Private Cloud (high) > SaaS (vendor-dependent)
  • Cost: SaaS (lowest entry) < Private Cloud (mid) < On-Prem (highest, especially with GPU hardware)
  • Model quality: SaaS (latest models, always updated) > Private Cloud (latest, but update cycles may lag) > On-Prem (limited to models that run on your hardware; open-source models like Llama 3 or Mistral are capable but may trail proprietary models on specialized tasks)

Most organizations processing PHI with AI start with SaaS + BAA and evaluate private cloud or on-prem as volume, regulatory requirements, or risk posture change. For a deeper look at private AI deployment for enterprise, see our analysis of the inflection point for self-hosted models. Organizations considering fully disconnected deployments should also review our complete guide to air-gapped AI deployment, which covers compliance mapping, architecture, and vendor evaluation for ITAR, CMMC, and other frameworks beyond HIPAA. For a broader comparison of self-hosted document AI platforms, including stack components, hardware requirements, and vendor red flags, see our self-hosted document AI buyer's guide.

If You Only Read One Thing: The BAA + Stack Checklist

Before you go deeper, here is the condensed checklist. Use this as your scorecard when evaluating any AI document vendor for PHI workloads.

  1. Will they sign a BAA today? Not "soon." Not "on enterprise only." Today, on mutually agreeable terms.
  2. Can they produce a current subprocessor list? Every third party that touches PHI should be named, with BAA status for each.
  3. Which LLM provider do they use, and does that provider have a BAA in place? Consumer-tier AI products (such as the free tiers of ChatGPT and Claude) are not covered.
  4. Where is PHI stored at rest? Which object storage provider, which region, and is the data encrypted with AES-256?
  5. What is the data retention policy? Can you get zero-retention / ephemeral processing? What are the backup retention windows?
  6. Do application logs capture PHI? If so, which logging provider stores them, and is a BAA in place for that provider?
  7. Is PHI ever included in email notifications? The safest answer is "no, we send generic alerts with a secure link."
  8. What does the breach notification process look like? HIPAA ceiling is 60 days; best practice is 24-72 hours. Get the timeline in writing.
  9. Can you export audit logs? You need evidence packs for your own compliance records and auditors.
  10. What deployment options exist? SaaS, private cloud / VPC, or on-prem? Can you move between them as requirements change?

If a vendor can answer all ten clearly and produce documentation for each, they are worth evaluating further. If they stumble on more than two, keep looking.
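The scoring rule above can be sketched as a small scorecard. This is only an illustration: the item keys are hypothetical shorthand for the ten questions, and the middle verdict for one or two misses ("follow up on gaps") is an assumption, since the checklist only defines the two ends of the scale.

```python
# Hypothetical shorthand keys, one per checklist question above.
CHECKLIST = [
    "signs_baa_today",
    "subprocessor_list",
    "llm_provider_baa",
    "storage_location_and_encryption",
    "retention_policy",
    "logs_phi_handling",
    "phi_free_notifications",
    "breach_notification_timeline",
    "audit_log_export",
    "deployment_options",
]

MAX_MISSES = 2  # more than two misses means "keep looking"


def evaluate_vendor(answers: dict) -> tuple:
    """Return (missed items, coarse verdict) for one vendor.

    `answers` maps a checklist key to True when the vendor answered clearly
    and produced documentation. Missing keys count as misses.
    """
    misses = [item for item in CHECKLIST if not answers.get(item, False)]
    if not misses:
        verdict = "evaluate further"
    elif len(misses) > MAX_MISSES:
        verdict = "keep looking"
    else:
        # Assumption: the guide does not say what one or two misses mean.
        verdict = "follow up on gaps"
    return misses, verdict
```

Recording the missed items, not just the count, gives you a ready list of follow-up questions for the vendor.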

Request the Zedly Healthcare Pack

Get our BAA-ready packet (BAA template, subprocessor list, retention policy, incident response summary) plus a 30-minute architecture review to pick SaaS vs VPC vs on-prem.

The Subprocessor Chain: Every Link That Touches PHI Needs a BAA

Your AI vendor does not operate in isolation. Behind every document processing platform is a stack of third-party services: LLM providers, storage backends, vector databases, logging systems, and more. Under HIPAA, the primary Business Associate (your vendor) is responsible for ensuring that any subprocessor handling PHI also maintains appropriate safeguards. You, as the Covered Entity, should verify this chain.

Here is what to audit, organized by subprocessor category:

1. LLM Inference Provider

This is the model that reads and reasons over your documents. If PHI is sent to an LLM API, the provider needs a BAA.

  • OpenAI: Offers a BAA for ChatGPT for Healthcare and qualifying API healthcare customers; eligibility and scope depend on the specific services and data-retention controls you use. Not available for consumer ChatGPT plans.
  • Anthropic: May provide a BAA for customers who qualify for its HIPAA-ready services (e.g., first-party API with required data-handling controls); does not automatically apply to consumer chat experiences.
  • Google (Vertex AI): BAA available as part of Google Cloud's HIPAA-covered services.
  • Self-hosted (Llama 3, Mistral, etc.): No BAA needed since the model runs on your infrastructure. Trade-off is hardware cost and potentially lower accuracy on specialized medical terminology.

Ask: "Which LLM do you use? Is the BAA with the LLM provider already in place, or does the customer need to arrange it separately?"

2. Vector Database

Embeddings derived from PHI documents are stored here for semantic search. If embeddings can be linked back to identifiable individuals (which they often can, through associated metadata), the vector database provider needs a BAA.

  • Zilliz Cloud: SOC 2 Type II audited. BAA available for enterprise customers.
  • Pinecone: SOC 2 Type II audited. BAA available on enterprise plans.
  • Self-hosted (Milvus, Qdrant, Weaviate): No BAA needed since it runs on your infrastructure.

3. Object Storage

Where uploaded documents and processed outputs are stored at rest. This is the most obvious subprocessor: it holds the actual PHI files.

  • Backblaze B2: SOC 2 Type II audited, offers BAAs, supports Object Lock (WORM) for immutable audit trails.
  • AWS S3: BAA available through AWS Business Associate Addendum. Most widely used in healthcare.
  • Azure Blob Storage: BAA available through Microsoft's standard healthcare compliance agreement.
  • Google Cloud Storage: BAA available through Google Cloud's BAA.

Key requirement: AES-256 encryption at rest, TLS 1.2+ in transit, and access logging enabled.
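The in-transit half of that requirement can be enforced on the client side when your own code uploads documents. A minimal sketch using Python's standard `ssl` module follows; the function name `make_tls_context` is made up for illustration, and encryption at rest and access logging are configured at the storage provider, so they are not shown.

```python
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Client-side TLS context that refuses anything older than TLS 1.2."""
    # create_default_context() turns on certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Set the floor explicitly rather than relying on interpreter defaults.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

A context like this can be passed to most standard-library and third-party HTTP clients that accept an `ssl.SSLContext`.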

4. OCR / Document Parsing

If the vendor uses a third-party service to extract text from scanned PDFs or images, that service processes the raw document content, which includes PHI.

  • AWS Textract: Covered under AWS BAA.
  • Google Document AI: Covered under Google Cloud BAA.
  • Self-hosted (Tesseract, PaddleOCR): No BAA needed.

5. Email / Notification Service

Often overlooked, and the safest approach is simple: avoid putting PHI in email entirely. Most popular transactional email providers (SendGrid, Postmark, and others) are not HIPAA-eligible and will not sign a BAA. Twilio's own documentation explicitly states that SendGrid is not a HIPAA-eligible service.

  • AWS SES: Covered under AWS BAA. One of the few email services that can be used in a HIPAA-compliant workflow.
  • Most other email providers: Do not offer BAAs and should not be used to transmit PHI.

Best practice: Design all notifications to contain zero PHI. Send a generic alert ("A document has been processed") with a link to the secure platform, rather than including patient names, document titles, or processing results in the email body. If you must send PHI via email, use a provider that will sign a BAA for HIPAA-eligible email transmission, and confirm that specific service is covered.
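A PHI-free notification might be built along these lines. This is a sketch under assumptions: the `/documents/<id>` URL path is hypothetical, and the document identifier is assumed to be an opaque UUID rather than a file name or record number, which the function checks before building the message.

```python
import uuid
from email.message import EmailMessage


def build_notification(recipient: str, document_id: str, base_url: str) -> EmailMessage:
    """Build a generic alert that carries no patient or document details."""
    # Reject anything that is not a UUID so file names, patient names, or
    # record numbers cannot leak into the link. Raises ValueError otherwise.
    uuid.UUID(document_id)

    msg = EmailMessage()
    msg["To"] = recipient
    msg["Subject"] = "A document has been processed"
    msg.set_content(
        "A document has been processed.\n"
        f"Sign in to view it: {base_url}/documents/{document_id}\n"
    )
    return msg
```

The recipient must still authenticate on the platform to see anything, so the email itself discloses only that some document finished processing.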

6. Logging and Monitoring

Application logs frequently capture document metadata, user queries, and error messages that may contain PHI fragments. This is one of the most commonly missed subprocessors in HIPAA evaluations. See the Reference Stack below for specific providers and their BAA status.

Ask: "Do your application logs capture any PHI? If so, which logging provider stores them, and is a BAA in place?"
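One common mitigation is to scrub log messages before they leave the application. The sketch below uses a standard `logging.Filter`; the three patterns (SSN-style numbers, email addresses, US-style phone numbers) are illustrative assumptions and catch only a small fraction of the forms PHI can take, so treat this as defense in depth, not as a substitute for a BAA with the logging provider.

```python
import logging
import re

# Illustrative patterns only; real PHI takes many more forms than these.
_PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"), "[EMAIL]"),
    (re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"), "[PHONE]"),
]


def redact(text: str) -> str:
    """Replace every match of the patterns above with a placeholder."""
    for pattern, placeholder in _PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


class RedactingFilter(logging.Filter):
    """Scrub the fully formatted message before a handler emits it."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format first so values passed as %-style args are scrubbed too.
        record.msg = redact(record.getMessage())
        record.args = ()  # the message is already formatted
        return True  # keep the record, now redacted
```

Attaching the filter to each handler (`handler.addFilter(RedactingFilter())`) rather than to a single logger ensures that records propagated from child loggers are scrubbed as well.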

7. CDN / WAF / Edge Network

If PHI transits through a content delivery network or web application firewall, that provider may be processing PHI. Note that CDN/WAF BAAs are often restricted to enterprise-tier plans.

8. Payment Processor

Payment processors like Stripe typically do not handle PHI, and they are generally not HIPAA-compliant services. This is fine as long as you keep PHI completely separate from payment data. Do not include patient identifiers, diagnosis codes, or clinical information in invoice line items, subscription metadata, or transaction descriptions. If your billing workflow is properly separated from your PHI workflow, the payment processor does not need a BAA.

Subprocessor audit checklist (for each vendor in the chain):

  • Does this subprocessor create, receive, maintain, or transmit PHI?
  • Is a signed BAA in place between the AI vendor and this subprocessor?
  • What is the subprocessor's data retention policy?
  • Is the subprocessor SOC 2 Type II audited?
  • Where is data stored geographically?
  • What happens to data upon contract termination?
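The first two checklist questions lend themselves to a simple gap check over the vendor's subprocessor list. In this sketch, the `Subprocessor` fields and the example entries are hypothetical; a real inventory would also record retention, audit status, and data location.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subprocessor:
    name: str
    category: str       # e.g. "LLM inference", "object storage"
    handles_phi: bool   # creates, receives, maintains, or transmits PHI
    baa_in_place: bool  # signed BAA between the AI vendor and this party


def find_baa_gaps(stack: list) -> list:
    """Return subprocessors that touch PHI without a BAA in place."""
    return [s for s in stack if s.handles_phi and not s.baa_in_place]


# Hypothetical stack for illustration only.
example_stack = [
    Subprocessor("ExampleLLM", "LLM inference", handles_phi=True, baa_in_place=True),
    Subprocessor("ExampleLogs", "logging", handles_phi=True, baa_in_place=False),
    Subprocessor("ExamplePay", "payments", handles_phi=False, baa_in_place=False),
]
```

Here `find_baa_gaps(example_stack)` flags only the logging provider: the payment processor lacks a BAA but never sees PHI, so it is not a gap.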

What a Credible BAA Packet Looks Like

When you ask an AI document vendor for their "HIPAA compliance packet," you are testing two things: (1) whether they actually have one, and (2) whether it contains substance or just marketing reassurance. Here is what a complete packet includes and what each component should contain.

1. The BAA Itself

A signed Business Associate Agreement that includes:

  • Permitted uses and disclosures of PHI (specific to document processing, not a generic template)
  • Safeguards commitment: Administrative, physical, and technical safeguards the vendor implements
  • Breach notification terms: Timeframe for notification (HIPAA ceiling is 60 days; best practice is 24-72 hours), format of notification, cooperation obligations
  • Subprocessor requirements: Vendor's obligation to ensure subprocessors maintain equivalent safeguards
  • Data return and destruction: What happens to PHI when the contract ends. Destruction certificates are ideal.
  • Audit rights: Your right to audit or request evidence of compliance

Red flag: If the BAA is a one-page document that reads like a marketing disclaimer, it is not a real BAA. A substantive BAA is typically 5-15 pages and addresses specific operational scenarios.

2. Subprocessor List

A current, complete list of every third-party service that may handle PHI. For each subprocessor, the list should include:

  • Company name and service provided
  • What PHI they handle (storage, processing, transit)
  • Geographic location of data processing
  • BAA status (in place / not applicable)
  • SOC 2 or equivalent certification status

Red flag: "We don't share data with any third parties" is almost never true for a SaaS product. If a vendor cannot produce a subprocessor list, they either have not thought through their compliance chain or are not being transparent.

3. Data Retention and Destruction Policy

A written policy covering:

  • How long PHI is retained in production systems
  • Whether zero-retention or ephemeral processing is available
  • Backup retention periods and encryption standards
  • Data destruction procedures (secure delete, crypto-shredding, physical destruction)
  • Destruction certification upon request
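When reviewing a retention policy, it can help to see how a retention window translates into a purge decision. The sketch below is a generic illustration, not any vendor's implementation; `is_past_retention` and its parameters are hypothetical names, and a retention period of zero days models the zero-retention case.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_past_retention(
    stored_at: datetime,
    retention_days: int,
    now: Optional[datetime] = None,
) -> bool:
    """True when an item has been held for the full retention window.

    Both datetimes are expected to be timezone-aware. With retention_days=0
    every stored item is immediately eligible for destruction.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - stored_at >= timedelta(days=retention_days)
```

The same check should apply to backups and derived artifacts such as embeddings, which is why the policy needs to state backup retention windows separately.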

4. Audit Logging Documentation

Evidence that the vendor maintains comprehensive audit trails:

  • What events are logged (document access, user actions, configuration changes, admin access)
  • Log retention period
  • Whether logs are exportable for your own audit purposes
  • Tamper-evidence measures (append-only logging, hash chains, or WORM storage)
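To make the "hash chains" item concrete, here is a minimal sketch of a hash-chained audit log. The entry layout and the function names are assumptions for illustration; production systems typically add signing, timestamps from a trusted source, and external anchoring of the latest hash.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    # Canonical JSON (sorted keys) so the same event always hashes the same.
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(chain: list, event: dict) -> dict:
    """Append an event whose hash covers the previous entry's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {"event": event, "prev_hash": prev_hash, "hash": _digest(prev_hash, event)}
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Because each hash depends on everything before it, altering one historical entry invalidates all later entries. An attacker who can rewrite the whole log could still recompute the chain, so the latest hash should also be stored somewhere the application cannot modify, such as WORM storage.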

5. Incident Response Summary

A documented incident response plan covering:

  • How potential breaches are detected (monitoring, anomaly detection, employee reporting)
  • Internal escalation procedures
  • Customer notification process and timeline
  • Forensic investigation capabilities
  • Post-incident review and remediation process

6. Security Documentation

Supporting evidence of the vendor's security posture:

  • SOC 2 Type II report (or bridge letter if a new audit is in progress)
  • Penetration test summary (date, scope, critical findings status)
  • Encryption standards: AES-256 at rest, TLS 1.2+ in transit, key management approach
  • Access control documentation: How employee access to production systems is managed

If a vendor can produce all six components without hesitation, that is a strong signal. If they need weeks to assemble a packet, or respond with "we're working on it," factor that into your evaluation timeline.

Reference Stack: BAA-Available and HIPAA-Ready Services by Category

This section is not an endorsement of any specific vendor. It is a reference list of services known to offer BAAs or operate as HIPAA-ready services, commonly used in AI document processing stacks. Always verify current BAA availability and terms directly with the vendor, as policies and eligible services change.

LLM Inference

  • OpenAI: GPT-4o and successors. Offers a BAA for ChatGPT for Healthcare and qualifying API healthcare customers; eligibility and scope depend on the specific services and data-retention controls you configure. Zero data retention (ZDR) controls available. Strong general reasoning, but test accuracy on specialized medical terminology against your specific use case before committing. Consumer ChatGPT is not covered.
  • Anthropic: Claude models. Anthropic may provide a BAA for customers who qualify for its HIPAA-ready services (e.g., first-party API / Enterprise with required data-handling controls). Does not automatically apply to consumer chat experiences. Known for longer context windows, which is useful for multi-page clinical documents.
  • Google Vertex AI: Gemini models. BAA available as part of Google Cloud's HIPAA-covered services. Deep integration with Google Cloud ecosystem.
  • Self-hosted (Llama 3, Mistral, Qwen): No BAA needed. Requires GPU hardware (A100/H100 for production workloads). Open-source models are improving rapidly but may require fine-tuning for clinical vocabulary.

Object Storage

  • Backblaze B2: SOC 2 Type II, BAA available. Object Lock (WORM) for immutable audit trails. Competitively priced relative to hyperscaler storage. Independent of AWS/GCP/Azure, which can be an advantage for organizations wanting to avoid vendor concentration risk.
  • AWS S3: BAA available through AWS Business Associate Addendum. The default choice for organizations already in the AWS ecosystem. Comprehensive lifecycle policies and versioning.
  • Azure Blob Storage: BAA available through Microsoft. Integrates with Azure's broader healthcare compliance tooling (Azure Health Data Services).

Vector Databases

  • Zilliz Cloud (managed Milvus): SOC 2 Type II, BAA available for enterprise customers. Purpose-built for high-dimensional embeddings at scale.
  • Pinecone: SOC 2 Type II, BAA available on enterprise plan. Serverless and pod-based deployment options.
  • Self-hosted Milvus, Qdrant, or Weaviate: No BAA needed. Full control over data, but you manage the infrastructure.

Logging and Monitoring

  • Datadog: BAA available for HIPAA-eligible services on Enterprise plan. If you use Datadog for HIPAA workflows, confirm which specific Datadog services are covered under their HIPAA offering/BAA, as not all products may be included.
  • AWS CloudWatch: Covered under AWS BAA as a HIPAA-eligible service.
  • Self-hosted logging (ELK stack, Grafana Loki): No BAA needed, but you are responsible for securing the logs.

CDN / WAF / Edge Network

  • Cloudflare: Offers a BAA for HIPAA workloads; confirm plan eligibility and the exact covered services with Cloudflare Sales or your order form.
  • AWS CloudFront + WAF: Covered under AWS BAA as HIPAA-eligible services.

Infrastructure and Cloud

  • AWS: Broadest set of HIPAA-eligible services. AWS GovCloud for FedRAMP requirements.
  • Google Cloud: BAA covers 100+ services including Compute Engine, Cloud Run, and Vertex AI.
  • Microsoft Azure: BAA available. Azure Government for FedRAMP. Strong healthcare vertical with FHIR-compliant APIs.

When evaluating a vendor's stack, map each component to this reference list and verify BAA coverage. Gaps in the subprocessor chain represent compliance risk, regardless of how secure the primary vendor's application layer is.

De-Identification: When You Can Skip the BAA Entirely

Not every AI workflow involving healthcare documents requires a BAA. If data is properly de-identified before it reaches the AI tool, it is no longer PHI under HIPAA, and the standard Business Associate requirements do not apply.

HIPAA recognizes two methods of de-identification (45 CFR 164.514):

Safe Harbor Method

Remove all 18 categories of identifiers specified by HHS: names, geographic data smaller than a state, dates (except year) related to an individual, phone numbers, fax numbers, email addresses, Social Security numbers, medical record numbers, health plan beneficiary numbers, account numbers, certificate/license numbers, vehicle identifiers, device identifiers, URLs, IP addresses, biometric identifiers, full-face photos, and any other unique identifying number, characteristic, or code. Additionally, the covered entity must have no actual knowledge that the remaining information could identify an individual.

Expert Determination Method

A qualified statistical or scientific expert determines that the risk of identifying an individual from the data is "very small," and documents the methods and results supporting that determination.

Trade-Offs

  • Advantage: De-identified data can be processed by any AI tool without a BAA, opening up a wider range of (potentially cheaper, faster) tools
  • Disadvantage: De-identification itself must happen before the data reaches the non-BAA-covered tool, which means you need a HIPAA-compliant de-identification process upstream. The de-identification step itself involves PHI.
  • Risk: Inadequate de-identification (missing identifiers, re-identification through context) leaves you exposed. When in doubt, treat the data as PHI.
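As a rough illustration of what an upstream Safe Harbor-style step does to a structured record, the sketch below drops direct identifier fields and reduces dates to the year. Every field name here is hypothetical, the field sets do not cover all 18 identifier categories, and free-text fields such as clinical notes are not handled at all, so this is not a compliant de-identification process on its own.

```python
from datetime import date

# Hypothetical field names; a real schema needs its own mapping to the
# 18 Safe Harbor identifier categories.
DROP_FIELDS = {
    "name", "address", "phone", "fax", "email", "ssn", "mrn",
    # Dropped outright in this sketch: Safe Harbor has special rules for
    # ages over 89, so keeping even the birth year is not always allowed.
    "date_of_birth",
}

# Date fields that are kept, but only as a year, under a new key.
DATE_TO_YEAR = {
    "admission_date": "admission_year",
    "discharge_date": "discharge_year",
}


def strip_identifiers(record: dict) -> dict:
    """Return a copy without direct identifiers and with dates cut to year."""
    out = {}
    for key, value in record.items():
        if key in DROP_FIELDS:
            continue
        if key in DATE_TO_YEAR:
            out[DATE_TO_YEAR[key]] = value.year
        else:
            out[key] = value
    return out
```

Note that this step itself handles PHI, so it has to run inside your HIPAA-covered environment before anything is sent to a tool without a BAA.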

De-identification is a viable strategy for specific use cases: aggregate analytics, research datasets, training data preparation, and population health analysis. For workflows that require the AI tool to reason over complete patient records (clinical note summarization, prior authorization, claims adjudication), de-identification usually removes the information the AI needs. In those cases, a BAA-covered tool is the only compliant option.

Putting It Together: A Due Diligence Workflow

When evaluating an AI document processing vendor for HIPAA-covered workloads, work through this sequence:

  1. Confirm BAA availability. Can the vendor sign a BAA today, on your terms or mutually agreeable terms? If the BAA is "coming soon" or "available on enterprise plans only," factor that into your timeline and budget.
  2. Request the subprocessor list. Map every subprocessor to the checklist categories above. Identify gaps where a subprocessor handles PHI but does not have a BAA in place with the vendor.
  3. Evaluate the deployment model. Determine whether SaaS, private cloud, or on-prem fits your risk profile and operational capacity. Ask about data residency options.
  4. Review data retention and destruction. Confirm that retention policies align with your requirements. Ask about zero-retention or ephemeral processing options for sensitive workloads.
  5. Test with representative documents. Before signing anything, run documents that mirror your real document types (built from test or synthetic data, not actual PHI) through the platform. Evaluate extraction accuracy, processing speed, and output quality for your specific document types.
  6. Review audit logging. Confirm what events are captured, how long logs are retained, and whether you can export them for your own compliance records.
  7. Check incident response. Review the vendor's breach notification procedures. Confirm that notification timelines in the BAA align with your internal requirements.
  8. Consult legal counsel. Have your healthcare attorney review the BAA and any supplemental agreements before execution. HIPAA compliance is ultimately your organization's responsibility as the Covered Entity.

This process typically takes 2-6 weeks depending on vendor responsiveness and your internal review cycles. For organizations exploring private AI deployment for professional services, the same vendor due diligence framework applies.

Frequently Asked Questions

Does my AI document vendor need a BAA?

In nearly all cases, yes. Under HIPAA, any vendor that creates, receives, maintains, or transmits Protected Health Information (PHI) on behalf of a covered entity is a Business Associate. If your AI tool processes documents containing PHI, the vendor must sign a BAA before any PHI is shared. This applies regardless of whether the vendor stores data long-term or processes it transiently in memory. The only narrow exception is for services acting purely as a conduit for transmission (similar to an ISP), which does not apply to AI document processing tools.

Is there a HIPAA certification for AI tools?

No. There is no official HIPAA certification issued by HHS or any government body. Any vendor claiming to be "HIPAA certified" is using the term loosely. HIPAA compliance is self-attested and demonstrated through administrative, physical, and technical safeguards, documented policies, and willingness to sign a BAA. Look for SOC 2 Type II audits, penetration test reports, and a clear BAA as evidence of compliance posture.

What should a BAA with an AI vendor include?

A BAA should include: permitted uses and disclosures of PHI, safeguards the vendor will implement, breach notification obligations and timelines, subprocessor disclosure requirements, data retention and destruction terms, audit and access rights, and termination provisions including return or destruction of PHI. The BAA should also specify whether the vendor can de-identify data and under what conditions.

Can I use OpenAI or Anthropic APIs for HIPAA-compliant document processing?

It depends on the product and plan. OpenAI offers a BAA for ChatGPT for Healthcare and qualifying API healthcare customers; eligibility depends on the specific services and data-retention controls you use. Anthropic may provide a BAA for customers who qualify for its HIPAA-ready services (e.g., first-party API with required data-handling controls). Google offers a BAA for Vertex AI. In all cases, the BAA must be signed before PHI is sent through the API. Consumer-facing products (consumer ChatGPT, Claude free tier) are not covered and should never be used with PHI. Always confirm BAA availability for the specific product and plan you are using.

What is a subprocessor in the context of HIPAA?

A subprocessor is a third-party service that your AI vendor uses to fulfill its obligations (HIPAA itself uses the term "subcontractor"). For example, if your vendor uses AWS for hosting, Backblaze for storage, and OpenAI for inference, each of those is a subprocessor. Under HIPAA, a subcontractor that creates, receives, maintains, or transmits PHI on behalf of a Business Associate is itself a Business Associate, so the primary Business Associate must have a BAA in place with each one that handles PHI. You should request a complete subprocessor list from any vendor. For more on how data flows through AI document platforms, see our data lifecycle documentation.

Is cloud-hosted AI ever HIPAA compliant?

Yes. Cloud hosting does not disqualify a tool from HIPAA compliance. What matters is the implementation: encrypted data at rest and in transit, access controls, audit logging, BAAs with the cloud provider (AWS, GCP, Azure, and others all offer BAAs), and proper data retention policies. Many healthcare organizations use cloud-hosted tools with appropriate safeguards. On-premises deployment is not required by HIPAA, though some organizations prefer it for additional control.

How long can an AI vendor retain PHI?

HIPAA does not prescribe a specific retention period for Business Associates. Retention terms are defined in the BAA. Best practice is to retain PHI only as long as necessary to fulfill the purpose for which it was provided, then securely destroy it. Some vendors offer zero-retention processing where documents are analyzed in memory and never written to persistent storage. Your BAA should explicitly state retention periods and destruction procedures.

What happens if my AI vendor has a data breach involving PHI?

Under HIPAA's Breach Notification Rule (45 CFR 164.400-414), a Business Associate must notify the Covered Entity without unreasonable delay and no later than 60 calendar days after discovery of a breach of unsecured PHI. The Covered Entity then must notify affected individuals and HHS, and must also notify prominent media outlets when a breach affects more than 500 residents of a state or jurisdiction. Your BAA should specify breach notification timelines, incident response procedures, and cooperation requirements. Penalties for non-compliance are inflation-adjusted annually and can exceed $2 million per violation category per year; check the current HHS penalty schedule.
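The interaction between the 60-day regulatory limit and a shorter contractual window can be sketched as simple date arithmetic. The function name and the `baa_window` parameter are hypothetical, and the sketch assumes you already know the discovery time; the separate "without unreasonable delay" standard still applies and is not modeled here.

```python
from datetime import datetime, timedelta
from typing import Optional

# Outer limit for Business Associate notice to the Covered Entity.
HIPAA_OUTER_LIMIT = timedelta(days=60)


def notification_deadline(
    discovered_at: datetime,
    baa_window: Optional[timedelta] = None,
) -> datetime:
    """Latest time the vendor may notify you, given an optional BAA window.

    A BAA can shorten the regulatory limit but cannot extend it, so the
    effective window is the smaller of the two.
    """
    if baa_window is None:
        window = HIPAA_OUTER_LIMIT
    else:
        window = min(baa_window, HIPAA_OUTER_LIMIT)
    return discovered_at + window
```

For a breach discovered on January 10, 2026 at 09:00, the regulatory limit falls on March 11, 2026, while a BAA with a 72-hour clause moves the deadline to January 13.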

What Zedly Will Sign, and What We Will Not Do

Healthcare buyers want explicit boundaries, not marketing language. Here is where Zedly stands:

What we will do

  • Sign a BAA for qualifying plans (Business and Enterprise tiers)
  • Provide a current subprocessor list with BAA status for each service that handles PHI
  • Support PHI-safe notification defaults: No PHI in email. Generic alerts with a secure link to the platform.
  • Offer zero-retention and reduced-retention modes. Our Active Desk processes documents in ephemeral memory and does not write PHI to persistent storage unless you explicitly store it to Vault. Vault storage uses AES-256 encryption at rest with configurable retention periods.
  • Provide deployment flexibility: SaaS with BAA, private cloud / VPC deployment, or air-gapped on-premises with local LLM inference (Llama 3/Mistral)
  • Export audit logs for your own compliance records and auditors

What we will not do

  • We do not train models on customer data. Your documents are never used for model training, fine-tuning, or improvement of any kind.
  • We do not share PHI across workspaces. Each organization's data is isolated. There is no cross-tenant data access.
  • We do not claim "HIPAA certified." That designation does not exist. We demonstrate compliance through documented safeguards, SOC 2 Type II-audited subprocessors, and a BAA we are willing to sign.

For healthcare organizations evaluating deployment options, we also provide medical document categorization, clinical note processing, and specialized embedding models optimized for medical terminology.
