
Why Sending Documents to ChatGPT May Breach GDPR

SumoScan Team · May 2026 · 7 min read

It happens dozens of times a day in organisations across Europe. A lawyer needs a quick translation of a client contract. An HR manager wants to redact a CV before sharing it. A compliance officer needs to summarise a regulatory document.

They open ChatGPT, paste the document, and get the result in seconds.

It feels harmless. It is almost certainly not.

This article explains — in plain terms — why uploading business documents to ChatGPT and similar consumer AI tools creates serious GDPR exposure for your organisation, and what a compliant alternative looks like.


The Core Problem: Your Organisation Is the Data Controller

When an employee uploads a document to ChatGPT, your organisation does not stop being responsible for the personal data in that document. Under GDPR, the organisation that collects and determines how personal data is used is the data controller — and that responsibility does not transfer to OpenAI just because your employee used their tool.

This means:

  • Your organisation remains liable for how that data is processed
  • You need a lawful basis for sending that data to a third party
  • You need a Data Processing Agreement (DPA) with that third party
  • You need to be able to demonstrate compliance if a regulator asks

Most organisations using consumer ChatGPT for document work cannot satisfy any of these requirements.


Problem 1: No Data Processing Agreement for Free and Plus Users

GDPR Article 28 is unambiguous: if you use a third party to process personal data on your behalf, you must have a written Data Processing Agreement in place before any processing begins.

OpenAI provides a DPA — but only for ChatGPT Enterprise, ChatGPT Business, and API customers. The free consumer version and ChatGPT Plus subscription do not come with a DPA.

Using ChatGPT in a corporate environment is only data-protection compliant if no personal data is processed — or if a valid Data Processing Agreement is in place. Without one, there is no clear contractual basis for processing under GDPR.

The practical reality is that the vast majority of business documents contain personal data. Names appear in email chains. Phone numbers sit in contact details. IBANs appear in invoices. Passport numbers appear in HR files. Client addresses appear in contracts.

If your staff are uploading any of these to consumer ChatGPT without a signed DPA, your organisation is processing personal data without the required contractual safeguards. That is a GDPR violation.
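To see how easily personal data slips into everyday documents, consider a minimal pre-screen. This is an illustrative sketch only — a handful of regular expressions in Python, not SumoScan's detection logic, and far short of what real PII detection requires — but even patterns this crude catch obvious identifiers in routine business text:

```python
import re

# Illustrative patterns only -- real PII detection needs far more than
# regexes, but even these catch obvious identifiers in business text.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    # IBAN: two-letter country code, two check digits, 11-30 alphanumerics
    "iban": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
    # International phone numbers written with a leading + prefix
    "phone": re.compile(r"\+\d{1,3}[\s-]?\d{2,4}([\s-]?\d{2,4}){2,3}"),
}

def find_pii(text: str) -> dict[str, list[str]]:
    """Return every match per category; an empty dict means nothing obvious found."""
    hits = {}
    for label, pattern in PATTERNS.items():
        matches = [m.group(0) for m in pattern.finditer(text)]
        if matches:
            hits[label] = matches
    return hits

sample = "Invoice 114: pay IE29AIBK93115212345678, queries to j.murphy@example.com"
print(find_pii(sample))  # flags both the IBAN and the email address
```

If a throwaway script can flag personal data in a typical invoice, assuming "my documents don't contain PII" is not a defensible position.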


Problem 2: Data May Be Used to Train AI Models

For users of the free consumer version of ChatGPT, uploaded content may be used to improve OpenAI's models — depending on account settings.

For Free and Plus account holders, chats are stored until the user deletes them; deleted chats are then scheduled for permanent removal from OpenAI's systems within 30 days. If "Chat History and Training" is not disabled, uploaded content may be used for model training.

For a law firm uploading client contracts, an HR team uploading personnel files, or a compliance team uploading regulatory submissions, the prospect of that content being used to train a commercial AI model is not just a GDPR concern — it is a fundamental breach of confidentiality obligations to clients and employees.

Even when training is disabled, a copy is retained for up to 30 days for abuse and misuse monitoring. For documents containing special category data — health records, financial details, legal privilege material — even temporary retention on servers outside your control creates significant compliance risk.


Problem 3: Data Leaves the EU

OpenAI is a US-based company. When you upload a document to ChatGPT, that data is processed on servers in the United States.

Under GDPR Chapter V, transferring personal data outside the European Economic Area requires specific legal mechanisms — typically Standard Contractual Clauses (SCCs) plus a Transfer Impact Assessment. This requirement applies regardless of whether the transfer is intentional or incidental.

OpenAI references Standard Contractual Clauses for international transfers, but following Schrems II, organisations must still assess whether supplementary measures are sufficient for their specific use case. US surveillance laws, notably FISA Section 702 and the CLOUD Act, can give US authorities access to data held by American companies.

For organisations in regulated industries — legal, healthcare, finance, public administration — the inability to guarantee that client documents remain within EU jurisdiction is not a theoretical concern. It is a practical compliance failure that supervisory authorities are increasingly focused on.


Problem 4: No Audit Trail

When a document is processed by ChatGPT, there is no record of what was done to it, what data was contained in it, or where that data went.

For a DPO trying to demonstrate GDPR compliance to a supervisory authority, or a legal team responding to a data subject access request, this absence of documentation is itself a problem.

GDPR requires organisations to be able to demonstrate compliance — not just claim it. If you cannot show an auditor what happened to a document containing personal data, when it was processed, and by whom, you are not meeting the accountability principle under GDPR Article 5(2).

This is not an abstract concern. The ICO and EU Data Protection Authorities are paying attention to AI. The Italian DPA temporarily banned ChatGPT in 2023 over GDPR concerns. In February 2026, the ICO fined MediaLab.AI £247,590 for processing children's data without a DPIA. Regulators are not waiting for complaints — they are proactively investigating AI systems that process personal data.


Problem 5: Your Organisation Is Liable for Employee Use

Many organisations assume that if an employee uses a personal ChatGPT account on a work device, the organisation is not responsible for what they upload.

This is incorrect.

Under GDPR, the data controller — your organisation — is responsible for all personal data processing, including processing carried out by employees acting within the scope of their employment.

This means that an employee translating a client contract on their personal ChatGPT account, during work hours, for work purposes, creates GDPR liability for the organisation — not just for the individual.

Without clear policies, training, and technical controls governing how staff use AI tools with business documents, organisations are exposed to liability they may not even be aware of.


What About ChatGPT Enterprise?

It is worth being fair here. ChatGPT Enterprise, when properly configured, addresses several of these concerns:

  • A DPA is available and can be signed
  • Data is not used for model training
  • Zero data retention can be enabled
  • SOC 2 Type 2 compliance is maintained

Using the ChatGPT API with proper configuration — DPA signed, zero-retention enabled, EU data residency considered — is a different matter entirely and can be made GDPR compliant.

However, even with Enterprise, organisations face residual risks:

  • Data residency: Enterprise does not guarantee EU-only processing by default
  • Generative outputs: ChatGPT can still hallucinate — inventing names, misquoting figures, or introducing errors into translated documents
  • No Redaction Certificate: There is no verifiable audit record of what PII was detected and removed
  • Cost and complexity: Enterprise configuration requires technical expertise and ongoing governance

For organisations that need to demonstrate EU AI Act compliance alongside GDPR compliance, the additional documentation and governance burden of configuring a US-based enterprise AI tool to meet EU requirements is substantial.


The Special Category Problem

GDPR Article 9 imposes stricter rules on the processing of special category data — health information, ethnic origin, religious beliefs, biometric data, trade union membership, and similar sensitive categories.

Business documents frequently contain this data in ways that are not immediately obvious:

  • A CV may mention a disability or medical condition
  • An HR file may contain sick leave records
  • A legal document may reference a client's nationality
  • An insurance claim may contain medical details
  • A contract may identify an individual's religious affiliation

Special category data requires explicit consent or another specific Article 9(2) condition, and the highest-risk document types should never be shared with AI tools without full de-identification.

For organisations in healthcare, HR, legal services, and financial services, the risk of inadvertently uploading special category data to a consumer AI tool is not hypothetical — it is routine.


What GDPR-Compliant Document AI Looks Like

Compliant document processing is not about avoiding AI — it is about using AI in a way that satisfies your obligations as a data controller.

The practical requirements are:

1. EU-hosted infrastructure

Document processing must happen within EU jurisdiction. This eliminates the cross-border transfer issue entirely and removes the need for SCCs and Transfer Impact Assessments.

2. Zero data retention

Documents must not be stored after processing is complete. In-memory processing that clears immediately on completion is the cleanest approach — there is no retained data to worry about.

3. No model training

Documents uploaded for processing must never be used to improve AI models. This must be contractually guaranteed in a DPA, not just a settings toggle.

4. A Data Processing Agreement

A valid GDPR Article 28 DPA must be in place before any document processing begins. This is not optional — it is a legal requirement.

5. An audit trail per job

Every document processing job should generate a verifiable record of what was processed, what was found, and what was done. For PII redaction specifically, a timestamped Redaction Certificate is the standard your auditors and legal counsel will expect.
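To make the requirement concrete, a per-job audit record might capture the essentials like this. This is a hypothetical sketch of the general idea, not SumoScan's actual certificate format; the field names and the `make_redaction_record` helper are invented for illustration:

```python
import hashlib
import json
from datetime import datetime, timezone

def make_redaction_record(document_bytes: bytes,
                          findings: dict[str, int],
                          operator: str) -> str:
    """Build a tamper-evident audit record for one redaction job.

    The document itself is never stored -- only a SHA-256 hash, so the
    record can later prove *which* file was processed without retaining
    any content. `findings` maps PII category -> items redacted.
    """
    record = {
        "processed_at": datetime.now(timezone.utc).isoformat(),
        "document_sha256": hashlib.sha256(document_bytes).hexdigest(),
        "pii_found": findings,
        "operator": operator,
    }
    # Sorted keys give a canonical serialisation, so the record itself
    # can be hashed or signed for integrity later.
    return json.dumps(record, sort_keys=True)

cert = make_redaction_record(b"...contract text...",
                             {"name": 4, "iban": 1},
                             "dpo@example.com")
print(cert)
```

Even a record this simple answers the three questions an auditor will ask: what was processed, when, and what was found. A production certificate would add a cryptographic signature so the record cannot be altered after the fact.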

6. Deterministic outputs

For legal and compliance workflows, the processing must be predictable and repeatable. Generative AI that may hallucinate has no place in document workflows where accuracy is non-negotiable.


A Practical Checklist for DPOs

Before approving any AI document processing tool for use in your organisation, ask these questions:

  • Is there a signed GDPR Article 28 DPA in place?
  • Does processing happen entirely within EU infrastructure?
  • Is there a contractual guarantee of zero data retention?
  • Is the tool's AI non-generative — i.e. deterministic and hallucination-free?
  • Does every processing job generate an audit trail or Redaction Certificate?
  • Have you recorded this processing activity in your ROPA?
  • Have you completed a DPIA if the processing poses high risk?
  • Are staff trained on which document types must not be uploaded to consumer AI tools?

If you cannot answer yes to all of these questions for every AI tool your organisation uses with documents, you have compliance gaps that need to be addressed before August 2026.


Summary

Sending business documents to consumer ChatGPT creates multiple GDPR risks that most organisations are not adequately managing:

  • No DPA for free and Plus accounts
  • Potential model training on uploaded content
  • Data processed on US servers outside EU jurisdiction
  • No audit trail of document processing
  • Organisation liability for employee use
  • Heightened risk for special category data

The solution is not to avoid AI — it is to use AI tools that were built with GDPR compliance as a foundation, not as an afterthought.


SumoScan processes documents entirely within EU infrastructure in Cork, Ireland. Zero data retention. No model training. A signed Redaction Certificate on every PII redaction job. Data Processing Agreements available for enterprise teams.

Start free at sumoscan.ai · View our Trust Centre · Talk to us about Enterprise
