Where Your Data Actually Goes
You're about to upload a file with employee salary data. Or customer emails. Or revenue numbers your board hasn't seen yet. Where does this actually go?
That's a reasonable question. Most AI tools answer it with a trust badge and a link to a privacy policy nobody reads. Here's the actual answer, in plain English, for heyanna.
The full journey of your file
Your file touches four systems. Here's each one, what it does, and what happens to your data at every step.
1. Upload: Cloudflare R2
When you drag a CSV into heyanna, it goes to Cloudflare R2 — an object storage service. Think of it as a secure filing cabinet in the cloud. The file sits there so Anna can reference it during your analysis.
R2 is Cloudflare's infrastructure. No data leaves Cloudflare's network for storage. No third-party storage providers. Your file is stored encrypted at rest with AES-256.
2. Metadata and structure: Cloudflare D1
The database that tracks your datasets, conversations, and reports runs on Cloudflare D1 — a SQLite-based database that runs at the edge. It stores metadata: file names, column types, row counts, your conversation history.
Your actual data values — the numbers in the cells, the text in the rows — are in R2. D1 knows the shape of your data, not the contents.
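As a rough sketch of that split (heyanna's actual schema isn't public, so the field names here are illustrative assumptions), a shape-only record for an uploaded CSV might look like this — note that no cell values survive into it:

```python
import csv
import io

def build_metadata_record(filename: str, raw_csv: str) -> dict:
    """Build the kind of shape-only record D1 might store: file name,
    inferred column types, and a row count -- none of the cell values."""
    rows = list(csv.reader(io.StringIO(raw_csv)))
    header, body = rows[0], rows[1:]

    def infer_type(values):
        # Crude inference: "number" if every non-empty value parses as a float.
        try:
            [float(v) for v in values if v != ""]
            return "number"
        except ValueError:
            return "text"

    return {
        "file_name": filename,
        "row_count": len(body),
        "columns": [
            {"name": name, "type": infer_type([r[i] for r in body])}
            for i, name in enumerate(header)
        ],
    }

record = build_metadata_record(
    "salaries.csv",
    "employee,salary\nAda,120000\nGrace,135000\n",
)
# The record describes shape only; "Ada" and 120000 stay in the raw file.
```

The point of the sketch: the database can answer "what does this dataset look like?" without ever being able to answer "what's in it?"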
3. Analysis: your browser (Pyodide)
This is the part that surprises people.
When Anna runs Python analysis — statistical tests, data transformations, chart generation — that code executes in your browser. Not on a server. Not in the cloud. In a WebAssembly sandbox running Pyodide (a full Python environment compiled to run in the browser).
Your data gets pulled into the browser tab, the Python runs locally, and the results render on screen. The computation happens on your machine. heyanna's servers never see the intermediate calculations, the transformed datasets, or the generated charts. They're produced and rendered client-side.
This is a deliberate architectural decision, not a limitation. Browser-based execution means your data doesn't travel to a compute server for processing. It stays in the tab.
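Because Pyodide is a standard CPython environment compiled to WebAssembly, the analysis code looks like ordinary Python. A minimal sketch of the kind of script that might run in the tab (the data and variable names are illustrative, not heyanna's actual generated code):

```python
from statistics import mean, stdev

# Hypothetical column already loaded into the browser tab from the dataset.
salaries = [72000, 81000, 65000, 90000, 78000]

# Computed entirely inside the WebAssembly sandbox; no server ever
# receives the raw values or these intermediate results.
summary = {
    "n": len(salaries),
    "mean": mean(salaries),
    "stdev": round(stdev(salaries), 2),
}
```

Only the rendered result (and whatever summary Anna needs for her next reasoning step) leaves the sandbox.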
4. AI reasoning: Anthropic's Claude API
When Anna thinks about your data — interpreting patterns, deciding which statistical test to run, writing the narrative for your report — she uses Claude, Anthropic's large language model.
This means: yes, parts of your data are sent to Anthropic's API. Specifically, the parts Anna needs to reason about — column names, summary statistics, sample values, and the results of her analysis.
Here's what matters:
Anthropic does not train on API data. Anthropic's API terms prohibit using customer data for model training. Data sent through the API is used to generate a response and is not used to train or improve their models. This applies to all API customers, including heyanna.
Anna doesn't send your entire file. She sends what she needs for the current reasoning step — typically column metadata, aggregated statistics, and small samples. The full dataset stays in R2 and in your browser.
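A sketch of that minimization step, under the assumption (the function and field names are ours, not heyanna's) that a compact context object is derived from the dataset before anything is sent to the API:

```python
import csv
import io
from statistics import mean

def build_reasoning_context(raw_csv: str, sample_rows: int = 3) -> dict:
    """Reduce a full dataset to what a single reasoning step needs:
    column names, aggregates for numeric columns, and a tiny sample."""
    rows = list(csv.reader(io.StringIO(raw_csv)))
    header, body = rows[0], rows[1:]

    aggregates = {}
    for i, name in enumerate(header):
        try:
            values = [float(r[i]) for r in body]
        except ValueError:
            continue  # non-numeric column: send the name only, no values
        aggregates[name] = {
            "min": min(values),
            "max": max(values),
            "mean": mean(values),
        }

    return {
        "columns": header,
        "row_count": len(body),
        "aggregates": aggregates,
        "sample": body[:sample_rows],
    }

ctx = build_reasoning_context(
    "name,revenue\nA,10\nB,20\nC,30\nD,40\n",
    sample_rows=2,
)
# Thousands of rows in, a handful of values out -- the rest stays local.
```

Whatever the real implementation looks like, this is the shape of the trade: the model gets enough structure to reason with, while the bulk of the rows never cross the wire.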
If you're evaluating heyanna for sensitive data, the key question is whether Anthropic's API data handling meets your requirements. Their data processing terms are public and contractual, not just policy.
What heyanna does not do
Sometimes the clearest way to explain a security posture is to list what doesn't happen.
- No training on your data. Not by heyanna. Not by Anthropic. Not by anyone.
- No sharing with third parties. Your data is not sold, syndicated, or used for benchmarking.
- No persistent server-side copies of analysis results. The Python execution happens in your browser and the results stay there unless you save a report.
- No cross-user data access. Your datasets, conversations, and reports are scoped to your account. There is no shared data layer between users.
What about deletion?
When you delete a dataset in heyanna, two things happen:
- The file is removed from R2 (object storage)
- The metadata is removed from D1 (database)
Deletion is immediate and permanent. There's no "soft delete" period, no 30-day retention window, no backup you can't control.
If you're subject to GDPR or have specific data residency requirements, reach out — Cloudflare R2 supports regional storage.
The SOC 2 question
If you work in HR, finance, healthcare, or any regulated industry, you're probably looking for a SOC 2 Type II badge. Fair.
heyanna does not have SOC 2 certification yet. That's the honest answer. It's on the roadmap — infrastructure choices (Cloudflare's platform, browser-based computation, Anthropic's API terms) were made with this path in mind. But the audit hasn't happened yet.
If SOC 2 is a hard requirement today, email [email protected] and we'll talk through your needs and heyanna's SOC 2 timeline. If your security review is based on understanding the actual data flow and infrastructure rather than a compliance badge, this post gives you the full picture.
The infrastructure stack, summarized
| Layer | Technology | What it handles | Where it runs |
|---|---|---|---|
| File storage | Cloudflare R2 | Raw uploaded files | Cloudflare's network |
| Database | Cloudflare D1 | Metadata, conversations, reports | Cloudflare edge |
| Computation | Pyodide (WebAssembly) | Python analysis, charts, transforms | Your browser |
| AI reasoning | Anthropic Claude API | Pattern interpretation, narrative | Anthropic's infrastructure |
| Application | Cloudflare Workers | API, auth, routing | Cloudflare edge |
If your security team needs a more detailed technical review, reach out directly. We're happy to walk through the architecture, data flow, and Anthropic's contractual terms in detail.
Your data goes to Cloudflare for storage, your browser for computation, and Anthropic for reasoning. Nobody trains on it. Delete it any time — it's gone.
Security questions your team needs answered beyond this? [email protected]