TL;DR: I built an AI-powered data analyst that runs entirely in the browser. The AI generates SQL from your schema (column names and types) — it never sees your actual data. DuckDB WebAssembly handles the queries locally. Here's why I built it, the architecture behind it, and what I learned.
The moment that started this
I was working as a data analyst at a fintech company. One afternoon, a product manager pinged me: "Can you check how many users in segment X hit this edge case last month?"
Simple question. Five-minute answer if I could just run a query.
The data was in a CSV export from our analytics platform. It had user IDs, transaction amounts, email addresses. I needed to aggregate it. So I did what I always did — opened a Jupyter notebook, loaded it with pandas, wrote the query.
Then I watched a colleague do the same thing differently. They dragged the CSV into ChatGPT and typed "how many users in segment X hit this edge case last month?"
Same answer. Ten seconds instead of five minutes. And every row of customer financial data was now on OpenAI's servers.
They weren't careless. They were just doing the fastest thing available. And that's the problem: the fastest path and the safest path are almost never the same tool.
I started wondering — do they have to be different?
The schema-only insight
The breakthrough (which in retrospect seems obvious) was this: AI doesn't need your data to analyze it. It needs your schema.
When you ask "What's the average order value by region?", the AI needs to know:
- There's a column called order_value (type: decimal)
- There's a column called region (type: varchar)
- The table is called orders
From that, any competent language model generates:
```sql
SELECT region, AVG(order_value) AS avg_order_value
FROM orders
GROUP BY region
ORDER BY avg_order_value DESC;
```
The SQL runs locally. The AI never sees that the West region averages $142.50 or that Jane Smith placed order #48291. It wrote the question. Your machine answered it.
This works because most analytical questions on structured data are translatable to SQL from schema alone. You don't need GPT-4 to read your spreadsheet. You need it to translate English into SQL — a task that requires knowing column names and types, not row values.
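In code, the prompt construction is roughly this shape. This is a minimal sketch; ColumnSchema and buildPrompt are illustrative names, not QueryVeil's actual API:

```typescript
// Sketch: building the AI prompt from schema metadata only.
// ColumnSchema and buildPrompt are illustrative names, not QueryVeil's API.

interface ColumnSchema {
  name: string; // e.g. "order_value"
  type: string; // e.g. "DECIMAL"
}

function buildPrompt(
  table: string,
  columns: ColumnSchema[],
  question: string
): string {
  // Only column names and types appear in the prompt -- never row values.
  const schemaLines = columns
    .map((c) => `  ${c.name} ${c.type}`)
    .join("\n");
  return [
    "You translate questions into DuckDB SQL.",
    `Table "${table}" has columns:`,
    schemaLines,
    `Question: ${question}`,
    "Reply with a single SELECT statement.",
  ].join("\n");
}

const prompt = buildPrompt(
  "orders",
  [
    { name: "order_value", type: "DECIMAL" },
    { name: "region", type: "VARCHAR" },
  ],
  "What's the average order value by region?"
);
```

The point of the sketch is what is absent: no row iteration, no values, nothing derived from the data itself ever enters the string that leaves the machine.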
The architecture
Here's what QueryVeil actually looks like under the hood:
```
              +--------------------------+
              |     Your Browser Tab     |
              |                          |
 File API     |   +------------------+   |
(no upload) ----->|   DuckDB-WASM    |   |
              |   |   (SQL engine)   |   |
              |   +--------+---------+   |
              |            |             |
              |    schema  |  query      |
              |    extract |  results    |
              |            v             |
              |   +------------------+   |
              |   |  Query Results   |   |
              |   |    & Charts      |   |
              |   +------------------+   |
              +------------+-------------+
                           |
            schema only    |  (column names,
            (no rows)      |   types, table name)
                           v
              +--------------------------+
              |         AI Model         |
              |      (generates SQL)     |
              +--------------------------+
```
Three layers, each with a clear trust boundary:
Layer 1: Data engine (your browser). DuckDB compiled to WebAssembly runs inside the browser sandbox. Files load from disk via the File API — the standard browser mechanism for reading local files. No network request. The data exists in browser memory and nowhere else.
Layer 2: Schema extraction (your browser). After loading the file, we run DESCRIBE table_name and SELECT column_name, data_type FROM information_schema.columns to get the schema. Column names, data types, nullable flags. That's all that leaves the browser.
Layer 3: AI query generation (cloud or local). The schema plus your natural language question goes to an LLM. The model returns SQL. That SQL goes back to Layer 1 and runs locally. Results render in the browser.
If you use the free tier with WebLLM or Ollama, even Layer 3 stays on your machine. The paid tier routes through OpenRouter to models like Claude, GPT-4, or Gemini — but still only sends the schema.
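The trust boundary can be made explicit right in the function signature. A minimal sketch (names and shapes are mine, not the real implementation): the LLM callback is typed to receive only schema text, and execution stays in an injected local runner.

```typescript
// Sketch: the three-layer flow with the trust boundary visible in the types.
// Names are illustrative. The llm callback (Layer 3) receives schema text
// only; runSql (Layer 1) stands in for local DuckDB-WASM execution.

type Schema = { table: string; columns: { name: string; type: string }[] };

async function answerQuestion(
  schema: Schema, // Layer 2 output: metadata only
  question: string,
  llm: (schemaText: string, q: string) => Promise<string>,
  runSql: (sql: string) => Promise<unknown[]>
): Promise<unknown[]> {
  const schemaText = schema.columns
    .map((c) => `${c.name} ${c.type}`)
    .join(", ");
  // Only this string and the question ever cross the network (cloud tier).
  const sql = await llm(`table ${schema.table}: ${schemaText}`, question);
  // The generated SQL comes back and runs locally; results stay local.
  return runSql(sql);
}
```

Swapping the `llm` argument is also how the local tiers work: the same function can take a WebLLM-backed or Ollama-backed callback without touching the rest of the pipeline.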
What I got wrong at first
The first version was bad. Here's what I learned:
Schema alone isn't always enough context. For some questions, the AI generates ambiguous SQL because column names are cryptic. col_a, val_1, status_code — these don't give the model much to work with. The fix was auto-profiling: we run basic stats (distinct values for low-cardinality columns, min/max for numerics, sample value formats for strings) and include a compressed summary in the prompt. This is metadata about the data, not the data itself.
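A sketch of what that compressed summary can look like. The low-cardinality threshold and the output format here are my illustration, not the exact prompt QueryVeil builds:

```typescript
// Sketch: compress per-column stats into prompt metadata.
// The threshold (20) and the summary format are illustrative assumptions.

interface ColumnProfile {
  name: string;
  kind: "numeric" | "text";
  distinct: number;
  min?: number;
  max?: number;
  values?: string[]; // distinct values, kept only for low-cardinality columns
}

function profileSummary(p: ColumnProfile, lowCardinality = 20): string {
  if (p.kind === "numeric") {
    // min/max helps the model interpret cryptic names like val_1.
    return `${p.name}: numeric, range ${p.min}..${p.max}`;
  }
  if (p.values && p.distinct <= lowCardinality) {
    // Listing distinct categories lets the model write correct WHERE clauses.
    return `${p.name}: text, values {${p.values.join(", ")}}`;
  }
  return `${p.name}: text, ${p.distinct} distinct values`;
}
```

One line per column is usually enough; the summary stays metadata-sized even for wide tables.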
Multi-step analysis needs state. A single question rarely gets you the answer. "Show me churn by segment" leads to "Why is enterprise churn higher?" which leads to "When did this trend start?" An AI agent that chains queries together — using each result to inform the next question — is dramatically more useful than a one-shot SQL generator. We built this as a LangGraph agent that can plan multiple queries, inspect intermediate results, and adjust its approach.
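The loop itself is simple even if the production LangGraph agent is not. A stripped-down sketch (names are mine; how much of each result summary the planner is allowed to see is a separate policy decision, left abstract here):

```typescript
// Sketch: a multi-step analysis loop. The real agent is built on LangGraph;
// this only shows the shape: plan a query, run it locally, feed a summary
// of the result into the next planning step.

interface Step {
  sql?: string;  // next query to run, if any
  done: boolean; // planner decided it has enough to answer
}

async function investigate(
  question: string,
  plan: (q: string, history: string[]) => Promise<Step>,
  runSql: (sql: string) => Promise<string>, // returns a result *summary*
  maxSteps = 5
): Promise<string[]> {
  const history: string[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const step = await plan(question, history);
    if (step.done || !step.sql) break;
    history.push(await runSql(step.sql));
  }
  return history;
}
```

The maxSteps cap matters in practice: an agent that can always ask one more question needs a budget, or a vague prompt turns into an unbounded query storm.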
DuckDB-WASM has real limits. It runs in a browser tab, which means it shares memory with everything else. Files over ~500MB start pushing browser memory limits. Parquet files perform better than CSVs at scale because DuckDB can read them column-by-column. We added clear error messages when you hit the wall instead of just crashing the tab.
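Rather than letting the tab die, a guard like this can run before the file is registered. The 500 MB threshold mirrors the rough limit described above; treat it as a tunable assumption, not a DuckDB-WASM constant:

```typescript
// Sketch: pre-load guard against browser memory limits.
// SOFT_LIMIT_BYTES is a tunable assumption, not a DuckDB-WASM constant.

const SOFT_LIMIT_BYTES = 500 * 1024 * 1024;

function preloadWarning(name: string, sizeBytes: number): string | null {
  if (sizeBytes <= SOFT_LIMIT_BYTES) return null; // fine to load
  if (name.toLowerCase().endsWith(".csv")) {
    // Parquet is read column-by-column, so large files fare better.
    return "This CSV is large for in-browser analysis; converting it to Parquet usually helps.";
  }
  return "This file may exceed what the browser tab can hold in memory.";
}
```

A message like this, shown before loading, is a much better failure mode than an out-of-memory crash halfway through a query.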
People don't trust "local" claims. And they shouldn't, blindly. We made the network tab verifiable — load a file in QueryVeil, open Chrome DevTools, switch to the Network tab, and confirm: no requests carrying your data. If you can't verify it, you shouldn't believe it.
The local AI option
The cloud AI model is convenient, but some users can't send even schema to an external API. Government contractors. Healthcare analysts. Anyone with a strict data classification policy where column names themselves are considered sensitive (e.g., a column called ssn or diagnosis_code reveals something about the data).
For these users, we added two local AI options:
- WebLLM: Runs a small language model (like Llama or Phi) directly in the browser via WebGPU. No server, no API, no network. The tradeoff is model quality — smaller models make more SQL mistakes.
- Ollama: Runs larger models locally on your machine. Better quality than WebLLM, but requires installing Ollama separately.
Both options mean the entire pipeline — data, schema, AI, queries, results — stays on your hardware. Fully air-gapped analysis is possible.
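Choosing between the three backends reduces to a small decision function. A sketch (the flag names are mine; the product's actual tiering logic may differ):

```typescript
// Sketch: routing between AI backends. Flag names are illustrative.

type Provider = "webllm" | "ollama" | "cloud";

function pickProvider(opts: {
  schemaMayLeave: boolean;  // false under strict / air-gapped policies
  ollamaInstalled: boolean;
  paidTier: boolean;
}): Provider {
  if (!opts.schemaMayLeave) {
    // Prefer Ollama: larger local models make fewer SQL mistakes than WebLLM.
    return opts.ollamaInstalled ? "ollama" : "webllm";
  }
  if (opts.paidTier) return "cloud"; // best quality; still schema-only
  return opts.ollamaInstalled ? "ollama" : "webllm";
}
```

Note that `schemaMayLeave` overrides everything else, including the paid tier: policy beats quality, which is the whole point of offering local options.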
What I'd tell other builders
If you're building tools that touch user data, here's what I took away:
- Question the default architecture. "Upload to server, process, return results" is the path of least resistance for developers. It's not always necessary. Ask whether the computation can move to the client.
- Separate what the AI needs from what it doesn't. For structured data analysis, the AI needs the question and the schema. It does not need the data. Design your prompts around this boundary.
- Make privacy verifiable, not just claimed. "We don't store your data" on a marketing page means nothing. Let users open DevTools and check. That's real trust.
- Local-first doesn't mean local-only. The architecture should gracefully upgrade: local AI for maximum privacy, cloud AI for maximum quality, same UX either way.
- DuckDB-WASM is production-ready. It handles CSVs, Excel (via the spatial extension), Parquet, and JSON. The SQL dialect is full-featured with window functions, CTEs, and aggregations. It's not a toy.
Where this is going
We're adding support for connecting to remote databases (Postgres, BigQuery) with the same schema-only AI pattern. The query still runs on the remote database — not through us — and the AI still only sees the schema.
We're also experimenting with result-aware follow-ups, where a capped sample of query results (say, 50 rows) feeds back into the AI context for richer multi-step analysis. This is an explicit tradeoff: more AI capability in exchange for limited data exposure. It will always be opt-in and clearly labeled.
The goal hasn't changed: make the fastest path and the safest path the same tool.
If you want to try this, QueryVeil has a live demo with sample data — no signup required. Load your own file and check the Network tab if you don't believe me.
Related: What is privacy data analytics? | How the privacy architecture works