The Canonical Stack for AI Agents and Databases

Spencer Pauly

May 4, 2026•10 min read

If you're shipping an AI feature in 2026 and you haven't figured out database access, you're behind. If you've figured it out by handing the agent a prod Postgres URL, you're behind in a different and more dangerous way.

This is the pillar piece on what I think the right shape is. It's longer than my usual posts because I want to lay out the full stack — every layer, what each one actually does, where the failure modes are, and what the bare minimum is to ship safely.

I've been building in this space for a year and I've watched a lot of teams try a lot of patterns. Most of them eventually converge on something like the stack below. I'd rather you skip the ones that don't work.

The shape, top to bottom

[ user ] -> [ MCP client ]    (Claude Desktop, Claude Code, Cursor, ChatGPT)
              |
              v
          [ MCP server ]      (the agent-facing tool surface)
              |
              v
         [ gateway ]           (parser, allowlist, masking, limits, audit)
              |
              v
       [ read replica ]        (resource-isolated copy of prod)
              |
              v
        [ prod Postgres ]      (untouched by the agent, by design)

Every layer does work the layer below it can't, and every layer fails in a way the next layer below catches. Let's walk through them.

Layer 1: the MCP client

The agent is somewhere — Claude Desktop, Claude Code, Cursor, ChatGPT with MCP support, your own custom client. This is where the model lives, where the user interacts, where the conversation happens.

The client's job in the database stack is small but important: deciding which tool calls to auto-invoke and which to confirm.

The right policy is asymmetric:

Read tools auto-invoke. get_schema, list_connections, run_query against a gateway you trust — fine to fire automatically.
Anything else confirms. send_email, update_record, delete_user — explicit user confirmation, always. The user clicks "yes" or "no" before the action runs.

The clients do this with varying enthusiasm. Claude Code asks for tool-call confirmation by default. Some others auto-invoke everything by default. Check the setting before you wire up anything destructive.

Layer 2: the MCP server

This is the thing the agent thinks of as "the database tool." It exposes a fixed set of tools — typically list_connections, get_schema, run_query — over MCP.

The server's job is twofold:

Speak MCP correctly to the client.
Hand every operation off to the gateway, never directly to the database.

If you read the source for thin reference MCP servers, you'll see them go from MCP request straight to a Postgres connection. That's the wrong shape. The MCP server should be a translation layer to the gateway, not the gateway itself.

In QueryBear's case, the MCP server is a few hundred lines of TypeScript that turns MCP tool calls into gateway API calls. It has no direct database connection. That's deliberate — even if the MCP server is compromised, it can't read your Postgres directly.

Layer 3: the gateway

This is where the actual work happens. The gateway is what makes the rest of the stack safe.

What it does, in roughly the order each request flows through:

Parse the SQL. Use a real parser (libpg_query for Postgres, or its bindings). Reject DDL, DML, volatile functions outside an allowlist, references to off-allowlist tables/columns. Don't try this with regex.

Resolve the schema. Translate any unqualified table names against the allowlisted schemas. Reject references to schemas you don't expose.

Apply column masking. If a column is in the blocked list, refuse the query. If * is used, expand to the allowed columns and rewrite the query. The agent never sees the blocked columns, period.

Inject limits. Add LIMIT <default> if the query doesn't have one. Set statement_timeout on the connection.

EXPLAIN-cost dry run. Before actually running, ask Postgres EXPLAIN (FORMAT JSON) <query> and check the planner's cost estimate. If it's above a threshold, refuse with a "this query is too expensive" error.

Run. Against a read replica, with a connection from a pool that's specifically scoped to read-only.

Audit. Log the request, the parsed AST hash, the result size, the timestamp, the user/agent identity. Don't log the result content (it might be sensitive). Do log enough to reproduce the query.

Defang result content. For columns marked as user-content, wrap and truncate before returning. This is the prompt-injection mitigation.

That's the whole gateway. Most of it is parsing and policy enforcement. Almost none of it is database work.

This is the layer most teams skip when building this themselves, because it's the most code. It's also the layer that keeps you safe when everything else fails.

Layer 4: the read replica

The agent's connection should not point at your prod primary. It should point at a replica.

Why:

Agent queries are weirdly shaped. They sort tables that aren't indexed for that sort. They SELECT * columns the agent didn't realize are JSON blobs. They can degrade the database in ways your app's queries don't.
Agent traffic is unpredictable. A user asks one question and gets back a result, asks a follow-up that triggers a much bigger scan, then walks away. The replica eats that.
Agent failures shouldn't take down your app. Replication lag on a replica is recoverable. Replication lag from your primary is your business.

For Postgres, this is pg_basebackup, streaming replication, or one of the managed offerings (Aurora replicas, Neon branches, Supabase replicas). Even a daily-refreshed snapshot is better than the primary.

The role on the replica should be SELECT-only, scoped to the schemas you want exposed. This is the floor of read-only enforcement, with the parser-level enforcement above it as the actual policy.

Layer 5: prod Postgres

The agent does not touch this. Ever. The architecture treats prod as the source of truth that flows down to the replica through replication.

Prod's only job in this stack is to keep being the prod database. The agent's misbehavior, mistakes, and prompt-injection attacks all happen below prod, blocked by the layers above.

This is unsexy and important. The reason this layered design works is exactly that prod is unreachable from the agent path. If the architecture lets the agent reach prod, you've collapsed the layers and you're back to "the agent's blast radius is whatever the database role can do."

What you actually have to build vs. what you can buy

If you're starting from zero, here's the honest split:

You build:

The MCP client side, if you're shipping a custom agent. If you're using Claude Desktop / Code / Cursor / ChatGPT, you skip this.
The integration with your replication topology. Whatever your hosting provider supports.

You buy or use a tool for:

The MCP server.
The gateway. This is the painful one to build yourself. It's where most of the value of QueryBear lives.
The audit log + UI for non-engineers.

You configure:

The role grants on the replica.
The allowlist (tables, columns, functions).
The row/timeout/cost limits.
The user permissions inside the gateway.

A small team trying to build all of this internally tends to take 2-4 months and ship a v1 that's missing the parser. A small team using a tool for the gateway typically gets to a working setup in a day.

Where I see teams get this wrong

Three patterns, in order of frequency:

"We set the role to read-only, we're fine." No. Read-only roles don't stop exfiltration of sensitive columns, don't bound query cost, don't catch prompt injection in result content, don't generate an audit log. Read-only is the floor, not the policy.

"We give the agent a per-user database connection." Sometimes appealing because it lets you reuse Postgres roles for permissioning. Falls apart fast. You can't easily express "this column hidden from this user" in pure SQL. You can't easily inject row filters dynamically. You end up writing a permission layer in the application anyway, just outside the gateway. Put it in the gateway.

"We don't need an audit log because we trust the agent." Audit log isn't about distrust, it's about reproducibility. When something looks weird six weeks later — a metric that drifted, a customer who claims to have seen data they shouldn't have — the audit log is what lets you actually answer the question. Skip it and you're guessing.

The 2026 minimum bar for shipping

If you're going to put any AI agent in front of any real database, you should have at minimum:

A read replica (or at worst a daily-refreshed copy).
A read-only role on that replica scoped to the schemas you want.
A gateway that parses the SQL and enforces an allowlist.
Default row limits and a statement timeout.
An audit log of every query and its result size.
Asymmetric tool-call confirmation in the client (read auto-invokes, writes confirm).

Anything less and you're shipping a known vulnerability. The reference Postgres MCP has been archived for exactly this reason.

If you have these six layers, the worst case for an agent gone wrong is "a confused query and a noisy audit log entry." That's the bar.

Where this is going

The pieces of this stack are moving fast. The protocol layer (MCP) is stable. The gateway layer is where most of the innovation is happening — including, increasingly, in the form of products like QueryBear that exist specifically because the gateway is the part most teams don't want to build themselves.

What I expect to be true in 12 months:

Per-tenant gateways will be standard. Multi-tenant SaaS that exposes its own data to its customers' agents will all have this shape.
The "thin reference MCP server" pattern will be considered an anti-pattern, not a starting point.
Schema-aware MCP clients will become the default. The agent won't pre-fetch schema; it'll get a tool that hands back exactly the columns it asked about.
Prompt-injection-via-database-content will be a documented attack class with documented mitigations. Most products will ship some version of "wrap user-content fields" by default.

The companies that build their AI database access cleanly today will mostly look the same in 12 months, just with more confidence. The ones that hand their agents prod URLs are going to read about themselves in a postmortem.

The ask

If you've built any version of this stack at your company, write about what you did. The patterns that work need to circulate faster than they are. Most of what I've put in this post I learned the hard way, by watching things break in production at companies that didn't write about them after.

If you don't want to build the gateway yourself, that's the lane I'm in. But the rest of the stack — the replica, the role, the policy, the client confirmation, the audit log — every team has to do for themselves.

Get it right and your AI features stop being demos. They start being real product surface that you can stand behind.

7 comments

engineering_todayMay 4, 2026
Saving this as the canonical reference for the next time someone in eng leadership asks 'should we do this?' The seven-layer-stack diagram alone is worth bookmarking.
joel_pgsqlMay 4, 2026
Layer 4 (read replica) is the one most teams will skip and regret. 'It's just a SELECT' is famous last words against an unindexed sort on a billion-row table.
mcp_dabblerMay 4, 2026
Asymmetric tool-call confirmation in the client is probably the under-emphasized point. Most clients ship 'auto-invoke everything' and that's where things go sideways the fastest.
the_alexMay 5, 2026
I built four of the seven layers internally over six months at my last company. Wish this post had existed then. I would have used QueryBear.
tjones_dbaMay 5, 2026
The minimum bar at the end is correct. Anything less is shipping a known vulnerability — and after the Anthropic Postgres MCP archive, the 'we just used the reference' excuse doesn't hold.
rachelk_opsMay 5, 2026
Genuinely the best post in this category I've read. Forwarded to my whole team.
skeptical_dbaMay 6, 2026
Pillar piece, agreed, but the build-vs-buy section glosses over the maintenance burden of self-built gateways. We tried; it became its own SaaS-shaped project. Lessons learned.