Building a Production-Grade Postgres MCP Allowlist

If you're going to let an AI agent touch a real Postgres, the smallest thing you can build that actually keeps you safe is an allowlist. Not a database role — a parsed-SQL-against-a-list check, before the query runs.
This post is about how I'd build that today. It's roughly what QueryBear does internally, simplified down to the parts you'd actually need if you wrote it yourself.
What "allowlist" actually means here
The word means three different things at different levels of the stack:
- Database role grants. What
GRANT SELECT ON users TO agent_rolegives you. Useful but blunt — you can't say "this column yes, this column no" except via views or row-level security. - Schema visibility. What
search_pathandREVOKE ... FROM PUBLICshape. Fine, but agents can ignoresearch_pathby qualifying names. - Application-level allowlist. What your gateway enforces by parsing the SQL and refusing anything that references off-list objects. This is the level that actually maps to how you think about permissions.
Roles and schema visibility are the floor. The application-level allowlist is what does the work.
The data model
The smallest useful allowlist has four kinds of records:
tables— qualified names (public.users,analytics.events) that the agent canSELECT FROM.columns_blocked— qualified column names the agent can never see (public.users.password_hash,public.users.ssn_last_4).functions— qualified function names the agent can call (public.lower,public.coalesce). Empty by default, with a curated list of safe builtins.row_filters— per-table SQL fragments injected into everySELECTagainst that table. e.g. onsupport_tickets, forcetenant_id = $current_user.tenant_id.
That's the whole schema. ~100 rows of config covers a typical small app.
The parse step
The thing that separates "allowlist" from "wishful thinking" is a real SQL parser. You cannot do this with regex. You will get the regex wrong on day one and someone will figure out how to slip past it on day two.
For Postgres, your options are:
libpg_query. A C library extracted from Postgres's own parser, with bindings for almost every language. This is what I use.pgsql-ast-parser(TypeScript),pglast(Python),pg_query(Ruby). All wraplibpg_query.
What you do with the parsed AST:
- Walk every
RangeVar(table reference). For each, check it's in thetablesallowlist with the right schema. If not, reject. - Walk every
ColumnRef. For each, check that no part of the path matches acolumns_blockedentry. If a query selectsusers.password_hasheven via a wildcard, reject. - Walk every
FuncCall. For each, check the name resolves to an allowlisted function. Anything else, reject. This catchespg_read_file,dblink_*, custom functions with side effects, etc. - Reject all DML and DDL outright.
InsertStmt,UpdateStmt,DeleteStmt,CreateStmt, etc. — the agent doesn't issue these in your gateway. - Reject statements with a
WITH RECURSIVElonger than a configurable depth. Catches accidental infinite recursion. - Reject queries that include subqueries against
pg_catalog.*unless the user explicitly opted in. Most agents don't need to introspect catalog directly; you've already exposed schema metadata viaget_schema.
Each rejection is a 4xx-style error returned to the agent with a one-sentence reason. Good agents read the reason and adjust. Bad ones don't, and you've still kept them out.
The wildcard problem
SELECT * FROM users is the obvious headache. The user might have meant "every column the role can see," but that's exactly the column allowlist's job. The simplest thing is to refuse * outright and force the agent to enumerate columns. This sounds annoying. It is annoying. It is also worth it.
A nicer middle ground: rewrite SELECT * FROM t into SELECT col1, col2, col3 FROM t where col1..col3 is the allowed-column list for t. The model gets back its data. Blocked columns silently never make it into the model context. Costs a bit more parser work but avoids agent friction.
Row filters
A row filter is "this user can only see rows where tenant_id = X." You implement it by injecting an additional WHERE clause into every SELECT that touches the filtered table.
Three places this can live:
- Postgres row-level security (RLS). The native Postgres mechanism. Solid. Requires that you set per-session GUCs from the gateway with
SET app.current_user = '...'. Powerful, but tying agent identity to a Postgres session is more plumbing than it's worth for most setups. - Gateway-injected
WHEREclause. You addAND <filter>to the query before running it. Easier to maintain. Less foolproof — if you accidentally let a query through unparsed, the filter doesn't apply. - Materialized views. Pre-filter the data into views the agent can see. Performance is great. Refresh cost is real.
I usually recommend gateway-injected WHERE for small apps, RLS for anything regulated.
What an allowlist file actually looks like
Here's a real shape, slightly simplified, that we use as the config for a typical QueryBear-style gateway:
tables:
- public.users
- public.orders
- public.order_items
- public.products
- public.support_tickets
- public.support_messages
columns_blocked:
- public.users.password_hash
- public.users.password_reset_token
- public.users.totp_secret
- public.users.ssn_last_4
functions_allowed:
# date / number / string builtins; full list usually ~60 entries
- now
- date_trunc
- count
- sum
- avg
- lower
- coalesce
row_filters:
- table: public.support_tickets
filter: "tenant_id = current_setting('app.tenant_id')::uuid"
- table: public.support_messages
filter: "ticket_id IN (SELECT id FROM support_tickets WHERE tenant_id = current_setting('app.tenant_id')::uuid)"
limits:
default_row_limit: 1000
statement_timeout_ms: 5000
max_explain_cost: 100000Three readable, hand-maintainable kinds of records. No magic. The output of every parse against this is "yes, run it" or "no, here's why."
The agent-facing surface
Here's the trick that makes the whole thing usable by agents: the agent doesn't see the allowlist directly, but the schema you expose to it is filtered through the allowlist.
When the agent calls get_schema, you return:
- Only the tables in the
tablesallowlist. - For each table, only the columns not in
columns_blocked. - An annotation noting that
WHERE tenant_id = ?is already applied to relevant tables.
The agent's mental model becomes "these are the tables and columns I can use." It writes queries against that model, the parser rejects the rare violation, and most of the time the agent stays inside the lines because the lines are visible.
This part is critical. If you tell the agent "you can read everything" but then reject queries, the agent thrashes — generating, getting rejected, retrying, getting rejected. If you tell the agent "here's exactly what's available," it writes queries that work the first time.
What you gain over a database role
Could you do all of this with grants, views, RLS, and REVOKE? In theory yes. In practice no, for two reasons.
One: granular column privacy in vanilla Postgres requires either column-level grants (which exist and are clunky) or one view per access pattern (which is a lot of views). Application-level allowlists are easier to reason about and easier to audit.
Two: most of the value of an allowlist is the part that isn't expressible as a grant — the function-call allowlist, the row-limit injection, the statement-timeout enforcement, the audit log of which queries hit. None of those are SQL primitives.
So the database role is the floor. Make it tight. The allowlist is the part that does the actual permission work.
What this looks like in QueryBear
QueryBear's gateway is a more featured version of the above. The config lives in a UI rather than a YAML file, the parser is libpg_query-backed, and the audit log is a database table per workspace. The thing the MCP-connected agent sees is identical in shape to a vanilla Postgres MCP, except every query goes through this stack first. About 95% of "I need an allowlist" implementations end up with this shape eventually.
The ones who write it themselves get to learn libpg_query. The ones who don't, use ours. Either way, the shape is the same. Just please put a parser in front of your AI's database connection. Read-only-mode is not it.
4 comments
- engineering_today
The wildcard rewrite trick (`SELECT *` → `SELECT col1, col2, ...`) is great. Avoids the 'agent thrashes against rejection' failure mode while keeping column blocking enforced.
- tjones_dba
RLS for regulated workloads is the correct call. Gateway-injected WHERE is fine for SaaS but compliance auditors want it at the database layer.
- wfh_dev
100-row YAML file vs writing a permission system in code — strongly endorse the YAML approach. Auditing is so much easier when the policy is hand-readable.
- skeptical_dba
The function allowlist is the part most people skip. They allowlist tables and columns and forget that `SELECT pg_read_file(...)` will happily exfiltrate anything the postgres user can read.