MCP Gateway

MCP Gateway's scripting layer is a programmable layer between agents and MCP tool responses: user-authored Python scripts that cut token consumption by 81-96% in end-to-end testing.

Overview

MCP tools return unbounded responses. A single salesforce_opportunities_list call can dump 200KB+ of raw JSON into an agent's context window. Multiply that by the multi-step workflows agents actually run — five, ten, twenty tool calls per task — and the context window fills with data the agent never needed. This is the MCP token bloat problem, and it is the single largest obstacle to running MCP-based agents in production.

MCP Gateway proposes a solution: a scripting layer where skills contain user-authored Python scripts that intercept tool responses before the agent sees them. Instead of the agent receiving 200KB of raw Salesforce records, a script calls the API, filters to the five fields that matter, computes aggregates, and returns 800 bytes of structured insight. The agent made one tool call and got a compact result. It has zero awareness that a script was involved.
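The filter-and-aggregate step described above can be sketched in plain Python. This is a minimal illustration, not the gateway's own code: the record layout, the `KEEP_FIELDS` choice, and the `summarize` helper are all hypothetical, and the raw data is generated locally rather than fetched from Salesforce.

```python
import json

# Hypothetical raw records, standing in for a large tool response.
RAW_OPPORTUNITIES = [
    {
        "Id": f"006{i:04d}",
        "Name": f"Deal {i}",
        "Amount": 1000.0 * (i + 1),
        "StageName": "Closed Won" if i % 2 == 0 else "Prospecting",
        "CloseDate": "2025-03-31",
        # Bulk the agent never needs: long descriptions, audit fields, etc.
        "Description": "x" * 500,
        "SystemModstamp": "2025-01-01T00:00:00Z",
    }
    for i in range(100)
]

# The five fields that matter for this (assumed) workflow.
KEEP_FIELDS = ("Id", "Name", "Amount", "StageName", "CloseDate")


def summarize(records, top_n=3):
    """Keep only the needed fields, then aggregate by stage."""
    slim = [{k: r[k] for k in KEEP_FIELDS} for r in records]
    by_stage = {}
    for r in slim:
        stage = by_stage.setdefault(r["StageName"], {"count": 0, "total": 0.0})
        stage["count"] += 1
        stage["total"] += r["Amount"]
    top = sorted(slim, key=lambda r: r["Amount"], reverse=True)[:top_n]
    return {
        "count": len(slim),
        "by_stage": by_stage,
        "top_deals": [{"Name": r["Name"], "Amount": r["Amount"]} for r in top],
    }


raw_bytes = len(json.dumps(RAW_OPPORTUNITIES).encode("utf-8"))
summary = summarize(RAW_OPPORTUNITIES)
summary_bytes = len(json.dumps(summary).encode("utf-8"))
print(f"raw={raw_bytes} bytes, summary={summary_bytes} bytes")
```

The agent would receive only `summary`; the per-record detail stays inside the script.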

Skills are where developers work. A skill package contains guidance (SKILL.md), scripts, and tool-calling interception logic. The scripting layer is the mechanism inside skills that makes token reduction possible — Python scripts that run in sandboxed containers, call raw MCP tools through the gateway SDK, and return compact results. In end-to-end testing, this approach reduces token consumption by 81-96% compared to raw tool responses.

Script tools appear as normal MCP tools in the agent's tool list, so the agent calls them exactly as it would call any raw tool.

How It Works

Agent calls a tool

The agent sees summarize_pipeline in its tool list and calls it like any other MCP tool. It does not know this is a script tool.

Gateway dispatches to sandbox

The gateway checks the tool's origin_type. For script tools, it creates a scoped temporary token and acquires a sandbox container.

Script executes with SDK

The Python script runs in the sandbox using the gateway SDK (mcpgateway-sdk). It calls raw MCP tools, filters and joins data, and compacts the result.

Gateway enforces constraints

Before returning to the agent, the gateway enforces max_response_bytes and validates against outputSchema. The scoped token is revoked.

Agent receives compact result

The agent gets 500-800 bytes instead of 200KB+. From its perspective, this was a normal tool call.
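The five steps above can be modeled as a toy dispatcher. This is a sketch under stated assumptions, not the gateway's implementation: `ToolSpec`, `Gateway`, the `"script"` origin value, and the `required_output_keys` check (a stand-in for full `outputSchema` validation) are hypothetical names, and the handler runs in-process where the real system would use a sandbox container.

```python
import json
import secrets
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToolSpec:
    name: str
    origin_type: str                  # "script" or "rest" (values assumed)
    handler: Callable[[dict], dict]   # stands in for the sandboxed script
    allowed_tools: tuple = ()
    max_response_bytes: int = 1024
    required_output_keys: tuple = ()  # simplified stand-in for outputSchema


@dataclass
class Gateway:
    tools: dict
    active_tokens: dict = field(default_factory=dict)

    def call(self, name: str, args: dict) -> dict:
        spec = self.tools[name]
        if spec.origin_type != "script":
            return spec.handler(args)  # raw tools pass straight through

        # Mint a temporary token scoped to the tools this script may call.
        token = secrets.token_hex(8)
        self.active_tokens[token] = spec.allowed_tools
        try:
            result = spec.handler(args)
            missing = [k for k in spec.required_output_keys if k not in result]
            if missing:
                raise ValueError(f"output missing keys: {missing}")
            size = len(json.dumps(result).encode("utf-8"))
            if size > spec.max_response_bytes:
                raise ValueError(
                    f"response is {size} bytes, budget is {spec.max_response_bytes}"
                )
            return result
        finally:
            # Revoke the token whether the call succeeded or failed.
            self.active_tokens.pop(token, None)


gateway = Gateway(tools={
    "summarize_pipeline": ToolSpec(
        name="summarize_pipeline",
        origin_type="script",
        handler=lambda args: {"count": 3, "total": 4500.0},
        allowed_tools=("salesforce__opportunities_list",),
        max_response_bytes=800,
        required_output_keys=("count", "total"),
    ),
})
result = gateway.call("summarize_pipeline", {})
print(result)
```

Putting revocation in a `finally` block is a design choice for this sketch: it keeps a failed or over-budget script from leaving a live token behind.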

Two Paths

MCP Gateway offers two execution paths that coexist in the same skill package. The key distinction is who decides when the script runs and what constraints are enforced.

scripts/ folder

The agent reads SKILL.md, understands the workflow, and chooses to run a script via the sandbox exec endpoint.

Package structure:

my-skill/
├── SKILL.md          # Agent reads this for guidance
└── scripts/
    └── hn_digest.py  # Uses mcpgateway-sdk

Characteristics:

Aspect	Behavior
Who triggers	Agent decides (reads SKILL.md)
Scope	Wildcard — script can call any tool the user has access to
Visibility	Not in tools/list — agent must learn about it from SKILL.md
Enforcement	None — no schema validation, no response budgets
Best for	Prototyping, notebooks, flexible agent workflows
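A script in the scripts/ folder might look roughly like the following. It is only a sketch: the qualified tool name `hackernews__top_stories`, its argument shape, and the list-of-dicts result are hypothetical, and a local `FakeTools` object stands in for the gateway so the example runs without `mcpgateway-sdk`. In a real script, `gw` would be an `MCPGateway()` instance.

```python
import asyncio
import json
from types import SimpleNamespace


async def build_digest(gw, limit=3):
    """Fetch stories through the gateway and keep only title and score."""
    # "hackernews__top_stories" is a hypothetical qualified tool name.
    response = await gw.tools.execute("hackernews__top_stories", {"limit": limit})
    stories = sorted(response.result, key=lambda s: s["score"], reverse=True)
    return [{"title": s["title"], "score": s["score"]} for s in stories[:limit]]


class FakeTools:
    """Local stand-in for gw.tools so the sketch runs without a gateway."""

    async def execute(self, name, args):
        return SimpleNamespace(result=[
            {"title": "A", "score": 10, "url": "https://example.com/a", "kids": [1, 2]},
            {"title": "B", "score": 42, "url": "https://example.com/b", "kids": []},
        ])


fake_gw = SimpleNamespace(tools=FakeTools())
digest = asyncio.run(build_digest(fake_gw, limit=2))
print(json.dumps(digest))
```

Because this path has no response budget or schema validation, keeping the output compact is entirely the script author's responsibility.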

tools/ + workflow.json

Scripts are registered as first-class MCP tools with schemas, allowed-tool lists, response budgets, and timeouts. The gateway dispatches, validates, and enforces.

Package structure:

my-skill/
├── SKILL.md          # Agent guidance (required)
├── workflow.json     # Tool schemas, constraints
└── tools/
    └── summarize_pipeline.py

Characteristics:

Aspect	Behavior
Who triggers	Gateway dispatches on tool call
Scope	Scoped token — only `allowed_tools` from workflow.json
Visibility	Appears in tools/list and SEARCH_TOOLS as a normal tool
Enforcement	Full — input/output schema, response budget, timeout
Best for	Production, untrusted agents, curated tool surfaces
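To make the constraints concrete, the snippet below builds one possible workflow.json in Python and prints it. The concepts (schemas, `allowed_tools`, `max_response_bytes`, a timeout) come from the text above, but the file layout, the `script` and `timeout_seconds` keys, and the example schemas are assumptions for illustration, not the documented format.

```python
import json

# Hypothetical workflow.json contents; layout and some key names are assumed.
workflow = {
    "tools": [
        {
            "name": "summarize_pipeline",
            "script": "tools/summarize_pipeline.py",
            "allowed_tools": ["salesforce__opportunities_list"],
            "max_response_bytes": 800,
            "timeout_seconds": 30,
            "inputSchema": {
                "type": "object",
                "properties": {"quarter": {"type": "string"}},
                "required": ["quarter"],
            },
            "outputSchema": {
                "type": "object",
                "properties": {
                    "count": {"type": "integer"},
                    "total": {"type": "number"},
                },
                "required": ["count", "total"],
            },
        }
    ]
}

workflow_json = json.dumps(workflow, indent=2)
print(workflow_json)
```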

Upgrade path: Start with scripts/ to iterate quickly. When the workflow stabilizes, move to tools/ + workflow.json for production guardrails. Both paths live in the same skill package.

Gateway SDK

Scripts communicate with the gateway using mcpgateway-sdk, a thin async Python client that auto-configures from environment variables injected by the gateway. Instantiate MCPGateway() with no arguments — the SDK reads MCPGATEWAY_TOKEN and MCPGATEWAY_URL from the environment. Tool arguments from the agent arrive as a JSON string in MCPGATEWAY_ARGS.

import asyncio
import json
import os

from mcpgateway_sdk import MCPGateway


async def main() -> None:
    # Tool arguments from the agent arrive as a JSON string
    args = json.loads(os.environ.get("MCPGATEWAY_ARGS", "{}"))
    gw = MCPGateway()  # reads MCPGATEWAY_TOKEN / MCPGATEWAY_URL from env

    # Call a raw MCP tool by its qualified name; the gateway handles credentials
    result = await gw.tools.execute("salesforce__opportunities_list", {
        "query": {"q": "SELECT Id, Name, Amount FROM Opportunity"}
    })

    # Semantic search across all enabled tools
    tools = await gw.tools.search("salesforce accounts", limit=5)

    # Cache data for follow-up questions in the same session
    await gw.cache.set("accounts_q1", result.result, ttl=600)
    cached = await gw.cache.get("accounts_q1")


# The SDK is async, so the calls must run inside an event loop
asyncio.run(main())

Method	Purpose
`gw.tools.execute`	Call raw MCP tools through the gateway.
`gw.tools.search`	Semantic search across enabled tools.
`gw.cache.get`	Read from gateway cache.
`gw.cache.set`	Write to gateway cache with TTL.
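The execute and cache calls combine naturally into a read-through pattern: check the cache, fall back to the raw tool, then store the result. The sketch below uses only the four calls listed above, with two assumptions stated plainly: that `gw.cache.get` returns `None` on a miss, and that `FakeCache` and `FakeTools` are local stand-ins so the example runs without a gateway.

```python
import asyncio
from types import SimpleNamespace


async def get_accounts(gw, key="accounts_q1", ttl=600):
    """Return cached data if present, otherwise fetch once and cache it."""
    cached = await gw.cache.get(key)
    if cached is not None:  # assumption: a cache miss returns None
        return cached
    response = await gw.tools.execute("salesforce__opportunities_list", {
        "query": {"q": "SELECT Id, Name, Amount FROM Opportunity"}
    })
    await gw.cache.set(key, response.result, ttl=ttl)
    return response.result


class FakeCache:
    """In-memory stand-in for gw.cache; the TTL is accepted but ignored."""

    def __init__(self):
        self.store = {}

    async def get(self, key):
        return self.store.get(key)

    async def set(self, key, value, ttl):
        self.store[key] = value


class FakeTools:
    """Stand-in for gw.tools that counts upstream calls."""

    def __init__(self):
        self.calls = 0

    async def execute(self, name, args):
        self.calls += 1
        return SimpleNamespace(result=[{"Id": "006A", "Name": "Deal", "Amount": 100}])


async def demo():
    gw = SimpleNamespace(tools=FakeTools(), cache=FakeCache())
    first = await get_accounts(gw)
    second = await get_accounts(gw)  # served from cache
    return first, second, gw.tools.calls


first, second, upstream_calls = asyncio.run(demo())
print(upstream_calls)
```

The second call never reaches the raw tool, which is the point of caching data for follow-up questions in the same session.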

Exposure Modes

Exposure modes control which types of tools the agent can see, per server.

Mode	Eligible Tools	Use Case
`all`	REST tools + script tools	Default, development
`raw`	Only REST tools	Debugging raw API behavior
`script_only`	Only script tools	Production — raw tools hidden

With script_only, the agent cannot call raw REST tools like salesforce_opportunities_list (200KB response). The only way to get Salesforce data is through the curated script tool summarize_pipeline (800-byte response). The gateway enforces this — not the agent's good behavior.
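The table above amounts to a filter on each tool's origin. A minimal sketch, assuming tools carry an `origin_type` of either `"rest"` or `"script"` (those value names, the `Tool` class, and `visible_tools` are hypothetical):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    origin_type: str  # "rest" or "script" (value names are assumed)


# Which origin types each exposure mode makes eligible.
ELIGIBLE = {
    "all": {"rest", "script"},
    "raw": {"rest"},
    "script_only": {"script"},
}


def visible_tools(tools, mode):
    """Return the names of tools the agent may see under the given mode."""
    if mode not in ELIGIBLE:
        raise ValueError(f"unknown exposure mode: {mode!r}")
    return [t.name for t in tools if t.origin_type in ELIGIBLE[mode]]


catalog = [
    Tool("salesforce_opportunities_list", "rest"),
    Tool("summarize_pipeline", "script"),
]

for mode in ("all", "raw", "script_only"):
    print(mode, visible_tools(catalog, mode))
```

Under `script_only`, the raw tool is simply absent from the list the agent receives, so there is nothing for it to call.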

Security

Control	How It Works
Scoped tokens	Each invocation gets a temporary API key scoped to `allowed_tools`. Auto-expires.
Response budgets	`max_response_bytes` enforced by gateway before agent sees results.
Schema validation	Input validated against `inputSchema`, output against `outputSchema`.
Sandbox isolation	Docker (dev) or Kubernetes (prod) containers. SDK uses internal channel.
Audit correlation	Every nested SDK call linked to parent via `parent_call_id`.
No raw credentials	Gateway resolves OAuth/API tokens. Scripts never see them.
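The scoped-token and audit-correlation rows can be illustrated with a small model. This is a sketch, not the gateway's token format: `ScopedToken`, `mint_token`, `authorize`, the 60-second default lifetime, and the `github__repos_delete` tool name are all hypothetical. Only `allowed_tools` and `parent_call_id` are taken from the table.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopedToken:
    value: str
    allowed_tools: frozenset
    expires_at: float
    parent_call_id: str  # links nested SDK calls back to the agent's call


def mint_token(allowed_tools, parent_call_id, ttl_seconds=60.0, now=None):
    """Create a temporary token limited to the given tools."""
    now = time.time() if now is None else now
    return ScopedToken(
        value=secrets.token_urlsafe(16),
        allowed_tools=frozenset(allowed_tools),
        expires_at=now + ttl_seconds,
        parent_call_id=parent_call_id,
    )


def authorize(token, tool_name, now=None):
    """Allow a nested call only if the token is live and the tool is in scope."""
    now = time.time() if now is None else now
    if now >= token.expires_at:
        return False
    return tool_name in token.allowed_tools


token = mint_token(
    ["salesforce__opportunities_list"],
    parent_call_id="call-123",
    now=1000.0,
)
print(authorize(token, "salesforce__opportunities_list", now=1001.0))
print(authorize(token, "github__repos_delete", now=1001.0))
```

Passing `now` explicitly keeps the expiry logic deterministic for testing; production code would rely on the default clock.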

Script Tools