Reference architecture · Production

Internal MCP AI tool platform

A production platform that gives employees and applications standardized, governed access to LLM-powered tools across operations, safety, and HR. It offers natural-language access to structured enterprise data, and the pattern extends to multi-agent SQL orchestration. The sections that follow cover a real security finding and the foundational skill template that established the reference pattern other teams now build on.

Security finding · production

Prompt injection in a tagged input

During a real workload — an internal request to pull employee roster data through one of the operational tools — the platform surfaced a prompt-injection vector in a tagging pattern embedded in user input. The finding led to formalized input-handling guardrails across every tool surface, not just the affected one.

The shape of the issue was familiar in hindsight, less obvious at the time. The user input contained text that resembled a structured tag pattern the LLM had been trained to interpret as a system instruction. Inside a well-scoped tool, the model treated the tag as a directive rather than as content — narrowing or widening the effective scope of the operation in ways the tool author had not intended.

The fix was defense in depth, not a one-line patch. Tag-shaped substrings are stripped from any text that flows into a model prompt, with the originals preserved separately for the audit trail. Tool inputs are validated against an explicit allow-list of shapes before the LLM is invoked, so non-conforming input is rejected at the boundary rather than rationalized inside the model. Every tool now ships with a per-tenant policy for what counts as malformed, instrumented as a first-class metric so false-positive rates stay visible.
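
The stripping step described above might look roughly like the following. This is a minimal sketch, not the platform's implementation: the `TAG_SHAPED` pattern, the `SanitizedText` type, and the `strip_tag_shapes` name are all assumptions made for illustration, and the real definition of "tag-shaped" is platform-specific.

```python
import re
from dataclasses import dataclass

# Assumption: "tag-shaped" covers angle-bracket tags such as <system> and
# bracketed markers such as [INST]. The platform's real pattern list is not
# given in the source.
TAG_SHAPED = re.compile(r"</?[A-Za-z][^<>]*>|\[/?[A-Za-z_]+\]")


@dataclass(frozen=True)
class SanitizedText:
    text: str                 # safe to interpolate into a model prompt
    removed: tuple[str, ...]  # original substrings, kept for the audit trail


def strip_tag_shapes(raw: str) -> SanitizedText:
    """Remove tag-shaped substrings and return them separately."""
    removed: list[str] = []
    text = raw
    # Repeat until stable: removing one tag can splice a new one together,
    # for example "<sys<b>tem>" becomes "<system>" after the first pass.
    while True:
        found = TAG_SHAPED.findall(text)
        if not found:
            break
        removed.extend(found)
        text = TAG_SHAPED.sub("", text)
    # Collapse the whitespace left behind by removals.
    return SanitizedText(text=" ".join(text.split()), removed=tuple(removed))
```

Only `text` would be passed to the model; `removed` would go to the audit log alongside the request identifier. Stripping complements the allow-list validation rather than replacing it, since it removes the delimiters but not the words between them.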

The lesson generalizes. Prompt injection in agentic systems is not a model-side problem: the model is doing what it was trained to do. It is an input-handling problem at the application boundary, and it is solved with the same discipline you apply to SQL injection, command injection, or XSS: never trust input, validate at the edge, fail closed, and instrument the rejection rate so a regression is visible immediately.
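
As a rough illustration of "fail closed and instrument the rejection rate", the sketch below wraps a handler so that invalid input never reaches it and every rejection is counted. The names (`guarded`, `BoundaryRejected`, `rejection_rate`, `validate_location`) are hypothetical, and the in-process `Counter` objects stand in for whatever metrics client the platform actually uses.

```python
import re
from collections import Counter
from typing import Callable


class BoundaryRejected(Exception):
    """Raised when input fails validation; the handler is never invoked."""


# Stand-ins for a real metrics client (assumption for this sketch).
attempts: Counter = Counter()
rejections: Counter = Counter()

LOCATION_CODE = re.compile(r"[A-Z]{3,4}")


def validate_location(raw: str) -> str:
    """Allow-list by shape: three or four uppercase letters, nothing else."""
    if not LOCATION_CODE.fullmatch(raw):
        raise ValueError("location_code does not match the allowed shape")
    return raw


def guarded(
    tool_name: str,
    validate: Callable[[str], str],
    handler: Callable[[str], dict],
) -> Callable[[str], dict]:
    """Wrap a handler so validation runs first and rejections are counted."""

    def call(raw: str) -> dict:
        attempts[tool_name] += 1
        try:
            args = validate(raw)
        except ValueError as exc:
            rejections[tool_name] += 1
            # Fail closed: surface the rejection instead of passing the
            # input onward for the model to interpret.
            raise BoundaryRejected(tool_name) from exc
        return handler(args)

    return call


def rejection_rate(tool_name: str) -> float:
    """Share of calls rejected at the boundary for one tool."""
    total = attempts[tool_name]
    return rejections[tool_name] / total if total else 0.0
```

A sudden rise in `rejection_rate` for one tool would point to either an attack or an over-strict validator; both are worth an alert.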

Schema-first MCP tool definition (illustrative, sanitized; Python)

from mcp.server.fastmcp import Context, FastMCP
from pydantic import BaseModel, Field, field_validator

# require_role and roster_repo are platform helpers omitted from this
# sanitized excerpt.


class RosterLookupInput(BaseModel):
    """Schema-first contract: reviewed in code review,
    validated at the tool boundary, never trusted from the model."""

    location_code: str = Field(
        pattern=r"^[A-Z]{3,4}$",
        description="Three or four-letter location code, uppercase.",
    )
    role_filter: list[str] | None = Field(
        default=None,
        max_length=8,  # at most eight entries in the list
        description="Optional role filter; allow-listed values only.",
    )

    @field_validator("role_filter")
    @classmethod
    def reject_tag_shapes(cls, v: list[str] | None) -> list[str] | None:
        if not v:
            return v
        # Defense in depth: reject anything that looks like a
        # structured tag the model might interpret as a directive.
        for item in v:
            if "<" in item or ">" in item or "[INST]" in item.upper():
                raise ValueError("role_filter contains tag-shaped input")
        return v


# FastMCP provides the @tool() decorator; the low-level Server class does not.
mcp = FastMCP("operations.roster")


@mcp.tool()
async def lookup_roster(args: RosterLookupInput, ctx: Context) -> dict:
    """Tenant-scoped, audit-logged. AuthZ enforced at the boundary,
    not derived from the LLM's assertion of identity."""
    # ctx.user and ctx.tenant are assumed to be attached by platform
    # middleware; they are not attributes of the stock MCP Context.
    require_role(ctx.user, "operations.roster.read", tenant=ctx.tenant)
    return await roster_repo.lookup(
        tenant=ctx.tenant,
        location=args.location_code,
        roles=args.role_filter,
    )

Reference pattern · platform leverage

The foundational skill template

The platform’s velocity comes from a foundational skill template that other teams build on. New MCP capabilities scaffold from one repo, follow one set of conventions, and inherit the platform’s governance for free.

The template defines a small set of high-leverage decisions so that every team that ships an MCP capability does not have to make them again. Tools are defined schema-first — the JSON Schema (or OpenAPI) for inputs and outputs is authored before the implementation, so the contract is reviewable in code review independent of the code that satisfies it. Per-tenant authZ is wired into the scaffolding, not added later as a feature; tools that ignore tenant context fail validation. Idempotency keys are required on any state-changing tool; the GitHub Actions workflow checks for their presence at build time.
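
The build-time idempotency check could be approximated by a small script that the workflow runs over each tool's declared schema. The sketch below is an assumption about how such a check might work, not the template's actual code: the manifest layout (`name`, `state_changing`, `input_schema`) and the `idempotency_key` field name are hypothetical.

```python
IDEMPOTENCY_FIELD = "idempotency_key"  # hypothetical field name


def idempotency_violations(tools: list[dict]) -> list[str]:
    """Return names of state-changing tools that lack a required key.

    Each tool is a dict with a name, an optional boolean 'state_changing'
    flag, and a JSON Schema under 'input_schema' (assumed layout).
    """
    violations = []
    for tool in tools:
        # Fail closed: a tool that does not declare the flag is treated
        # as state-changing.
        if not tool.get("state_changing", True):
            continue
        schema = tool.get("input_schema", {})
        declared = IDEMPOTENCY_FIELD in schema.get("properties", {})
        required = IDEMPOTENCY_FIELD in schema.get("required", [])
        if not (declared and required):
            violations.append(tool["name"])
    return violations
```

In CI, a non-empty result would fail the build and list the offending tool names, so the gap is caught before review rather than in production.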

The CI/CD workflows are part of the template. A team that scaffolds a new skill inherits type-checking, schema validation against the API of record, eval runs, and deployment to the platform, all as opinionated reusable workflows pinned to a major version. The same gates apply to every tool, so the platform’s governance is enforced uniformly without anyone having to chase teams down or rely on them remembering.

The pattern compounds. The first tool authored against the template took longer than it should have because the template was being built alongside it. The fifth team’s first tool shipped in two days. That velocity gap is the entire point — platforms that fail are usually platforms that did not invest in the reference pattern early enough.