Build a ChatGPT App with the OpenAI Apps SDK and MCP in TypeScript

Monday 15/06/2026

You've seen the demos: someone types a request into ChatGPT and a real, interactive UI appears inline — a map, a playlist, a booking form — not a blob of markdown. That's an Apps SDK app, and the moment you try to build one yourself you hit a wall. The OpenAI Apps SDK ships official TypeScript templates and almost no explanation of how the pieces talk to each other. Where does the UI live? How does your React component get the tool's data? How does a button click inside the widget run a tool back on your server?

This post answers exactly that. We'll build a working ChatGPT app with the OpenAI Apps SDK and MCP in TypeScript — a "book a meeting room" app with a calendar widget — and I'll be explicit about the part everyone gets stuck on: the MCP-to-iframe bridge.

If you've never built an MCP server, skim How to Build an MCP Server in TypeScript from Scratch first. This post assumes you know what a tool and a resource are.

How an Apps SDK app actually works

Here's the mental model that took me too long to figure out. An Apps SDK app is two things glued together:

  1. An MCP server — plain old Model Context Protocol. It exposes tools (functions the model calls) and resources (static assets). Nothing ChatGPT-specific so far.
  2. A widget — an HTML+JS bundle, served as an MCP resource, that ChatGPT renders in a sandboxed <iframe>. Inside that iframe you get a window.openai global that bridges back to ChatGPT over JSON-RPC on top of postMessage.

The glue is _meta. When a tool returns its result, you attach a _meta field that says "render this result with that widget." ChatGPT loads the widget HTML, hands it your tool's structuredContent, and your React app renders. When the user clicks a button in the widget, you call window.openai.callTool(...), which runs another tool on your server.

That's the whole architecture. Tools on the server, UI in an iframe, window.openai connecting them.
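
To make that flow concrete, here is a toy model of what the host does with a tool result. It is a sketch for intuition only: planRender, RenderPlan, and the other names are made up for this post and say nothing about how ChatGPT is actually implemented. The one real detail is the openai/outputTemplate key, which we set on the tool further down.

```typescript
// Toy model of the host side. All names are illustrative assumptions.
interface ToolDescriptor {
    name: string
    _meta?: Record<string, unknown>
}

interface ToolResult {
    content: { type: 'text'; text: string }[]
    structuredContent?: Record<string, unknown>
}

interface RenderPlan {
    widgetUri: string | null // null means "no widget, text only"
    toolOutput: Record<string, unknown> | null // what the widget would read
    modelText: string // what the model would read
}

function planRender(tool: ToolDescriptor, result: ToolResult): RenderPlan {
    const template = tool._meta?.['openai/outputTemplate']
    return {
        widgetUri: typeof template === 'string' ? template : null,
        toolOutput: result.structuredContent ?? null,
        modelText: result.content.map((c) => c.text).join('\n'),
    }
}

const plan = planRender(
    {
        name: 'list_rooms',
        _meta: { 'openai/outputTemplate': 'ui://widget/room-booking.html' },
    },
    {
        content: [{ type: 'text', text: 'Found 9 rooms for 2026-06-20.' }],
        structuredContent: { date: '2026-06-20', slots: [] },
    }
)
console.log(plan.widgetUri) // ui://widget/room-booking.html
```

A tool without the template key produces a plan with widgetUri set to null, which is the "plain text answer" path.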

Project setup

mkdir room-booking-app && cd room-booking-app
pnpm init
pnpm add @modelcontextprotocol/sdk zod
pnpm add -D typescript tsx esbuild react react-dom @types/react @types/react-dom

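We use the official MCP SDK for the server, esbuild to bundle the widget into a single file, and React for the UI. There are two build targets: the Node server and the browser widget. The server code uses ESM imports and a top-level await, so set "type": "module" in package.json before you run it.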

The widget: a React calendar

Start with the UI, because it defines the data contract. The widget reads window.openai.toolOutput — that's the structuredContent your tool returned — and renders it.

// src/widget/RoomBooking.tsx
import { useEffect, useState } from 'react'

interface Slot {
    id: string
    room: string
    time: string
    available: boolean
}

interface ToolOutput {
    date: string
    slots: Slot[]
}

// The bridge ChatGPT injects into the iframe.
interface OpenAiGlobals {
    toolOutput: ToolOutput | null
    widgetState: { selectedSlotId?: string } | null
    callTool: (name: string, args: Record<string, unknown>) => Promise<unknown>
    setWidgetState: (state: Record<string, unknown>) => Promise<void>
    sendFollowUpMessage: (args: { prompt: string }) => Promise<void>
}

declare global {
    interface Window {
        openai: OpenAiGlobals
    }
}

export function RoomBooking() {
    const [output, setOutput] = useState<ToolOutput | null>(
        window.openai.toolOutput
    )
    const [selected, setSelected] = useState<string | undefined>(
        window.openai.widgetState?.selectedSlotId
    )
    const [booking, setBooking] = useState(false)
    const [error, setError] = useState<string | null>(null)

    // ChatGPT pushes new data via a custom event whenever globals change.
    useEffect(() => {
        const onUpdate = () => setOutput(window.openai.toolOutput)
        window.addEventListener('openai:set_globals', onUpdate)
        return () => window.removeEventListener('openai:set_globals', onUpdate)
    }, [])

    if (!output) return <p>Loading availability…</p>

    async function book(slot: Slot) {
        setBooking(true)
        setError(null)
        try {
            // Run a tool back on the MCP server.
            await window.openai.callTool('book_room', { slotId: slot.id })
            setSelected(slot.id)
            // Persist selection so it survives a re-render or reload.
            await window.openai.setWidgetState({ selectedSlotId: slot.id })
            // Nudge the conversation forward in natural language.
            await window.openai.sendFollowUpMessage({
                prompt: `Booked ${slot.room} at ${slot.time}. Confirm it for me.`,
            })
        } catch (e) {
            setError(e instanceof Error ? e.message : 'Booking failed')
        } finally {
            setBooking(false)
        }
    }

    return (
        <div className="grid">
            <h2>Rooms for {output.date}</h2>
            {output.slots.map((slot) => (
                <button
                    key={slot.id}
                    disabled={!slot.available || booking}
                    aria-pressed={selected === slot.id}
                    onClick={() => book(slot)}
                >
                    {slot.room} — {slot.time}
                    {!slot.available && ' (taken)'}
                    {selected === slot.id && ' ✓'}
                </button>
            ))}
            {error && <p role="alert">{error}</p>}
        </div>
    )
}

Three window.openai calls do all the work: callTool runs server logic, setWidgetState persists UI state across re-renders, and sendFollowUpMessage writes a message back into the conversation so the model can keep going. Notice the error handling — callTool rejects on a failed tool, and you want that surfaced in the UI, not swallowed.
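
One thing the component takes on faith is that toolOutput really has the ToolOutput shape. A small runtime guard is cheap insurance against a server and widget drifting apart. This is a sketch: isToolOutput and its helpers are names I'm introducing here, not part of the Apps SDK.

```typescript
interface Slot {
    id: string
    room: string
    time: string
    available: boolean
}

interface ToolOutput {
    date: string
    slots: Slot[]
}

function isRecord(v: unknown): v is Record<string, unknown> {
    return typeof v === 'object' && v !== null
}

function isSlot(v: unknown): v is Slot {
    return (
        isRecord(v) &&
        typeof v.id === 'string' &&
        typeof v.room === 'string' &&
        typeof v.time === 'string' &&
        typeof v.available === 'boolean'
    )
}

// True only when the value matches the contract the widget renders from.
function isToolOutput(v: unknown): v is ToolOutput {
    return (
        isRecord(v) &&
        typeof v.date === 'string' &&
        Array.isArray(v.slots) &&
        v.slots.every(isSlot)
    )
}

const good = isToolOutput({
    date: '2026-06-20',
    slots: [{ id: 'a', room: 'Oslo', time: '09:00', available: true }],
})
const bad = isToolOutput({ date: '2026-06-20', slots: [{ id: 'a' }] })
console.log(good, bad) // true false
```

In the component, the initial state could then be guarded with isToolOutput(window.openai.toolOutput) before the value is trusted.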

The mount point:

// src/widget/index.tsx
import { createRoot } from 'react-dom/client'
import { RoomBooking } from './RoomBooking'

const el = document.getElementById('root')
if (el) createRoot(el).render(<RoomBooking />)

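Bundle it to a single IIFE file so it can be inlined into the resource. The --jsx=automatic flag matters here: neither widget file imports React itself, and unless a tsconfig.json says otherwise, esbuild's default JSX transform emits React.createElement calls that would throw at runtime.

pnpm esbuild src/widget/index.tsx \
  --bundle --format=iife --minify --jsx=automatic \
  --outfile=dist/widget.js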

The MCP server

Now the backend. We define two tools — list_rooms (returns availability and renders the widget) and book_room (the action the widget calls back) — and one resource (the widget HTML).

// src/server.ts
import { readFileSync } from 'node:fs'
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js'
import { z } from 'zod'

const WIDGET_URI = 'ui://widget/room-booking.html'

// Inline the bundled JS into a minimal HTML shell.
const widgetJs = readFileSync('dist/widget.js', 'utf8')
const widgetHtml = `<!DOCTYPE html>
<html>
  <head><meta charset="utf-8" /></head>
  <body><div id="root"></div><script>${widgetJs}</script></body>
</html>`

// Exported so src/http.ts can attach a transport to it.
export const server = new McpServer({ name: 'room-booking', version: '1.0.0' })

Register the widget as a resource

The widget is an MCP resource with the special text/html+skybridge mime type. That mime type is the signal that tells ChatGPT "this resource is a renderable Apps SDK widget," not a plain document.

// src/server.ts (continued)
server.registerResource(
    'room-booking-widget',
    WIDGET_URI,
    {},
    async () => ({
        contents: [
            {
                uri: WIDGET_URI,
                mimeType: 'text/html+skybridge',
                text: widgetHtml,
                // Widget metadata is attached to the content entry itself.
                _meta: {
                    // Shown to the model so it knows what the widget is for.
                    'openai/widgetDescription':
                        'Interactive meeting-room availability calendar.',
                    // Lock down what the iframe can load. Be strict.
                    'openai/widgetCSP': {
                        connect_domains: [],
                        resource_domains: [],
                    },
                },
            },
        ],
    })
)

The openai/widgetCSP block is not optional in practice — without a content security policy your widget can be rejected, and a tight policy is your first line of defense against a compromised dependency phoning home. We allow nothing external here because the widget is fully self-contained.
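
If the widget does eventually need an external API or CDN, it helps to build the allowlist from code rather than hand-editing strings. The sketch below is one way to do that. buildWidgetCsp and httpsOrigin are hypothetical helpers of mine, the example.com hosts are placeholders, and only the connect_domains and resource_domains keys come from the resource definition above.

```typescript
interface WidgetCsp {
    connect_domains: string[]
    resource_domains: string[]
}

// Reduce any URL to its origin and refuse anything that is not HTTPS.
function httpsOrigin(input: string): string {
    const url = new URL(input) // throws on malformed input
    if (url.protocol !== 'https:') {
        throw new Error(`Refusing non-HTTPS origin: ${input}`)
    }
    return url.origin
}

function buildWidgetCsp(
    opts: { api?: string[]; assets?: string[] } = {}
): WidgetCsp {
    // Normalize to origins, then drop duplicates.
    const normalize = (urls: string[]) =>
        Array.from(new Set(urls.map(httpsOrigin)))
    return {
        connect_domains: normalize(opts.api ?? []),
        resource_domains: normalize(opts.assets ?? []),
    }
}

const csp = buildWidgetCsp({
    api: ['https://api.example.com/v1/rooms', 'https://api.example.com'],
})
console.log(csp.connect_domains) // [ 'https://api.example.com' ]
```

Calling buildWidgetCsp() with no arguments gives the same locked-down policy the demo uses: two empty lists.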

The tool that renders the widget

This is where _meta does its job. The tool returns structuredContent (the data the widget reads as toolOutput) and points openai/outputTemplate at the resource URI.

// src/server.ts (continued)
import { getAvailability, reserve } from './rooms.js'

server.registerTool(
    'list_rooms',
    {
        title: 'List available meeting rooms',
        description: 'Show bookable meeting rooms for a given date.',
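        // Enforce YYYY-MM-DD: reserve() recovers the date from the slot id and relies on this shape.
        inputSchema: {
            date: z.string().regex(/^\d{4}-\d{2}-\d{2}$/).describe('ISO date, e.g. 2026-06-20'),
        },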
        _meta: {
            // Bind this tool's result to the widget.
            'openai/outputTemplate': WIDGET_URI,
            // Status strings ChatGPT shows while the tool runs.
            'openai/toolInvocation/invoking': 'Checking room availability…',
            'openai/toolInvocation/invoked': 'Loaded room availability',
        },
    },
    async ({ date }) => {
        const slots = await getAvailability(date)
        return {
            // Text the model reads. Keep it short — the widget is the real UI.
            content: [
                { type: 'text', text: `Found ${slots.length} rooms for ${date}.` },
            ],
            // The widget reads this as window.openai.toolOutput.
            structuredContent: { date, slots },
        }
    }
)

Two payloads come out of one tool. content is the text the model reasons and narrates from; structuredContent is what the widget renders, and the model can see it as well, which is one more reason to keep it compact. Keep the text terse: duplicating the whole calendar into content just burns tokens and risks the model contradicting the UI.

The action tool

When the widget calls window.openai.callTool('book_room', ...), it lands here. This tool has no widget — it just mutates state and returns a confirmation.

// src/server.ts (continued)
server.registerTool(
    'book_room',
    {
        title: 'Book a meeting room',
        description: 'Reserve a specific room slot by id.',
        inputSchema: { slotId: z.string() },
        _meta: {
            // Allow the widget itself to invoke this tool.
            'openai/widgetAccessible': true,
        },
    },
    async ({ slotId }) => {
        try {
            const confirmation = await reserve(slotId)
            return {
                content: [
                    { type: 'text', text: `Reserved. Confirmation: ${confirmation.id}` },
                ],
                structuredContent: { confirmationId: confirmation.id },
            }
        } catch (e) {
            const msg = e instanceof Error ? e.message : 'Reservation failed'
            return {
                isError: true,
                content: [{ type: 'text', text: msg }],
            }
        }
    }
)

The openai/widgetAccessible: true flag is a gotcha that cost me an hour. By default a widget cannot call arbitrary tools — only ones explicitly marked accessible. Forget it and callTool rejects with a permission error that looks like a transport bug. Also note isError: true: when the reservation fails, that's what makes callTool reject in the widget, which lets our catch block show the error.
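
Once you have more than one action tool, the try/catch above gets repetitive. A small wrapper can keep the error convention in one place. This is a sketch under assumptions: asToolResult is a hypothetical helper, and the ToolResult interface is my hand-written approximation of the result shape used in this post, not a type imported from the SDK.

```typescript
interface TextContent {
    type: 'text'
    text: string
}

interface ToolResult {
    content: TextContent[]
    structuredContent?: Record<string, unknown>
    isError?: boolean
}

// Run a handler and convert a thrown error into an isError result.
async function asToolResult(
    run: () => Promise<{ text: string; data?: Record<string, unknown> }>,
    fallback = 'Tool failed'
): Promise<ToolResult> {
    try {
        const { text, data } = await run()
        return {
            content: [{ type: 'text', text }],
            ...(data ? { structuredContent: data } : {}),
        }
    } catch (e) {
        const msg = e instanceof Error ? e.message : fallback
        return { isError: true, content: [{ type: 'text', text: msg }] }
    }
}

// Example: a failing reservation becomes a structured error, not a crash.
asToolResult(async () => {
    throw new Error('That room was just taken')
}).then((result) => console.log(result.isError, result.content[0]?.text))
```

The book_room handler could then shrink to a single asToolResult call that wraps reserve(slotId).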

The data layer

Real logic, not placeholders — swap the in-memory store for your database.

// src/rooms.ts
interface Slot {
    id: string
    room: string
    time: string
    available: boolean
}

const DB: Record<string, Slot[]> = {}

function seed(date: string): Slot[] {
    const rooms = ['Oslo', 'Lisbon', 'Tokyo']
    const times = ['09:00', '11:00', '14:00']
    return rooms.flatMap((room, r) =>
        times.map((time, t) => ({
            id: `${date}-${r}-${t}`,
            room,
            time,
            available: true,
        }))
    )
}

export async function getAvailability(date: string): Promise<Slot[]> {
    if (!DB[date]) DB[date] = seed(date)
    return DB[date]
}

export async function reserve(slotId: string): Promise<{ id: string }> {
    const date = slotId.split('-').slice(0, 3).join('-')
    const slot = DB[date]?.find((s) => s.id === slotId)
    if (!slot) throw new Error(`Unknown slot: ${slotId}`)
    if (!slot.available) throw new Error('That room was just taken')
    slot.available = false
    return { id: `CONF-${slotId}` }
}
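
reserve() recovers the date by splitting the slot id on dashes, which quietly depends on the id being an ISO date followed by two indexes. If you would rather fail loudly on anything else, a stricter parser is one option. The sketch below assumes the id format produced by seed() above; parseSlotId and SLOT_ID are names I made up.

```typescript
// Matches "<YYYY-MM-DD>-<roomIndex>-<timeIndex>", e.g. "2026-06-20-1-2".
const SLOT_ID = /^(\d{4}-\d{2}-\d{2})-(\d+)-(\d+)$/

interface ParsedSlotId {
    date: string
    roomIndex: number
    timeIndex: number
}

function parseSlotId(slotId: string): ParsedSlotId {
    const match = SLOT_ID.exec(slotId)
    if (!match) throw new Error(`Malformed slot id: ${slotId}`)
    const [, date = '', room = '', time = ''] = match
    return { date, roomIndex: Number(room), timeIndex: Number(time) }
}

console.log(parseSlotId('2026-06-20-1-2'))
// { date: '2026-06-20', roomIndex: 1, timeIndex: 2 }
```

With this in place, reserve() could look up DB[parseSlotId(slotId).date] and reject malformed ids before touching the store.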

Serving over HTTP

ChatGPT connects to remote apps over Streamable HTTP, not stdio. Wire the server to an HTTP transport:

// src/http.ts
import { createServer } from 'node:http'
import { StreamableHTTPServerTransport } from '@modelcontextprotocol/sdk/server/streamableHttp.js'
import { server } from './server.js'

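// Stateless mode with one shared transport: fine for a single-client demo.
// With concurrent clients, create a fresh transport (and server) per request
// so JSON-RPC request ids from different clients cannot collide.
const transport = new StreamableHTTPServerTransport({
    sessionIdGenerator: undefined,
})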

await server.connect(transport)

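createServer((req, res) => {
    if (req.url === '/mcp') {
        // handleRequest returns a promise; log failures and answer 500
        // instead of leaving the request hanging.
        transport.handleRequest(req, res).catch((err: unknown) => {
            console.error('MCP request failed:', err)
            if (!res.headersSent) res.writeHead(500).end()
        })
    } else {
        res.writeHead(404).end()
    }
}).listen(3000, () => console.log('MCP app on http://localhost:3000/mcp'))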

Make sure server.ts exports server (http.ts imports it), then run pnpm tsx src/http.ts.
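
The exact-match check on req.url is fine for a demo, but it returns 404 as soon as a client appends a query string, and it lets any HTTP method through. A slightly more careful router could look like the sketch below. routeFor and the Route type are hypothetical; the POST, GET, and DELETE set reflects the methods Streamable HTTP uses.

```typescript
type Route = 'mcp' | 'not-found' | 'method-not-allowed'

function routeFor(
    method: string | undefined,
    rawUrl: string | undefined
): Route {
    // Parse against a dummy base so only the path is compared.
    const { pathname } = new URL(rawUrl ?? '/', 'http://localhost')
    if (pathname !== '/mcp') return 'not-found'
    const allowed = method === 'POST' || method === 'GET' || method === 'DELETE'
    return allowed ? 'mcp' : 'method-not-allowed'
}

console.log(routeFor('POST', '/mcp?debug=1')) // mcp
console.log(routeFor('GET', '/health')) // not-found
console.log(routeFor('PUT', '/mcp')) // method-not-allowed
```

Inside createServer you would switch on routeFor(req.method, req.url) and answer 404 or 405 for the two non-MCP cases.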

Auth, briefly

The demo is open, but a real app booking real rooms needs auth. The Apps SDK uses OAuth 2.1: ChatGPT performs an authorization-code flow against your identity provider and sends a bearer token on every MCP request. In your tool handlers you validate that token and scope the data to the user. Don't trust slotId blindly — check the caller actually owns the calendar. Treat every tool input as adversarial; the model can be talked into passing things it shouldn't (see Prompt Injection Defense for JavaScript Apps).
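
As a rough sketch of the handler-side checks, here is the shape I have in mind: pull the bearer token off the request, resolve it to a caller, and refuse anything the caller doesn't own. Everything here is illustrative. bearerToken, Caller, and assertCanBook are made-up names, and actual token verification (signature, audience, expiry) is deliberately left out because it belongs to a vetted OAuth or JWT library.

```typescript
// Extract the token from an Authorization header, or null if absent/malformed.
function bearerToken(header: string | undefined): string | null {
    if (!header) return null
    const match = /^Bearer\s+(\S+)$/i.exec(header.trim())
    return match ? (match[1] ?? null) : null
}

// Hypothetical result of verifying a token with your identity provider.
interface Caller {
    userId: string
    calendars: string[]
}

// Throw unless the caller is allowed to book on the given calendar.
function assertCanBook(caller: Caller, calendarId: string): void {
    if (!caller.calendars.includes(calendarId)) {
        throw new Error('Not allowed to book on this calendar')
    }
}

console.log(bearerToken('Bearer abc.def.ghi')) // abc.def.ghi
console.log(bearerToken('Basic dXNlcjpwdw==')) // null
```

In book_room, a thrown error from assertCanBook would flow through the same isError path as a failed reservation.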

Testing locally

Two ways, in order of speed:

  • MCP Inspector: run pnpm dlx @modelcontextprotocol/inspector and point it at http://localhost:3000/mcp. It lists your tools and resources and lets you call them, so you can confirm the server contract before ChatGPT ever sees it. It won't render the widget, but it'll show you the structuredContent and _meta you're emitting.
  • ChatGPT developer mode — enable it in settings, add your server URL as a connector (use a tunnel like ngrok to expose localhost over HTTPS — ChatGPT will not connect to plain http://localhost). Now you can actually see the calendar render inline and click through a booking.

Iterate on the server in Inspector, then do the full visual loop in ChatGPT. Bouncing to ChatGPT for every change is slow.
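
For an even faster loop you can script a request yourself. The sketch below only builds the payload: a JSON-RPC tools/call message plus the headers Streamable HTTP expects on a POST. toolCall and asFetchInit are hypothetical helpers, and depending on how your transport is configured the server may insist on an initialize handshake before it accepts a tool call.

```typescript
interface JsonRpcRequest {
    jsonrpc: '2.0'
    id: number
    method: string
    params?: Record<string, unknown>
}

let nextId = 1

// Build an MCP "tools/call" request for the given tool and arguments.
function toolCall(name: string, args: Record<string, unknown>): JsonRpcRequest {
    return {
        jsonrpc: '2.0',
        id: nextId++,
        method: 'tools/call',
        params: { name, arguments: args },
    }
}

// Wrap a request in fetch() options with the content negotiation headers.
function asFetchInit(body: JsonRpcRequest) {
    return {
        method: 'POST',
        headers: {
            'content-type': 'application/json',
            accept: 'application/json, text/event-stream',
        },
        body: JSON.stringify(body),
    }
}

const init = asFetchInit(toolCall('list_rooms', { date: '2026-06-20' }))
console.log(init.body)
```

Passing init to fetch('http://localhost:3000/mcp', init) sends the call; the response may come back as plain JSON or as an event stream.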

Gotchas I hit

  • The widget can't see your server's environment. It runs in a sandboxed iframe on a separate origin that ChatGPT controls, not on your domain. Anything it needs must come through toolOutput or a callTool round-trip; a fetch to your private API only works if you allow that domain in openai/widgetCSP.
  • State is ephemeral unless you persist it. A re-render or a display-mode switch blows away component state. Use setWidgetState for anything that must survive, and read it back from window.openai.widgetState on mount.
  • structuredContent has a size budget. It's injected into the model context, so a 5,000-row table is a token bomb. Send a page of data and fetch more via callTool.
  • Bundle size matters. The widget JS is inlined into the resource. Keep dependencies lean — I dropped a date library and saved 40 KB that loaded on every render.
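
For the size-budget gotcha, the server-side half is a plain pagination helper. Here is a minimal sketch; pageOf and Page are names I'm inventing, and the cursor is just an array offset, which is enough for an in-memory store but not for data that changes between calls.

```typescript
interface Page<T> {
    items: T[]
    nextCursor: number | null // null when there is nothing left
    total: number
}

// Return one page of `all`, starting at `cursor`, with at most `size` items.
function pageOf<T>(all: readonly T[], cursor = 0, size = 20): Page<T> {
    const start = Math.max(0, Math.floor(cursor))
    const limit = Math.max(1, Math.floor(size))
    const items = all.slice(start, start + limit)
    const end = start + items.length
    return {
        items,
        nextCursor: end < all.length ? end : null,
        total: all.length,
    }
}

const rows = Array.from({ length: 50 }, (_, i) => i)
const first = pageOf(rows, 0, 20)
const last = pageOf(rows, 40, 20)
console.log(first.items.length, first.nextCursor) // 20 20
console.log(last.items.length, last.nextCursor) // 10 null
```

A widget-accessible tool could return one Page as structuredContent, and the widget would pass nextCursor back through callTool to load the next one.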

What's next

You now have a ChatGPT app that renders interactive UI and runs tools both ways. The natural next step is packaging the reusable instructions and helper scripts behind it so Claude (or any agent) can load them on demand instead of you hardcoding everything — that's Anthropic Agent Skills in TypeScript: Package Reusable Instructions and Code as Tools, coming next. If you want to go deeper on the UI side, Build Generative UI with Vercel AI SDK tackles the same "stream components, not markdown" idea from the other direction.

Vadim Alakhverdov

Software developer writing about JavaScript, web development, and developer tools.
