Build a ChatGPT App with the OpenAI Apps SDK and MCP in TypeScript
Monday 15/06/2026 · 11 min read
You've seen the demos: someone types a request into ChatGPT and a real, interactive UI appears inline (a map, a playlist, a booking form), not a blob of markdown. That's an Apps SDK app, and the moment you try to build one yourself you hit a wall. The OpenAI Apps SDK ships official TypeScript templates and almost no explanation of how the pieces talk to each other. Where does the UI live? How does your React component get the tool's data? How does a button click inside the widget run a tool back on your server?
This post answers exactly that. We'll build a working ChatGPT app with the OpenAI Apps SDK and MCP in TypeScript — a "book a meeting room" app with a calendar widget — and I'll be explicit about the part everyone gets stuck on: the MCP-to-iframe bridge.
If you've never built an MCP server, skim How to Build an MCP Server in TypeScript from Scratch first. This post assumes you know what a tool and a resource are.
How an Apps SDK app actually works
Here's the mental model that took me too long to figure out. An Apps SDK app is two things glued together:
- An MCP server: plain old Model Context Protocol. It exposes tools (functions the model calls) and resources (static assets). Nothing ChatGPT-specific so far.
- A widget: an HTML+JS bundle, served as an MCP resource, that ChatGPT renders in a sandboxed <iframe>. Inside that iframe you get a window.openai global that bridges back to ChatGPT over JSON-RPC on top of postMessage.
The glue is _meta. When you register a tool, you attach a _meta field that says "render this tool's result with that widget." When the tool runs, ChatGPT loads the widget HTML, hands it the structuredContent your tool returned, and your React app renders. When the user clicks a button in the widget, you call window.openai.callTool(...), which runs another tool on your server.
That's the whole architecture. Tools on the server, UI in an iframe, window.openai connecting them.
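To make the flow concrete, here is a toy model of what the host does with those pieces. It is a simplification for intuition, not ChatGPT's actual implementation: renderToolResult, ToolDescriptor, WidgetResource, and RenderPlan are names I made up for this sketch. Only the openai/outputTemplate key and structuredContent come from the real contract.

```typescript
// Toy model of the host side. All type and function names here are hypothetical.
interface ToolDescriptor {
  name: string
  _meta?: { 'openai/outputTemplate'?: string }
}

interface ToolResult {
  content: { type: 'text'; text: string }[]
  structuredContent?: Record<string, unknown>
}

interface WidgetResource {
  uri: string
  mimeType: string
  text: string
}

interface RenderPlan {
  html: string
  toolOutput: Record<string, unknown> | null
}

// Look up the widget a tool points at and pair it with the tool's data.
function renderToolResult(
  tool: ToolDescriptor,
  result: ToolResult,
  resources: WidgetResource[]
): RenderPlan | null {
  const uri = tool._meta?.['openai/outputTemplate']
  if (!uri) return null // no widget bound: only the text content is shown
  const widget = resources.find((r) => r.uri === uri)
  if (!widget) return null
  return { html: widget.text, toolOutput: result.structuredContent ?? null }
}

const resources: WidgetResource[] = [
  {
    uri: 'ui://widget/room-booking.html',
    mimeType: 'text/html+skybridge',
    text: '<div id="root"></div>',
  },
]

const listRooms: ToolDescriptor = {
  name: 'list_rooms',
  _meta: { 'openai/outputTemplate': 'ui://widget/room-booking.html' },
}

const plan = renderToolResult(
  listRooms,
  {
    content: [{ type: 'text', text: 'Found 9 slots.' }],
    structuredContent: { date: '2026-06-20', slots: [] },
  },
  resources
)
console.log(plan?.toolOutput)
```

A tool without the outputTemplate key (like the booking action later in this post) yields no render plan at all, which is why it can return data without drawing anything.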
Project setup
mkdir room-booking-app && cd room-booking-app
pnpm init
pnpm add @modelcontextprotocol/sdk zod
pnpm add -D typescript tsx esbuild react react-dom @types/react @types/react-dom
We use the official MCP SDK for the server, esbuild to bundle the widget into a single file, and React for the UI. Two build targets: the Node server and the browser widget.
The widget: a React calendar
Start with the UI, because it defines the data contract. The widget reads window.openai.toolOutput — that's the structuredContent your tool returned — and renders it.
// src/widget/RoomBooking.tsx
import { useEffect, useState } from 'react'

interface Slot {
  id: string
  room: string
  time: string
  available: boolean
}

interface ToolOutput {
  date: string
  slots: Slot[]
}

// The bridge ChatGPT injects into the iframe.
interface OpenAiGlobals {
  toolOutput: ToolOutput | null
  widgetState: { selectedSlotId?: string } | null
  callTool: (name: string, args: Record<string, unknown>) => Promise<unknown>
  setWidgetState: (state: Record<string, unknown>) => Promise<void>
  sendFollowUpMessage: (args: { prompt: string }) => Promise<void>
}

declare global {
  interface Window {
    openai: OpenAiGlobals
  }
}

export function RoomBooking() {
  const [output, setOutput] = useState<ToolOutput | null>(
    window.openai.toolOutput
  )
  const [selected, setSelected] = useState<string | undefined>(
    window.openai.widgetState?.selectedSlotId
  )
  const [booking, setBooking] = useState(false)
  const [error, setError] = useState<string | null>(null)

  // ChatGPT pushes new data via a custom event whenever globals change.
  useEffect(() => {
    const onUpdate = () => setOutput(window.openai.toolOutput)
    window.addEventListener('openai:set_globals', onUpdate)
    return () => window.removeEventListener('openai:set_globals', onUpdate)
  }, [])

  if (!output) return <p>Loading availability…</p>

  async function book(slot: Slot) {
    setBooking(true)
    setError(null)
    try {
      // Run a tool back on the MCP server.
      await window.openai.callTool('book_room', { slotId: slot.id })
      setSelected(slot.id)
      // Persist selection so it survives a re-render or reload.
      await window.openai.setWidgetState({ selectedSlotId: slot.id })
      // Nudge the conversation forward in natural language.
      await window.openai.sendFollowUpMessage({
        prompt: `Booked ${slot.room} at ${slot.time}. Confirm it for me.`,
      })
    } catch (e) {
      setError(e instanceof Error ? e.message : 'Booking failed')
    } finally {
      setBooking(false)
    }
  }

  return (
    <div className="grid">
      <h2>Rooms for {output.date}</h2>
      {output.slots.map((slot) => (
        <button
          key={slot.id}
          disabled={!slot.available || booking}
          aria-pressed={selected === slot.id}
          onClick={() => book(slot)}
        >
          {slot.room} at {slot.time}
          {!slot.available && ' (taken)'}
          {selected === slot.id && ' ✓'}
        </button>
      ))}
      {error && <p role="alert">{error}</p>}
    </div>
  )
}
Three window.openai calls do all the work: callTool runs server logic, setWidgetState persists UI state across re-renders, and sendFollowUpMessage writes a message back into the conversation so the model can keep going. Notice the error handling — callTool rejects on a failed tool, and you want that surfaced in the UI, not swallowed.
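Because the component only touches the bridge through those three calls, you can exercise the booking sequence without ChatGPT by swapping in a fake. The sketch below is my own test scaffolding, not part of the Apps SDK: createFakeBridge and bookSlot are hypothetical names, and the fake mimics only the subset of window.openai used in this post (including the assumption that callTool rejects when the tool fails).

```typescript
// A fake bridge for tests. Hypothetical scaffolding, not an SDK API.
type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>

interface FakeBridge {
  widgetState: Record<string, unknown> | null
  followUps: string[]
  callTool(name: string, args: Record<string, unknown>): Promise<unknown>
  setWidgetState(state: Record<string, unknown>): Promise<void>
  sendFollowUpMessage(args: { prompt: string }): Promise<void>
}

function createFakeBridge(tools: Record<string, ToolHandler>): FakeBridge {
  const bridge: FakeBridge = {
    widgetState: null,
    followUps: [],
    async callTool(name, args) {
      const handler = tools[name]
      if (!handler) throw new Error(`Unknown tool: ${name}`)
      return handler(args)
    },
    async setWidgetState(state) {
      bridge.widgetState = state
    },
    async sendFollowUpMessage({ prompt }) {
      bridge.followUps.push(prompt)
    },
  }
  return bridge
}

// The same three-step sequence as book() in the component, minus React state.
// Returns null on success or the error message to display.
async function bookSlot(
  bridge: FakeBridge,
  slot: { id: string; room: string; time: string }
): Promise<string | null> {
  try {
    await bridge.callTool('book_room', { slotId: slot.id })
    await bridge.setWidgetState({ selectedSlotId: slot.id })
    await bridge.sendFollowUpMessage({
      prompt: `Booked ${slot.room} at ${slot.time}. Confirm it for me.`,
    })
    return null
  } catch (e) {
    return e instanceof Error ? e.message : 'Booking failed'
  }
}
```

The ordering matters: state is persisted and the follow-up is sent only after callTool resolves, so a failed booking leaves widgetState untouched and sends nothing to the conversation.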
The mount point:
// src/widget/index.tsx
import { createRoot } from 'react-dom/client'
import { RoomBooking } from './RoomBooking'
const el = document.getElementById('root')
if (el) createRoot(el).render(<RoomBooking />)
Bundle it to a single IIFE file so it can be inlined into the resource:
pnpm esbuild src/widget/index.tsx \
--bundle --format=iife --minify \
--outfile=dist/widget.js
The MCP server
Now the backend. We define two tools — list_rooms (returns availability and renders the widget) and book_room (the action the widget calls back) — and one resource (the widget HTML).
// src/server.ts
import { readFileSync } from 'node:fs'
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js'
import { z } from 'zod'
const WIDGET_URI = 'ui://widget/room-booking.html'
// Inline the bundled JS into a minimal HTML shell.
const widgetJs = readFileSync('dist/widget.js', 'utf8')
const widgetHtml = `<!DOCTYPE html>
<html>
<head><meta charset="utf-8" /></head>
<body><div id="root"></div><script>${widgetJs}</script></body>
</html>`
const server = new McpServer({ name: 'room-booking', version: '1.0.0' })
Register the widget as a resource
The widget is an MCP resource with the special text/html+skybridge mime type. That mime type is the signal that tells ChatGPT "this resource is a renderable Apps SDK widget," not a plain document.
// src/server.ts (continued)
server.registerResource(
  'room-booking-widget',
  WIDGET_URI,
  {},
  async () => ({
    contents: [
      {
        uri: WIDGET_URI,
        mimeType: 'text/html+skybridge',
        text: widgetHtml,
        // Widget metadata sits on the content entry, next to the HTML it describes.
        _meta: {
          // Shown to the model so it knows what the widget is for.
          'openai/widgetDescription':
            'Interactive meeting-room availability calendar.',
          // Lock down what the iframe can load. Be strict.
          'openai/widgetCSP': {
            connect_domains: [],
            resource_domains: [],
          },
        },
      },
    ],
  })
)
The openai/widgetCSP block is not optional in practice — without a content security policy your widget can be rejected, and a tight policy is your first line of defense against a compromised dependency phoning home. We allow nothing external here because the widget is fully self-contained.
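If the widget did need to reach an external API, the policy would list that origin explicitly. Here is a minimal sketch, assuming a hypothetical buildWidgetCsp helper and a placeholder domain (api.example.com). The two field names are the ones used above; the helper and its strictness rules are my own.

```typescript
// Hypothetical helper: builds the CSP object and rejects anything
// that is not a bare https origin (no path, no query, no http).
interface WidgetCsp {
  connect_domains: string[]
  resource_domains: string[]
}

function assertHttpsOrigin(value: string): string {
  const url = new URL(value) // throws on malformed input
  if (url.protocol !== 'https:') throw new Error(`Not https: ${value}`)
  if (url.origin !== value) throw new Error(`Expected a bare origin, got: ${value}`)
  return value
}

function buildWidgetCsp(
  connect: string[] = [],
  resources: string[] = []
): WidgetCsp {
  return {
    connect_domains: connect.map(assertHttpsOrigin),
    resource_domains: resources.map(assertHttpsOrigin),
  }
}

// api.example.com is a placeholder for an API the widget would fetch from.
const csp = buildWidgetCsp(['https://api.example.com'])
console.log(csp)
```

Failing at startup on a sloppy entry is cheaper than discovering in ChatGPT that a fetch was silently blocked.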
The tool that renders the widget
This is where _meta does its job. The tool returns structuredContent (the data the widget reads as toolOutput) and points openai/outputTemplate at the resource URI.
// src/server.ts (continued)
import { getAvailability, reserve } from './rooms.js'
server.registerTool(
  'list_rooms',
  {
    title: 'List available meeting rooms',
    description: 'Show bookable meeting rooms for a given date.',
    inputSchema: { date: z.string().describe('ISO date, e.g. 2026-06-20') },
    _meta: {
      // Bind this tool's result to the widget.
      'openai/outputTemplate': WIDGET_URI,
      // Status strings ChatGPT shows while the tool runs.
      'openai/toolInvocation/invoking': 'Checking room availability…',
      'openai/toolInvocation/invoked': 'Loaded room availability',
    },
  },
  async ({ date }) => {
    const slots = await getAvailability(date)
    return {
      // Text the model reads. Keep it short; the widget is the real UI.
      // Each entry is a room-and-time slot, so count slots, not rooms.
      content: [
        { type: 'text', text: `Found ${slots.length} slots for ${date}.` },
      ],
      // The widget reads this as window.openai.toolOutput.
      structuredContent: { date, slots },
    }
  }
)
Two payloads come out of one tool. content is for the model (so it can reason and narrate); structuredContent is for the widget. Keep the text terse — duplicating the whole calendar into content just burns tokens and risks the model contradicting the UI.
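One way to keep that split honest is to build both payloads in a single pure function, so the sentence the model reads is always derived from the data the widget gets. This is a sketch of my own, not an SDK feature; buildListRoomsResult is a hypothetical helper name.

```typescript
interface Slot {
  id: string
  room: string
  time: string
  available: boolean
}

interface ListRoomsResult {
  content: { type: 'text'; text: string }[]
  structuredContent: { date: string; slots: Slot[] }
}

// Hypothetical helper: one sentence for the model, full data for the widget.
function buildListRoomsResult(date: string, slots: Slot[]): ListRoomsResult {
  const free = slots.filter((s) => s.available).length
  return {
    content: [
      {
        type: 'text',
        text: `${free} of ${slots.length} slots are free on ${date}.`,
      },
    ],
    structuredContent: { date, slots },
  }
}

const result = buildListRoomsResult('2026-06-20', [
  { id: '2026-06-20-0-0', room: 'Oslo', time: '09:00', available: true },
  { id: '2026-06-20-0-1', room: 'Oslo', time: '11:00', available: false },
])
console.log(result.content[0].text)
```

The summary carries just enough for the model to narrate ("one slot is free") without restating the calendar.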
The action tool
When the widget calls window.openai.callTool('book_room', ...), it lands here. This tool has no widget — it just mutates state and returns a confirmation.
// src/server.ts (continued)
server.registerTool(
  'book_room',
  {
    title: 'Book a meeting room',
    description: 'Reserve a specific room slot by id.',
    inputSchema: { slotId: z.string() },
    _meta: {
      // Allow the widget itself to invoke this tool.
      'openai/widgetAccessible': true,
    },
  },
  async ({ slotId }) => {
    try {
      const confirmation = await reserve(slotId)
      return {
        content: [
          { type: 'text', text: `Reserved. Confirmation: ${confirmation.id}` },
        ],
        structuredContent: { confirmationId: confirmation.id },
      }
    } catch (e) {
      const msg = e instanceof Error ? e.message : 'Reservation failed'
      return {
        isError: true,
        content: [{ type: 'text', text: msg }],
      }
    }
  }
)
The openai/widgetAccessible: true flag is a gotcha that cost me an hour. By default a widget cannot call arbitrary tools — only ones explicitly marked accessible. Forget it and callTool rejects with a permission error that looks like a transport bug. Also note isError: true: when the reservation fails, that's what makes callTool reject in the widget, which lets our catch block show the error.
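Once you have more than one action tool, repeating that try/catch in every handler gets old. A small wrapper can do the conversion in one place. This is a sketch under my own naming (withToolErrors is hypothetical, and the ToolResult type is a trimmed-down stand-in for the SDK's result type), so adapt it to whatever types your SDK version exports.

```typescript
// Trimmed-down stand-in for the SDK's tool result type.
interface ToolResult {
  isError?: boolean
  content: { type: 'text'; text: string }[]
  structuredContent?: Record<string, unknown>
}

// Hypothetical wrapper: turns any thrown error into an isError result.
function withToolErrors<A>(
  handler: (args: A) => Promise<ToolResult>,
  fallback = 'Tool failed'
): (args: A) => Promise<ToolResult> {
  return async (args) => {
    try {
      return await handler(args)
    } catch (e) {
      const text = e instanceof Error ? e.message : fallback
      return { isError: true, content: [{ type: 'text', text }] }
    }
  }
}

// Usage: the inner handler just throws; the wrapper shapes the failure.
const safeBook = withToolErrors(async ({ slotId }: { slotId: string }) => {
  if (slotId === 'taken') throw new Error('That room was just taken')
  return {
    content: [{ type: 'text' as const, text: `Reserved ${slotId}` }],
    structuredContent: { confirmationId: `CONF-${slotId}` },
  }
})
```

The wrapped function has the same signature as the original handler, so it can be passed wherever the plain handler was.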
The data layer
Real logic, not placeholders — swap the in-memory store for your database.
// src/rooms.ts
interface Slot {
  id: string
  room: string
  time: string
  available: boolean
}

// In-memory store keyed by ISO date.
const DB: Record<string, Slot[]> = {}

function seed(date: string): Slot[] {
  const rooms = ['Oslo', 'Lisbon', 'Tokyo']
  const times = ['09:00', '11:00', '14:00']
  return rooms.flatMap((room, r) =>
    times.map((time, t) => ({
      id: `${date}-${r}-${t}`,
      room,
      time,
      available: true,
    }))
  )
}

export async function getAvailability(date: string): Promise<Slot[]> {
  if (!DB[date]) DB[date] = seed(date)
  return DB[date]
}

export async function reserve(slotId: string): Promise<{ id: string }> {
  // Slot ids start with the ISO date (YYYY-MM-DD), so the first three
  // dash-separated segments recover the DB key.
  const date = slotId.split('-').slice(0, 3).join('-')
  const slot = DB[date]?.find((s) => s.id === slotId)
  if (!slot) throw new Error(`Unknown slot: ${slotId}`)
  if (!slot.available) throw new Error('That room was just taken')
  slot.available = false
  return { id: `CONF-${slotId}` }
}
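The split-and-slice in reserve only works because seed builds ids from a YYYY-MM-DD date. If you keep ids in that shape, a stricter parser makes the assumption explicit and rejects anything else up front. A sketch, assuming the id format from seed above; parseSlotId and ParsedSlotId are hypothetical names.

```typescript
// Slot ids in this post look like "2026-06-20-0-1":
// ISO date, then room index, then time index.
const SLOT_ID = /^(\d{4}-\d{2}-\d{2})-(\d+)-(\d+)$/

interface ParsedSlotId {
  date: string
  roomIndex: number
  timeIndex: number
}

// Hypothetical helper: validates the shape before anything touches the store.
function parseSlotId(slotId: string): ParsedSlotId {
  const match = SLOT_ID.exec(slotId)
  if (!match) throw new Error(`Malformed slot id: ${slotId}`)
  return {
    date: match[1],
    roomIndex: Number(match[2]),
    timeIndex: Number(match[3]),
  }
}

const parsed = parseSlotId('2026-06-20-0-1')
console.log(parsed.date, parsed.roomIndex, parsed.timeIndex)
```

With a real database you would more likely use opaque ids and look the date up, which sidesteps parsing entirely.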
Serving over HTTP
ChatGPT connects to remote apps over Streamable HTTP, not stdio. Wire the server to an HTTP transport:
// src/http.ts
import { createServer } from 'node:http'
import { StreamableHTTPServerTransport } from '@modelcontextprotocol/sdk/server/streamableHttp.js'
import { server } from './server.js'
const transport = new StreamableHTTPServerTransport({
  sessionIdGenerator: undefined, // stateless mode; fine for this demo
})

await server.connect(transport)

createServer((req, res) => {
  if (req.url === '/mcp') {
    transport.handleRequest(req, res)
  } else {
    res.writeHead(404).end()
  }
}).listen(3000, () => console.log('MCP app on http://localhost:3000/mcp'))
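Export server from server.ts (export const server = ...), build the widget first so dist/widget.js exists, and run pnpm tsx src/http.ts. The top-level await in http.ts needs ES modules, so set "type": "module" in package.json if it isn't there already.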
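One small hardening step: the strict req.url === '/mcp' comparison fails as soon as a client appends a query string or a trailing slash. A sketch of a more forgiving check, using only the standard URL class; isMcpPath is a hypothetical helper name.

```typescript
// Hypothetical helper: match the path only, ignoring any query string
// and tolerating a trailing slash.
function isMcpPath(rawUrl: string | undefined): boolean {
  if (!rawUrl) return false
  // req.url is path-only, so a dummy base is needed to parse it.
  const { pathname } = new URL(rawUrl, 'http://localhost')
  return pathname === '/mcp' || pathname === '/mcp/'
}

console.log(isMcpPath('/mcp?session=abc'))
```

In the handler above you would swap the condition for isMcpPath(req.url) and leave the rest unchanged.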
Auth, briefly
The demo is open, but a real app booking real rooms needs auth. The Apps SDK uses OAuth 2.1: ChatGPT performs an authorization-code flow against your identity provider and sends a bearer token on every MCP request. In your tool handlers you validate that token and scope the data to the user. Don't trust slotId blindly — check the caller actually owns the calendar. Treat every tool input as adversarial; the model can be talked into passing things it shouldn't (see Prompt Injection Defense for JavaScript Apps).
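The two checks that paragraph describes can be sketched as plain functions. Everything here is illustrative: the Caller shape, extractBearer, and assertOwnsCalendar are hypothetical names, and actual token verification (signature, issuer, audience, expiry) is deliberately left out because it depends on your identity provider.

```typescript
// Hypothetical shapes: this post does not define a user or calendar model.
interface Caller {
  userId: string
  calendarIds: string[]
}

// Pull the token out of an Authorization header; null if absent or malformed.
function extractBearer(header: string | undefined): string | null {
  if (!header) return null
  const match = /^Bearer\s+(\S+)$/i.exec(header.trim())
  return match ? match[1] : null
}

// Scope check: refuse unless the already-verified caller owns the calendar.
function assertOwnsCalendar(caller: Caller, calendarId: string): void {
  if (!caller.calendarIds.includes(calendarId)) {
    throw new Error('Caller does not own this calendar')
  }
}

const token = extractBearer('Bearer abc.def.ghi')
console.log(token !== null)
```

In a tool handler the order would be: extract the token, verify it with your provider's library to get a Caller, then run the ownership check before touching any data.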
Testing locally
Two ways, in order of speed:
- MCP Inspector: run pnpm dlx @modelcontextprotocol/inspector and point it at http://localhost:3000/mcp. It lists your tools and resources and lets you call them, so you can confirm the server contract before ChatGPT ever sees it. It won't render the widget, but it'll show you the structuredContent and _meta you're emitting.
- ChatGPT developer mode: enable it in settings, then add your server URL as a connector. Use a tunnel like ngrok to expose localhost over HTTPS, because ChatGPT will not connect to plain http://localhost. Now you can actually see the calendar render inline and click through a booking.
Iterate on the server in Inspector, then do the full visual loop in ChatGPT. Bouncing to ChatGPT for every change is slow.
Gotchas I hit
- The widget can't see your server's environment. It runs in a sandboxed iframe on ChatGPT's origin. Anything it needs must come through toolOutput or a callTool round-trip. There is no fetch to your private API unless you allow that domain in openai/widgetCSP.
- State is ephemeral unless you persist it. A re-render or a display-mode switch blows away component state. Use setWidgetState for anything that must survive, and read it back from window.openai.widgetState on mount.
- structuredContent has a size budget. It's injected into the model context, so a 5,000-row table is a token bomb. Send a page of data and fetch more via callTool.
- Bundle size matters. The widget JS is inlined into the resource. Keep dependencies lean: I dropped a date library and saved 40 KB that loaded on every render.
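For the size-budget gotcha, the server side of "send a page and fetch more" can be as small as a slice with a cursor. A minimal sketch, assuming a stringified offset as the cursor; paginate and Page are hypothetical names, and a real store would more likely use an opaque key than a raw offset.

```typescript
interface Page<T> {
  items: T[]
  nextCursor: string | null // null means there is nothing left to fetch
}

// Hypothetical helper: the cursor is just a stringified offset.
function paginate<T>(all: T[], cursor: string | null, pageSize = 20): Page<T> {
  const start = cursor === null ? 0 : Number.parseInt(cursor, 10)
  if (!Number.isInteger(start) || start < 0) {
    throw new Error(`Bad cursor: ${cursor}`)
  }
  const items = all.slice(start, start + pageSize)
  const end = start + items.length
  return { items, nextCursor: end < all.length ? String(end) : null }
}

const rows = Array.from({ length: 45 }, (_, i) => i)
const first = paginate(rows, null)
const second = paginate(rows, first.nextCursor)
const third = paginate(rows, second.nextCursor)
console.log(first.items.length, second.items.length, third.items.length)
```

The first page would go out in structuredContent along with nextCursor, and a widget-accessible tool taking the cursor as input would return each later page through callTool.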
What's next
You now have a ChatGPT app that renders interactive UI and runs tools both ways. The natural next step is packaging the reusable instructions and helper scripts behind it so Claude (or any agent) can load them on demand instead of you hardcoding everything — that's Anthropic Agent Skills in TypeScript: Package Reusable Instructions and Code as Tools, coming next. If you want to go deeper on the UI side, Build Generative UI with Vercel AI SDK tackles the same "stream components, not markdown" idea from the other direction.