How to Build an AI-Powered Autocomplete for Any Text Input
Friday, 27 March 2026
11 min read

You're typing in a text field and a faint suggestion appears ahead of your cursor — press Tab to accept, keep typing to ignore. GitHub Copilot made this interaction feel magical, and now your users expect it everywhere. But building AI-powered autocomplete for a regular textarea is surprisingly tricky: you need to manage streaming, debouncing, request cancellation, and ghost text rendering without turning your input into a laggy mess.
This tutorial builds a reusable React component that streams AI autocomplete suggestions as inline ghost text for any text input. It works with any LLM provider — we'll use the Vercel AI SDK for the streaming layer and Claude as the model, but swapping to OpenAI or a local model is a one-line change.
The architecture
Here's what happens on every keystroke (or rather, after a debounced pause):
- User stops typing for 300ms
- We send the current text + cursor position to an API route
- The API streams back a completion token-by-token
- Ghost text appears inline after the cursor
- User presses Tab to accept or keeps typing to dismiss
The key insight: we don't render the ghost text in a separate overlay. We use a single <div contentEditable> with a <span> for the suggestion styled with reduced opacity. This avoids the positioning nightmare of overlaying elements on a textarea.
Install the two dependencies — the Anthropic provider reads your ANTHROPIC_API_KEY from the environment:
pnpm install ai @ai-sdk/anthropic
The API route
The backend is a Next.js API route that takes the text before the cursor and streams a completion.
// src/pages/api/autocomplete.ts
import { streamText } from 'ai'
import { anthropic } from '@ai-sdk/anthropic'
import type { NextRequest } from 'next/server'
export const config = { runtime: 'edge' }
export default async function handler(req: NextRequest) {
if (req.method !== 'POST') {
return new Response('Method not allowed', { status: 405 })
}
const { textBeforeCursor, textAfterCursor, context } = await req.json() as {
textBeforeCursor: string
textAfterCursor: string
context?: string
}
if (!textBeforeCursor || textBeforeCursor.trim().length < 5) {
return new Response('', { status: 200 })
}
const result = streamText({
model: anthropic('claude-sonnet-4-5-20250929'),
maxTokens: 60,
temperature: 0.3,
system: `You are an inline text autocomplete engine. Given the text before and after the cursor, predict what the user is about to type next. Rules:
- Output ONLY the predicted text, nothing else
- Keep suggestions short (1-2 sentences max)
- Match the tone and style of the existing text
- If the text after the cursor already continues naturally, return empty
- Do not repeat text that already exists before the cursor
${context ? `Context about this text field: ${context}` : ''}`,
prompt: `Text before cursor: "${textBeforeCursor}"${textAfterCursor ? `\nText after cursor: "${textAfterCursor}"` : ''}`,
})
return result.toDataStreamResponse()
}
A few things worth noting:
- maxTokens: 60 keeps suggestions short. Nobody wants a wall of ghost text.
- temperature: 0.3 makes completions predictable. Autocomplete should feel obvious, not creative.
- We return early if the text is too short — no point suggesting after "Hi".
- The context parameter lets you hint at what the field is for (e.g., "This is a customer support email reply").
The ghost text hook
This is where the real complexity lives. We need to manage debouncing, AbortController for cancellation, and the streaming state.
// src/hooks/useAutocomplete.ts
import { useCallback, useRef, useState } from 'react'
interface AutocompleteOptions {
debounceMs?: number
context?: string
enabled?: boolean
}
interface AutocompleteState {
suggestion: string
isLoading: boolean
}
export function useAutocomplete(options: AutocompleteOptions = {}) {
const { debounceMs = 300, context, enabled = true } = options
const [state, setState] = useState<AutocompleteState>({
suggestion: '',
isLoading: false,
})
const abortControllerRef = useRef<AbortController | null>(null)
const debounceTimerRef = useRef<ReturnType<typeof setTimeout> | null>(null)
const cancelSuggestion = useCallback(() => {
if (debounceTimerRef.current) {
clearTimeout(debounceTimerRef.current)
debounceTimerRef.current = null
}
if (abortControllerRef.current) {
abortControllerRef.current.abort()
abortControllerRef.current = null
}
setState({ suggestion: '', isLoading: false })
}, [])
const requestSuggestion = useCallback(
(textBeforeCursor: string, textAfterCursor: string) => {
if (!enabled) return
// Cancel any in-flight request
cancelSuggestion()
debounceTimerRef.current = setTimeout(async () => {
const controller = new AbortController()
abortControllerRef.current = controller
setState((prev) => ({ ...prev, isLoading: true }))
try {
const response = await fetch('/api/autocomplete', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ textBeforeCursor, textAfterCursor, context }),
signal: controller.signal,
})
if (!response.ok || !response.body) {
setState({ suggestion: '', isLoading: false })
return
}
const reader = response.body.getReader()
const decoder = new TextDecoder()
let fullSuggestion = ''
let buffer = ''
while (true) {
const { done, value } = await reader.read()
if (done) break
buffer += decoder.decode(value, { stream: true })
// Parse the Vercel AI SDK data stream format. A line can be split
// across network chunks, so keep the trailing partial line in the
// buffer until the next chunk arrives
const lines = buffer.split('\n')
buffer = lines.pop() ?? ''
for (const line of lines) {
// Text parts in the data stream start with '0:'
if (line.startsWith('0:')) {
const text = JSON.parse(line.slice(2)) as string
fullSuggestion += text
setState({ suggestion: fullSuggestion, isLoading: false })
}
}
}
} catch (err: unknown) {
if (err instanceof Error && err.name !== 'AbortError') {
console.error('Autocomplete error:', err)
}
setState({ suggestion: '', isLoading: false })
}
}, debounceMs)
},
[enabled, debounceMs, context, cancelSuggestion],
)
const acceptSuggestion = useCallback(() => {
const accepted = state.suggestion
setState({ suggestion: '', isLoading: false })
return accepted
}, [state.suggestion])
return {
suggestion: state.suggestion,
isLoading: state.isLoading,
requestSuggestion,
cancelSuggestion,
acceptSuggestion,
}
}
The critical pattern here is cancel-before-request. Every new keystroke cancels the previous debounce timer AND aborts any in-flight fetch. Without this, you get stale suggestions appearing after the user has moved on.
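To see the pattern in isolation, here's a minimal framework-free sketch (the latestWins name is ours, not part of the hook): each call aborts the previous one, and a result is discarded if it finishes after a newer call has started.

```typescript
// Cancel-before-request in isolation: each new call aborts the previous
// in-flight one, and results from superseded calls are discarded.
function latestWins<T>() {
  let controller: AbortController | null = null
  return async (
    work: (signal: AbortSignal) => Promise<T>,
  ): Promise<T | undefined> => {
    // Abort whatever is still running before starting the new call
    controller?.abort()
    const mine = new AbortController()
    controller = mine
    try {
      const result = await work(mine.signal)
      // If a newer call aborted us while we awaited, drop the result
      return mine.signal.aborted ? undefined : result
    } catch {
      return undefined
    }
  }
}
```

In the hook, cancelSuggestion plays the role of the `controller?.abort()` line, and passing the signal to fetch wires the abort into the network layer.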
The autocomplete textarea component
Now we combine the hook with a contentEditable div that renders ghost text inline.
// src/components/AutocompleteTextarea.tsx
import React, { useRef, useCallback, KeyboardEvent, FormEvent } from 'react'
import { useAutocomplete } from '../hooks/useAutocomplete'
interface AutocompleteTextareaProps {
value: string
onChange: (value: string) => void
placeholder?: string
context?: string
className?: string
}
export function AutocompleteTextarea({
value,
onChange,
placeholder = 'Start typing...',
context,
className = '',
}: AutocompleteTextareaProps) {
const editableRef = useRef<HTMLDivElement>(null)
const {
suggestion,
isLoading,
requestSuggestion,
cancelSuggestion,
acceptSuggestion,
} = useAutocomplete({ context })
const getTextAroundCursor = useCallback((): {
before: string
after: string
} => {
const selection = window.getSelection()
if (!selection || !selection.rangeCount || !editableRef.current) {
return { before: value, after: '' }
}
const range = selection.getRangeAt(0)
const preRange = document.createRange()
preRange.setStart(editableRef.current, 0)
preRange.setEnd(range.startContainer, range.startOffset)
const before = preRange.toString()
const after = value.slice(before.length)
return { before, after }
}, [value])
const handleInput = useCallback(
(e: FormEvent<HTMLDivElement>) => {
const target = e.currentTarget
// Remove any ghost text spans before reading content
const ghost = target.querySelector('[data-ghost]')
if (ghost) ghost.remove()
const newValue = target.textContent ?? ''
onChange(newValue)
const { before, after } = getTextAroundCursor()
requestSuggestion(before, after)
},
[onChange, getTextAroundCursor, requestSuggestion],
)
const handleKeyDown = useCallback(
(e: KeyboardEvent<HTMLDivElement>) => {
if (e.key === 'Tab' && suggestion) {
e.preventDefault()
const accepted = acceptSuggestion()
const newValue = value + accepted
onChange(newValue)
// Update the contentEditable and place cursor at end
if (editableRef.current) {
editableRef.current.textContent = newValue
const range = document.createRange()
const selection = window.getSelection()
range.selectNodeContents(editableRef.current)
range.collapse(false)
selection?.removeAllRanges()
selection?.addRange(range)
}
}
if (e.key === 'Escape') {
cancelSuggestion()
}
},
[suggestion, acceptSuggestion, cancelSuggestion, value, onChange],
)
return (
<div className="relative">
<div
ref={editableRef}
contentEditable
suppressContentEditableWarning
role="textbox"
aria-placeholder={placeholder}
aria-label="Text input with AI autocomplete"
className={`min-h-[120px] p-3 border rounded-lg focus:outline-none focus:ring-2
focus:ring-blue-500 whitespace-pre-wrap break-words ${className}`}
onInput={handleInput}
onKeyDown={handleKeyDown}
onBlur={cancelSuggestion}
/>
{suggestion && (
<GhostOverlay
containerRef={editableRef}
suggestion={suggestion}
/>
)}
{isLoading && (
<span className="absolute top-2 right-2 text-xs text-gray-400">
thinking...
</span>
)}
{!value && !suggestion && (
<div className="absolute top-3 left-3 text-gray-400 pointer-events-none">
{placeholder}
</div>
)}
</div>
)
}
function GhostOverlay({
containerRef,
suggestion,
}: {
containerRef: React.RefObject<HTMLDivElement>
suggestion: string
}) {
// DOM mutation is a side effect, so it belongs in useEffect —
// React may invoke the render body more than once per commit
React.useEffect(() => {
const container = containerRef.current
if (!container) return
// Replace any stale ghost span with a fresh one
container.querySelector('[data-ghost]')?.remove()
const ghostSpan = document.createElement('span')
ghostSpan.setAttribute('data-ghost', 'true')
ghostSpan.textContent = suggestion
ghostSpan.style.opacity = '0.4'
ghostSpan.style.pointerEvents = 'none'
ghostSpan.contentEditable = 'false'
// Insert the ghost span at the current cursor position
const selection = window.getSelection()
if (selection && selection.rangeCount > 0) {
const range = selection.getRangeAt(0)
range.insertNode(ghostSpan)
range.setStartAfter(ghostSpan)
range.collapse(true)
}
// Clean up when the suggestion changes or the overlay unmounts
return () => ghostSpan.remove()
}, [containerRef, suggestion])
return null
}
Why contentEditable instead of a regular <textarea>? Because textareas are opaque — you can't render styled inline content inside them. With contentEditable, we insert a ghost <span> directly at the cursor position. The tradeoff is that contentEditable is notoriously finicky, but for this use case the complexity is manageable.
Gotcha: cleaning up ghost text on input
The trickiest bug you'll hit is ghost text contaminating the actual value. When the user types, the browser treats the ghost span as part of the content. That's why handleInput removes the ghost span before reading textContent. Without this, you'll end up with ghost text permanently baked into the value.
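Since the ghost text is always a suffix of what textContent returns, you can also add a belt-and-braces check at the string level alongside the DOM-level removal in handleInput (the helper name is ours):

```typescript
// Remove a trailing ghost suggestion from raw textContent, if present,
// so the suggestion never leaks into the committed value.
function stripGhostSuffix(raw: string, suggestion: string): string {
  if (suggestion && raw.endsWith(suggestion)) {
    return raw.slice(0, raw.length - suggestion.length)
  }
  return raw
}
```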
Optimizing token costs
Autocomplete fires constantly. If you're not careful, you'll burn through API tokens fast. Here's what helps:
// src/hooks/useAutocomplete.ts — add to requestSuggestion
const MIN_TEXT_LENGTH = 10
const MAX_CONTEXT_CHARS = 500
// Inside the debounce callback, before the fetch:
if (textBeforeCursor.trim().length < MIN_TEXT_LENGTH) {
setState({ suggestion: '', isLoading: false })
return
}
// Only send the last N characters as context
const trimmedBefore = textBeforeCursor.slice(-MAX_CONTEXT_CHARS)
// ...and send trimmedBefore in the fetch body instead of the full text
Other cost-saving strategies:
- Increase debounce to 500ms for fields where users type slowly (like long-form content)
- Use a smaller model — claude-haiku-4-5-20251001 works great for short completions and costs a fraction of Sonnet
- Cache completions — if the user deletes a word and retypes it, serve the cached suggestion. A simple Map<string, string> with the last 50 entries works fine
- Skip when cursor isn't at the end — if the user is editing in the middle of text, autocomplete is usually more annoying than helpful
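The cursor-at-end check from the last bullet is tiny, but worth naming as a helper (the name is ours) so it can gate requestSuggestion:

```typescript
// Only offer suggestions when the cursor is effectively at the end:
// everything after it is whitespace.
function cursorAtEnd(textAfterCursor: string): boolean {
  return textAfterCursor.trim().length === 0
}
```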
Here's a quick cache implementation:
// src/lib/suggestion-cache.ts
const MAX_CACHE_SIZE = 50
export class SuggestionCache {
private cache = new Map<string, string>()
private makeKey(text: string): string {
// Use last 100 chars as cache key for locality
return text.slice(-100).trim()
}
get(textBeforeCursor: string): string | undefined {
return this.cache.get(this.makeKey(textBeforeCursor))
}
set(textBeforeCursor: string, suggestion: string): void {
const key = this.makeKey(textBeforeCursor)
if (this.cache.size >= MAX_CACHE_SIZE) {
// Delete oldest entry
const firstKey = this.cache.keys().next().value
if (firstKey !== undefined) {
this.cache.delete(firstKey)
}
}
this.cache.set(key, suggestion)
}
}
Adding keyboard shortcuts
Tab-to-accept is the baseline. But power users expect more:
// Add to handleKeyDown in AutocompleteTextarea
const handleKeyDown = useCallback(
(e: KeyboardEvent<HTMLDivElement>) => {
if (e.key === 'Tab' && suggestion) {
e.preventDefault()
const accepted = acceptSuggestion()
onChange(value + accepted)
return
}
// Accept word-by-word with Ctrl+Right
if (e.key === 'ArrowRight' && e.ctrlKey && suggestion) {
e.preventDefault()
const nextWord = suggestion.match(/^\S+\s?/)
if (nextWord) {
const partial = nextWord[0]
const newValue = value + partial
onChange(newValue)
// Request a fresh suggestion for the remaining text — simpler than
// trimming the old suggestion locally, at the cost of another request
requestSuggestion(newValue, '')
}
return
}
if (e.key === 'Escape') {
cancelSuggestion()
}
},
[suggestion, acceptSuggestion, cancelSuggestion, value, onChange, requestSuggestion],
)
Ctrl+Right to accept word-by-word is the killer feature. Sometimes the AI suggests a full sentence but you only want the first few words. This lets you scrub through the suggestion incrementally.
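If you'd rather trim locally than re-request after each word accept, the word-splitting logic is easy to isolate in a pure helper (the name is ours):

```typescript
// Split a suggestion into its first word (plus one trailing space, if
// any) and the remainder, so partial accepts can be handled locally.
function splitFirstWord(suggestion: string): { word: string; rest: string } {
  const match = suggestion.match(/^\s*\S+\s?/)
  if (!match) return { word: suggestion, rest: '' }
  return { word: match[0], rest: suggestion.slice(match[0].length) }
}
```

On Ctrl+Right you'd append word to the value and keep rest as the remaining ghost text, avoiding a round trip per word.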
Putting it all together
Here's a minimal page using the component:
// src/pages/demo.tsx
import { useState } from 'react'
import { AutocompleteTextarea } from '../components/AutocompleteTextarea'
export default function DemoPage() {
const [emailBody, setEmailBody] = useState('')
return (
<main className="max-w-2xl mx-auto p-8">
<h1 className="text-2xl font-bold mb-4">AI Autocomplete Demo</h1>
<p className="text-gray-600 mb-6">
Start typing and pause — suggestions appear as ghost text.
Press <kbd className="px-1 py-0.5 bg-gray-100 rounded text-sm">Tab</kbd> to
accept, <kbd className="px-1 py-0.5 bg-gray-100 rounded text-sm">Ctrl+→</kbd> to
accept word-by-word, or{' '}
<kbd className="px-1 py-0.5 bg-gray-100 rounded text-sm">Esc</kbd> to dismiss.
</p>
<AutocompleteTextarea
value={emailBody}
onChange={setEmailBody}
placeholder="Write your email..."
context="This is a professional email reply"
className="bg-white"
/>
<p className="mt-4 text-sm text-gray-500">
{emailBody.length} characters
</p>
</main>
)
}
The context prop is what makes this reusable across your app. Pass "customer support reply" for a support tool, "git commit message" for a dev tool, or "product description" for a CMS. The LLM adapts its suggestions to the context without any prompt engineering on the consumer side.
When autocomplete goes wrong
A few things to watch out for in production:
Ghost text flicker. If the debounce is too short, suggestions appear and disappear rapidly. 300ms is a good starting point but bump it to 500ms if users complain.
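One low-effort mitigation is making the debounce a per-field-type setting rather than a hardcoded constant. The values below are starting points, not benchmarks:

```typescript
// Illustrative per-field debounce settings; tune against real usage.
const DEBOUNCE_MS: Record<string, number> = {
  chat: 300, // short, fast-paced inputs
  email: 400,
  longform: 500, // slower, deliberate typing
}

function debounceFor(fieldType: string): number {
  return DEBOUNCE_MS[fieldType] ?? 300
}
```

Consumers would then pass useAutocomplete({ debounceMs: debounceFor('email') }) instead of relying on the default.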
Stale suggestions after rapid typing. The AbortController pattern handles this, but test it aggressively. Type fast, pause, type fast again. If you ever see a suggestion that doesn't match your current text, your cancellation logic has a bug.
Content-editable cursor jumping. The browser can reset the cursor position when you modify contentEditable children. Always save and restore the selection after accepting a suggestion.
Accessibility. Screen readers need to know the suggestion is there but not part of the committed text. Use aria-live="polite" on the ghost text container and aria-label on the accept action.
What's next
This autocomplete component works great for single text fields, but what if you want AI assistance across your entire app? In the next post, we'll look at why most AI agent architectures are overengineered and when a simple completion endpoint like this one is all you actually need — no agents, no chains, no frameworks.