How to Build an AI-Powered Autocomplete for Any Text Input
Friday, 27 March 2026
11 min read

You're typing in a text field and a faint suggestion appears ahead of your cursor — press Tab to accept, keep typing to ignore. GitHub Copilot made this interaction feel magical, and now your users expect it everywhere. But building AI-powered autocomplete for a regular textarea is surprisingly tricky: you need to manage streaming, debouncing, request cancellation, and ghost text rendering without turning your input into a laggy mess.
This tutorial builds a reusable React component that streams AI autocomplete suggestions as inline ghost text for any text input. It works with any LLM provider — we'll use the Vercel AI SDK for the streaming layer and Claude as the model, but swapping to OpenAI or a local model is a one-line change.
The architecture
Here's what happens on every keystroke (or rather, after a debounced pause):
- User stops typing for 300ms
- We send the current text + cursor position to an API route
- The API streams back a completion token-by-token
- Ghost text appears inline after the cursor
- User presses Tab to accept or keeps typing to dismiss
The key insight: we don't render the ghost text in a separate overlay. We use a single <div contentEditable> with a <span> for the suggestion styled with reduced opacity. This avoids the positioning nightmare of overlaying elements on a textarea.
Install the two dependencies — the Anthropic provider reads your ANTHROPIC_API_KEY from the environment:
pnpm install ai @ai-sdk/anthropic
The API route
The backend is a Next.js API route that takes the text before the cursor and streams a completion.
// src/pages/api/autocomplete.ts
import { streamText } from 'ai'
import { anthropic } from '@ai-sdk/anthropic'
import type { NextRequest } from 'next/server'
export const config = { runtime: 'edge' }
export default async function handler(req: NextRequest) {
if (req.method !== 'POST') {
return new Response('Method not allowed', { status: 405 })
}
const { textBeforeCursor, textAfterCursor, context } = await req.json() as {
textBeforeCursor: string
textAfterCursor: string
context?: string
}
if (!textBeforeCursor || textBeforeCursor.trim().length < 5) {
return new Response('', { status: 200 })
}
const result = streamText({
model: anthropic('claude-sonnet-4-5-20250929'),
maxTokens: 60,
temperature: 0.3,
system: `You are an inline text autocomplete engine. Given the text before and after the cursor, predict what the user is about to type next. Rules:
- Output ONLY the predicted text, nothing else
- Keep suggestions short (1-2 sentences max)
- Match the tone and style of the existing text
- If the text after the cursor already continues naturally, return empty
- Do not repeat text that already exists before the cursor
${context ? `Context about this text field: ${context}` : ''}`,
prompt: `Text before cursor: "${textBeforeCursor}"${textAfterCursor ? `\nText after cursor: "${textAfterCursor}"` : ''}`,
})
return result.toDataStreamResponse()
}
A few things worth noting:
- maxTokens: 60 keeps suggestions short. Nobody wants a wall of ghost text.
- temperature: 0.3 makes completions predictable. Autocomplete should feel obvious, not creative.
- We return early if the text is too short — no point suggesting after "Hi".
- The context parameter lets you hint at what the field is for (e.g., "This is a customer support email reply").
The ghost text hook
This is where the real complexity lives. We need to manage debouncing, AbortController for cancellation, and the streaming state.
// src/hooks/useAutocomplete.ts
import { useCallback, useRef, useState } from 'react'
interface AutocompleteOptions {
debounceMs?: number
context?: string
enabled?: boolean
}
interface AutocompleteState {
suggestion: string
isLoading: boolean
}
export function useAutocomplete(options: AutocompleteOptions = {}) {
const { debounceMs = 300, context, enabled = true } = options
const [state, setState] = useState<AutocompleteState>({
suggestion: '',
isLoading: false,
})
const abortControllerRef = useRef<AbortController | null>(null)
const debounceTimerRef = useRef<ReturnType<typeof setTimeout> | null>(null)
const cancelSuggestion = useCallback(() => {
if (debounceTimerRef.current) {
clearTimeout(debounceTimerRef.current)
debounceTimerRef.current = null
}
if (abortControllerRef.current) {
abortControllerRef.current.abort()
abortControllerRef.current = null
}
setState({ suggestion: '', isLoading: false })
}, [])
const requestSuggestion = useCallback(
(textBeforeCursor: string, textAfterCursor: string) => {
if (!enabled) return
// Cancel any in-flight request
cancelSuggestion()
debounceTimerRef.current = setTimeout(async () => {
const controller = new AbortController()
abortControllerRef.current = controller
setState((prev) => ({ ...prev, isLoading: true }))
try {
const response = await fetch('/api/autocomplete', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ textBeforeCursor, textAfterCursor, context }),
signal: controller.signal,
})
if (!response.ok || !response.body) {
setState({ suggestion: '', isLoading: false })
return
}
const reader = response.body.getReader()
const decoder = new TextDecoder()
let fullSuggestion = ''
let buffer = ''
while (true) {
const { done, value } = await reader.read()
if (done) break
buffer += decoder.decode(value, { stream: true })
// Parse the Vercel AI SDK data stream format. A line can be split
// across network chunks, so keep the trailing partial line in the
// buffer until the next chunk arrives
const lines = buffer.split('\n')
buffer = lines.pop() ?? ''
for (const line of lines) {
// Text parts in the data stream start with '0:'
if (line.startsWith('0:')) {
const text = JSON.parse(line.slice(2)) as string
fullSuggestion += text
setState({ suggestion: fullSuggestion, isLoading: false })
}
}
}
} catch (err: unknown) {
if (err instanceof Error && err.name !== 'AbortError') {
console.error('Autocomplete error:', err)
}
setState({ suggestion: '', isLoading: false })
}
}, debounceMs)
},
[enabled, debounceMs, context, cancelSuggestion],
)
const acceptSuggestion = useCallback(() => {
const accepted = state.suggestion
setState({ suggestion: '', isLoading: false })
return accepted
}, [state.suggestion])
return {
suggestion: state.suggestion,
isLoading: state.isLoading,
requestSuggestion,
cancelSuggestion,
acceptSuggestion,
}
}
The critical pattern here is cancel-before-request. Every new keystroke cancels the previous debounce timer AND aborts any in-flight fetch. Without this, you get stale suggestions appearing after the user has moved on.
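To see the pattern in isolation, here's a minimal framework-free sketch (the latestWins name is ours, not part of the hook): each call aborts the previous one, and a result is discarded if it finishes after a newer call has started.

```typescript
// Cancel-before-request in isolation: each new call aborts the previous
// in-flight one, and results from superseded calls are discarded.
function latestWins<T>() {
  let controller: AbortController | null = null
  return async (
    work: (signal: AbortSignal) => Promise<T>,
  ): Promise<T | undefined> => {
    // Abort whatever is still running before starting the new call
    controller?.abort()
    const mine = new AbortController()
    controller = mine
    try {
      const result = await work(mine.signal)
      // If a newer call aborted us while we awaited, drop the result
      return mine.signal.aborted ? undefined : result
    } catch {
      return undefined
    }
  }
}
```

In the hook, cancelSuggestion plays the role of the `controller?.abort()` line, and passing the signal to fetch wires the abort into the network layer.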
The autocomplete textarea component
Now we combine the hook with a contentEditable div that renders ghost text inline.
// src/components/AutocompleteTextarea.tsx
import React, { useRef, useCallback, KeyboardEvent, FormEvent } from 'react'
import { useAutocomplete } from '../hooks/useAutocomplete'
interface AutocompleteTextareaProps {
value: string
onChange: (value: string) => void
placeholder?: string
context?: string
className?: string
}
export function AutocompleteTextarea({
value,
onChange,
placeholder = 'Start typing...',
context,
className = '',
}: AutocompleteTextareaProps) {
const editableRef = useRef<HTMLDivElement>(null)
const {
suggestion,
isLoading,
requestSuggestion,
cancelSuggestion,
acceptSuggestion,
} = useAutocomplete({ context })
const getTextAroundCursor = useCallback((): {
before: string
after: string
} => {
const selection = window.getSelection()
if (!selection || !selection.rangeCount || !editableRef.current) {
return { before: value, after: '' }
}
const range = selection.getRangeAt(0)
const preRange = document.createRange()
preRange.setStart(editableRef.current, 0)
preRange.setEnd(range.startContainer, range.startOffset)
const before = preRange.toString()
const after = value.slice(before.length)
return { before, after }
}, [value])
const handleInput = useCallback(
(e: FormEvent<HTMLDivElement>) => {
const target = e.currentTarget
// Remove any ghost text spans before reading content
const ghost = target.querySelector('[data-ghost]')
if (ghost) ghost.remove()
const newValue = target.textContent ?? ''
onChange(newValue)
const { before, after } = getTextAroundCursor()
requestSuggestion(before, after)
},
[onChange, getTextAroundCursor, requestSuggestion],
)
const handleKeyDown = useCallback(
(e: KeyboardEvent<HTMLDivElement>) => {
if (e.key === 'Tab' && suggestion) {
e.preventDefault()
const accepted = acceptSuggestion()
const newValue = value + accepted
onChange(newValue)
// Update the contentEditable and place cursor at end
if (editableRef.current) {
editableRef.current.textContent = newValue
const range = document.createRange()
const selection = window.getSelection()
range.selectNodeContents(editableRef.current)
range.collapse(false)
selection?.removeAllRanges()
selection?.addRange(range)
}
}
if (e.key === 'Escape') {
cancelSuggestion()
}
},
[suggestion, acceptSuggestion, cancelSuggestion, value, onChange],
)
return (
<div className="relative">
<div
ref={editableRef}
contentEditable
suppressContentEditableWarning
role="textbox"
aria-placeholder={placeholder}
aria-label="Text input with AI autocomplete"
className={`min-h-[120px] p-3 border rounded-lg focus:outline-none focus:ring-2
focus:ring-blue-500 whitespace-pre-wrap break-words ${className}`}
onInput={handleInput}
onKeyDown={handleKeyDown}
onBlur={cancelSuggestion}
/>
{suggestion && (
<GhostOverlay
containerRef={editableRef}
suggestion={suggestion}
/>
)}
{isLoading && (
<span className="absolute top-2 right-2 text-xs text-gray-400">
thinking...
</span>
)}
{!value && !suggestion && (
<div className="absolute top-3 left-3 text-gray-400 pointer-events-none">
{placeholder}
</div>
)}
</div>
)
}
function GhostOverlay({
containerRef,
suggestion,
}: {
containerRef: React.RefObject<HTMLDivElement>
suggestion: string
}) {
// DOM mutation is a side effect, so it belongs in useEffect —
// React may invoke the render body more than once per commit
React.useEffect(() => {
const container = containerRef.current
if (!container) return
// Replace any stale ghost span with a fresh one
container.querySelector('[data-ghost]')?.remove()
const ghostSpan = document.createElement('span')
ghostSpan.setAttribute('data-ghost', 'true')
ghostSpan.textContent = suggestion
ghostSpan.style.opacity = '0.4'
ghostSpan.style.pointerEvents = 'none'
ghostSpan.contentEditable = 'false'
// Insert the ghost span at the current cursor position
const selection = window.getSelection()
if (selection && selection.rangeCount > 0) {
const range = selection.getRangeAt(0)
range.insertNode(ghostSpan)
range.setStartAfter(ghostSpan)
range.collapse(true)
}
// Clean up when the suggestion changes or the overlay unmounts
return () => ghostSpan.remove()
}, [containerRef, suggestion])
return null
}
Why contentEditable instead of a regular <textarea>? Because textareas are opaque — you can't render styled inline content inside them. With contentEditable, we insert a ghost <span> directly at the cursor position. The tradeoff is that contentEditable is notoriously finicky, but for this use case the complexity is manageable.
Gotcha: cleaning up ghost text on input
The trickiest bug you'll hit is ghost text contaminating the actual value. When the user types, the browser treats the ghost span as part of the content. That's why handleInput removes the ghost span before reading textContent. Without this, you'll end up with ghost text permanently baked into the value.
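Since the ghost text is always a suffix of what textContent returns, you can also add a belt-and-braces check at the string level alongside the DOM-level removal in handleInput (the helper name is ours):

```typescript
// Remove a trailing ghost suggestion from raw textContent, if present,
// so the suggestion never leaks into the committed value.
function stripGhostSuffix(raw: string, suggestion: string): string {
  if (suggestion && raw.endsWith(suggestion)) {
    return raw.slice(0, raw.length - suggestion.length)
  }
  return raw
}
```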
Optimizing token costs
Autocomplete fires constantly. If you're not careful, you'll burn through API tokens fast. Here's what helps:
// src/hooks/useAutocomplete.ts — add to requestSuggestion
const MIN_TEXT_LENGTH = 10
const MAX_CONTEXT_CHARS = 500
// Inside the debounce callback, before the fetch:
if (textBeforeCursor.trim().length < MIN_TEXT_LENGTH) {
setState({ suggestion: '', isLoading: false })
return
}
// Only send the last N characters as context
const trimmedBefore = textBeforeCursor.slice(-MAX_CONTEXT_CHARS)
// ...and send trimmedBefore in the fetch body instead of the full text
Other cost-saving strategies:
- Increase debounce to 500ms for fields where users type slowly (like long-form content)
- Use a smaller model — claude-haiku-4-5-20251001 works great for short completions and costs a fraction of Sonnet
- Cache completions — if the user deletes a word and retypes it, serve the cached suggestion. A simple Map<string, string> with the last 50 entries works fine
- Skip when cursor isn't at the end — if the user is editing in the middle of text, autocomplete is usually more annoying than helpful
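The cursor-at-end check from the last bullet is tiny, but worth naming as a helper (the name is ours) so it can gate requestSuggestion:

```typescript
// Only offer suggestions when the cursor is effectively at the end:
// everything after it is whitespace.
function cursorAtEnd(textAfterCursor: string): boolean {
  return textAfterCursor.trim().length === 0
}
```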
Here's a quick cache implementation:
// src/lib/suggestion-cache.ts
const MAX_CACHE_SIZE = 50
export class SuggestionCache {
private cache = new Map<string, string>()
private makeKey(text: string): string {
// Use last 100 chars as cache key for locality
return text.slice(-100).trim()
}
get(textBeforeCursor: string): string | undefined {
return this.cache.get(this.makeKey(textBeforeCursor))
}
set(textBeforeCursor: string, suggestion: string): void {
const key = this.makeKey(textBeforeCursor)
if (this.cache.size >= MAX_CACHE_SIZE) {
// Delete oldest entry
const firstKey = this.cache.keys().next().value
if (firstKey !== undefined) {
this.cache.delete(firstKey)
}
}
this.cache.set(key, suggestion)
}
}
Adding keyboard shortcuts
Tab-to-accept is the baseline. But power users expect more:
// Add to handleKeyDown in AutocompleteTextarea
const handleKeyDown = useCallback(
(e: KeyboardEvent<HTMLDivElement>) => {
if (e.key === 'Tab' && suggestion) {
e.preventDefault()
const accepted = acceptSuggestion()
onChange(value + accepted)
return
}
// Accept word-by-word with Ctrl+Right
if (e.key === 'ArrowRight' && e.ctrlKey && suggestion) {
e.preventDefault()
const nextWord = suggestion.match(/^\S+\s?/)
if (nextWord) {
const partial = nextWord[0]
const newValue = value + partial
onChange(newValue)
// Request a fresh suggestion for the remaining text — simpler than
// trimming the old suggestion locally, at the cost of another request
requestSuggestion(newValue, '')
}
return
}
if (e.key === 'Escape') {
cancelSuggestion()
}
},
[suggestion, acceptSuggestion, cancelSuggestion, value, onChange, requestSuggestion],
)
Ctrl+Right to accept word-by-word is the killer feature. Sometimes the AI suggests a full sentence but you only want the first few words. This lets you scrub through the suggestion incrementally.
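If you'd rather trim locally than re-request after each word accept, the word-splitting logic is easy to isolate in a pure helper (the name is ours):

```typescript
// Split a suggestion into its first word (plus one trailing space, if
// any) and the remainder, so partial accepts can be handled locally.
function splitFirstWord(suggestion: string): { word: string; rest: string } {
  const match = suggestion.match(/^\s*\S+\s?/)
  if (!match) return { word: suggestion, rest: '' }
  return { word: match[0], rest: suggestion.slice(match[0].length) }
}
```

On Ctrl+Right you'd append word to the value and keep rest as the remaining ghost text, avoiding a round trip per word.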
Putting it all together
Here's a minimal page using the component:
// src/pages/demo.tsx
import { useState } from 'react'
import { AutocompleteTextarea } from '../components/AutocompleteTextarea'
export default function DemoPage() {
const [emailBody, setEmailBody] = useState('')
return (
<main className="max-w-2xl mx-auto p-8">
<h1 className="text-2xl font-bold mb-4">AI Autocomplete Demo</h1>
<p className="text-gray-600 mb-6">
Start typing and pause — suggestions appear as ghost text.
Press <kbd className="px-1 py-0.5 bg-gray-100 rounded text-sm">Tab</kbd> to
accept, <kbd className="px-1 py-0.5 bg-gray-100 rounded text-sm">Ctrl+→</kbd> to
accept word-by-word, or{' '}
<kbd className="px-1 py-0.5 bg-gray-100 rounded text-sm">Esc</kbd> to dismiss.
</p>
<AutocompleteTextarea
value={emailBody}
onChange={setEmailBody}
placeholder="Write your email..."
context="This is a professional email reply"
className="bg-white"
/>
<p className="mt-4 text-sm text-gray-500">
{emailBody.length} characters
</p>
</main>
)
}
The context prop is what makes this reusable across your app. Pass "customer support reply" for a support tool, "git commit message" for a dev tool, or "product description" for a CMS. The LLM adapts its suggestions to the context without any prompt engineering on the consumer side.
When autocomplete goes wrong
A few things to watch out for in production:
Ghost text flicker. If the debounce is too short, suggestions appear and disappear rapidly. 300ms is a good starting point but bump it to 500ms if users complain.
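One low-effort mitigation is making the debounce a per-field-type setting rather than a hardcoded constant. The values below are starting points, not benchmarks:

```typescript
// Illustrative per-field debounce settings; tune against real usage.
const DEBOUNCE_MS: Record<string, number> = {
  chat: 300, // short, fast-paced inputs
  email: 400,
  longform: 500, // slower, deliberate typing
}

function debounceFor(fieldType: string): number {
  return DEBOUNCE_MS[fieldType] ?? 300
}
```

Consumers would then pass useAutocomplete({ debounceMs: debounceFor('email') }) instead of relying on the default.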
Stale suggestions after rapid typing. The AbortController pattern handles this, but test it aggressively. Type fast, pause, type fast again. If you ever see a suggestion that doesn't match your current text, your cancellation logic has a bug.
Content-editable cursor jumping. The browser can reset the cursor position when you modify contentEditable children. Always save and restore the selection after accepting a suggestion.
Accessibility. Screen readers need to know the suggestion is there but not part of the committed text. Use aria-live="polite" on the ghost text container and aria-label on the accept action.
What's next
This autocomplete component works great for single text fields, but what if you want AI assistance across your entire app? In the next post, we'll look at why most AI agent architectures are overengineered and when a simple completion endpoint like this one is all you actually need — no agents, no chains, no frameworks.