PocketAgent — local chat

A real LLM running entirely inside this browser tab on your own GPU, via WebGPU and WebLLM. Markdown answers, live tokens/sec, and your installed PocketAgent baked into the system prompt. Zero server, zero API key, nothing leaves the tab. The first load downloads the weights once; after that they're cached locally.
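The live tokens/sec figure is simple arithmetic: tokens received so far divided by elapsed time. A minimal sketch, assuming each streamed chunk counts as one token and timestamps are in milliseconds (for example from `performance.now()`); `tokensPerSecond` is a hypothetical helper, not a name taken from PocketAgent:

```typescript
// Hypothetical helper: throughput of a streaming reply.
// tokenCount: tokens (or streamed chunks) received so far.
// startMs / nowMs: timestamps in milliseconds, e.g. from performance.now().
function tokensPerSecond(tokenCount: number, startMs: number, nowMs: number): number {
  const elapsedSec = (nowMs - startMs) / 1000;
  // Avoid dividing by zero before any time has passed.
  if (elapsedSec <= 0) return 0;
  return tokenCount / elapsedSec;
}
```

Recomputing this on every streamed chunk gives a running average over the whole reply rather than an instantaneous rate.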

Model


First time? The model weights are fetched from the MLC WebLLM CDN and cached in your browser, so subsequent loads are local. Plan for a 0.3–2.4 GB download on first use. Needs WebGPU: Chrome/Edge on desktop, Safari 16.4+, or Firefox 113+ with a flag.
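The WebGPU requirement can be checked before any weights are downloaded. A minimal sketch, assuming the standard `navigator.gpu.requestAdapter()` entry point; the names `GpuLike`, `NavigatorLike`, `hasWebGPUApi`, and `checkWebGPU` are hypothetical and not taken from PocketAgent's source:

```typescript
// Minimal structural types so the sketch compiles without WebGPU typings.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}
interface NavigatorLike {
  gpu?: GpuLike;
}

type WebGPUStatus = "ready" | "no-api" | "no-adapter";

// Synchronous check: does the browser expose the WebGPU entry point at all?
function hasWebGPUApi(nav: NavigatorLike | undefined): boolean {
  const gpu = nav?.gpu;
  return gpu != null;
}

// Asynchronous check: the API may exist while no usable GPU adapter does.
async function checkWebGPU(nav: NavigatorLike | undefined): Promise<WebGPUStatus> {
  const gpu = nav?.gpu;
  if (gpu == null) return "no-api";
  // requestAdapter() resolves to null when no suitable adapter is available.
  const adapter = await gpu.requestAdapter();
  return adapter === null ? "no-adapter" : "ready";
}
```

In a page, both functions would be called with the global `navigator`. Taking it as a parameter keeps the sketch testable outside a browser.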

Active agent

(no agent installed)

Install an agent from the gallery, paste a #pa=… URL, or train your own on EML-Foundation.

Chat

system prompt sent with every message
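One way to read "sent with every message": the system prompt is prepended to the conversation on each request rather than stored once. A minimal sketch, assuming OpenAI-style `role`/`content` chat messages (the shape WebLLM's chat API accepts); `ChatMessage` and `buildRequestMessages` are hypothetical names, and the prompt text itself is whatever the active agent supplies:

```typescript
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Hypothetical helper: assemble the message list for one request.
function buildRequestMessages(
  systemPrompt: string,
  history: ChatMessage[],
  userInput: string,
): ChatMessage[] {
  // Drop any stale system entries so exactly one system message leads the request.
  const turns = history.filter((m) => m.role !== "system");
  return [
    { role: "system", content: systemPrompt },
    ...turns,
    { role: "user", content: userInput },
  ];
}
```

Rebuilding the list per request means that switching the active agent takes effect on the next message without clearing the chat history.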
