PocketAgent — local chat

A real LLM running entirely inside this browser tab on your own GPU, via WebGPU and WebLLM. Answers render as Markdown with a live tokens/sec readout, and your installed PocketAgent (if any) is baked into the system prompt. There is no inference server and no API key: your prompts and the model's replies never leave the tab. The first load downloads the weights once; after that they are cached locally.

Model

First time? The model weights are fetched from the MLC WebLLM CDN and cached in your browser, so subsequent loads are local. Plan for a 0.3–2.4 GB download on first use. Requires WebGPU: Chrome or Edge on desktop, Safari 18+, or a recent Firefox (older versions need it enabled behind a flag).
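The WebGPU requirement above can be checked before any weights are downloaded. The following is a minimal sketch, not this page's actual code: the names `NavigatorLike`, `hasWebGpuApi`, and `probeWebGpu` are hypothetical, and the only real browser API it relies on is `navigator.gpu.requestAdapter()`.

```typescript
// Minimal structural types so the sketch compiles without WebGPU type packages.
// (Hypothetical names; a real page would likely use the official WebGPU typings.)
interface GpuLike {
  requestAdapter(): Promise<unknown | null>;
}
interface NavigatorLike {
  gpu?: GpuLike;
}

type WebGpuStatus = "available" | "no-api" | "no-adapter";

// Synchronous check: does the browser expose the WebGPU entry point at all?
function hasWebGpuApi(nav: NavigatorLike): boolean {
  return typeof nav.gpu === "object" && nav.gpu !== null;
}

// Full probe: the API can exist while no usable adapter is offered
// (for example, a blocklisted driver), so the adapter is requested too.
async function probeWebGpu(nav: NavigatorLike): Promise<WebGpuStatus> {
  if (!hasWebGpuApi(nav)) return "no-api";
  try {
    const adapter = await nav.gpu!.requestAdapter();
    return adapter ? "available" : "no-adapter";
  } catch {
    // Treat a rejected request the same as having no adapter.
    return "no-adapter";
  }
}
```

In a browser this would be called as `probeWebGpu(navigator as unknown as NavigatorLike)`. Taking the navigator as a parameter keeps the check testable outside a browser.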

Active agent

Install an agent from the gallery, paste a #pa=… URL, or train your own on EML-Foundation.

system prompt sent with every message
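One way to send the system prompt with every message is to rebuild the message list on each turn, with the agent's prompt placed first. The sketch below assumes the `role`/`content` message shape used by OpenAI-style chat APIs such as WebLLM's. `BASE_PROMPT`, `buildSystemPrompt`, and `buildMessages` are hypothetical names for illustration, not the page's actual implementation.

```typescript
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Hypothetical default prompt, used when no agent is installed.
const BASE_PROMPT = "You are a helpful assistant running locally in the browser.";

// Append the installed agent's prompt (if any) to the base prompt.
function buildSystemPrompt(agentPrompt?: string): string {
  const extra = agentPrompt?.trim();
  return extra ? `${BASE_PROMPT}\n\n${extra}` : BASE_PROMPT;
}

// Build the request payload for one turn: exactly one system message first,
// then the prior user/assistant turns, then the new user message.
function buildMessages(
  history: ChatMessage[],
  userText: string,
  agentPrompt?: string,
): ChatMessage[] {
  // Drop any stale system messages so the current agent prompt always wins.
  const turns = history.filter((m) => m.role !== "system");
  return [
    { role: "system", content: buildSystemPrompt(agentPrompt) },
    ...turns,
    { role: "user", content: userText },
  ];
}
```

Rebuilding the list per request means that switching agents mid-conversation takes effect on the next message without editing the stored history.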
