Getting started
browxai is an MCP server that gives an AI agent a curated browser-control surface. It runs over stdio and is driven by any MCP client.
Install

    npm install -g browxai
    npx playwright-core install chromium   # one-time, ~150 MB

A global install puts the browxai binary on your PATH, so an MCP client can
launch it by name (command: "browxai"). The binary is the MCP server on the
stdio transport.
Wire it into an MCP client
Add browxai to your client’s MCP server config. For example, an .mcp.json:

    {
      "mcpServers": {
        "browxai": {
          "command": "browxai"
        }
      }
    }

By default the server launches a managed Chromium with its own profile,
headed, with the default capability set (read, navigation, action,
human). Everything dangerous is opt-in.
Common environment variables
| Variable | Purpose |
|---|---|
| BROWX_WORKSPACE | Where all transient state lives (default ~/.browxai/). Never cwd. |
| BROWX_HEADLESS | 1 launches headless. |
| BROWX_CAPABILITIES | Comma-separated capability set. Add eval, network-body, clipboard, or file-io to opt into gated tools. |
| BROWX_ENGINE | Browser engine: chromium (default), firefox, webkit, android, or safari. Also --engine <kind>. |
| BROWX_ATTACH_CDP | Loopback CDP endpoint to attach to an existing Chrome (BYOB). |

See the tool reference for the full configuration surface. Capabilities are resolved once at server start, so changing them means restarting the server.
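
One way to set these variables is through the client's server entry rather than the shell. The sketch below builds an .mcp.json document in TypeScript. It assumes the MCP client accepts an `env` map next to `command`, which is common but is a property of the client, not of browxai. The helper name `browxaiConfig` is hypothetical, and the capability string is an assumed value: it lists the default set explicitly plus `eval`, since this page does not say whether the variable extends or replaces the defaults.

```typescript
// Hypothetical helper: builds an .mcp.json document that launches browxai
// with extra environment variables. The `env` key is an assumption about
// the MCP client's config format, not something browxai defines.
type ServerEntry = { command: string; env?: Record<string, string> };
type McpConfig = { mcpServers: Record<string, ServerEntry> };

function browxaiConfig(env: Record<string, string>): McpConfig {
  const entry: ServerEntry = { command: "browxai" };
  // Only emit `env` when there is something to set, so the result with no
  // variables matches the minimal config.
  if (Object.keys(env).length > 0) entry.env = { ...env };
  return { mcpServers: { browxai: entry } };
}

const config = browxaiConfig({
  BROWX_HEADLESS: "1",
  // Assumed value: the default capability set plus the gated `eval`.
  BROWX_CAPABILITIES: "read,navigation,action,human,eval",
});

// Strict JSON output, suitable for writing to .mcp.json.
console.log(JSON.stringify(config, null, 2));
```

Because capabilities are resolved at server start, a change to this file only takes effect once the client relaunches the server.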
A first flow
A typical agent loop:

1. navigate to a URL.
2. snapshot to get the accessibility tree plus DOM-walk; every node has a stable [ref=eN].
3. find to describe the target in natural language; get ranked candidates with a stability flag, an actionable verdict, and a visible-rect bbox.
4. click / fill / … to act by ref; each returns a structured ActionResult describing what navigated, what structure changed, and a console/network slice.

For verification use text_search, inspect, and the read tools; for
flaky or transient UI use wait_for, sample, and act_and_sample.
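
To make the loop concrete, here is a rough sketch of the JSON-RPC 2.0 messages an MCP client might send for it. The `tools/call` method with `{ name, arguments }` params comes from the MCP protocol. The argument names (`url`, `query`, `ref`, `value`), the URL, and the ref `e12` are illustrative guesses, not browxai's documented schema, so check the tool reference for the real inputs. A real session also begins with the MCP initialize handshake, which a client library normally handles.

```typescript
// Shape of an MCP "tools/call" request as a JSON-RPC 2.0 message.
type ToolCall = {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
};

let nextId = 1;

// Hypothetical helper: wraps a tool name and its arguments in a request
// with a fresh, increasing id.
function toolCall(name: string, args: Record<string, unknown> = {}): ToolCall {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Argument names below are assumptions for illustration only.
const flow: ToolCall[] = [
  toolCall("navigate", { url: "https://example.com/login" }),
  toolCall("snapshot"),
  toolCall("find", { query: "the email input" }),
  // "e12" stands in for a ref taken from the snapshot or find result.
  toolCall("fill", { ref: "e12", value: "user@example.com" }),
];

// The stdio transport carries one JSON message per line.
for (const msg of flow) console.log(JSON.stringify(msg));
```

In practice the agent reads each result before building the next call, since the ref passed to fill comes out of the snapshot or find response.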
Where to go next
- Tool reference: every tool, its inputs and outputs, example calls, the configuration and session model, and the stability policy.
- Agent guidance: the reach-for-this-not-that map, covering the footguns agents hit and the curated tool that avoids each one.
- Security and threat model: the capability model, what browxai defends against, and what it explicitly does not.