
Browser Use MCP
AI-powered browser automation agent with natural language control
Browser Use MCP Server is an AI-powered browser automation tool that pairs OpenAI's GPT-4o with a real Chromium browser. Instead of writing automation scripts, you describe what you want in natural language, for example "go to this site and fill out the contact form", and the AI agent works out how to do it. The server exposes two MCP tools: one for sending tasks and one for retrieving results.

Why Browser Use MCP?
Traditional browser automation requires writing and maintaining brittle scripts. When a website changes its layout, selectors break and scripts fail, so building robust automation for dynamic pages takes significant engineering effort. What is needed is an approach that understands pages the way a human does: visually and contextually.
How It Works
Browser Use combines an AI vision model (GPT-4o) with a real browser. You send a natural language task via MCP, and the AI agent sees the page, decides what actions to take, and executes them step by step. It handles navigation, clicking, typing, and form filling autonomously. The agent adapts to page changes without script updates.
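The see, decide, act cycle described above can be pictured as a simple loop. The sketch below is purely conceptual and is not the server's actual implementation: the Action type and the observe, decide, and execute callables are hypothetical stand-ins for the screenshot capture, the GPT-4o call, and the browser driver.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    """Hypothetical description of one browser action."""
    kind: str          # e.g. "navigate", "click", "type", or "done"
    target: str = ""   # element or URL the action applies to
    text: str = ""     # text to type, if any


def run_agent(
    task: str,
    observe: Callable[[], str],
    decide: Callable[[str, str], Action],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> bool:
    """Repeat observe -> decide -> execute until the model says "done".

    Returns True if the task finished within max_steps, else False.
    """
    for _ in range(max_steps):
        page = observe()             # what the agent currently sees
        action = decide(task, page)  # the model picks the next action
        if action.kind == "done":
            return True
        execute(action)              # apply the action in the browser
    return False


# Scripted stand-ins so the loop can run without a browser or an API key.
script = iter([
    Action("navigate", "https://example.com/contact"),
    Action("type", "name field", "Ada"),
    Action("click", "Submit"),
    Action("done"),
])
log = []
finished = run_agent(
    "Fill out the contact form",
    observe=lambda: "<screenshot placeholder>",
    decide=lambda task, page: next(script),
    execute=log.append,
)
print(finished, [a.kind for a in log])
```

The max_steps cap plays the same role as a per-task step limit: the loop stops even if the model never reports completion.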
What Is Browser Use MCP?
Browser Use MCP Server is an AI browser automation agent. It provides two MCP tools: browser_use (send a URL and an action description) and browser_get_result (poll for task completion). It runs in Docker with VNC access for visual monitoring and requires an OpenAI API key for GPT-4o vision. It is MIT licensed.
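To make the two tools concrete, here is a minimal sketch of the messages a client might send. MCP tool invocations are JSON-RPC 2.0 requests with the method tools/call; the tool names come from the description above, but the argument names (url, action, task_id) and the task ID value are assumptions, so check the server's published tool schema before relying on them.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need unique ids


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names below are assumed for illustration, not confirmed.
start = tool_call(
    "browser_use",
    {"url": "https://example.com/contact", "action": "Fill out the contact form"},
)
check = tool_call("browser_get_result", {"task_id": "hypothetical-task-id"})

print(json.dumps(start, indent=2))
print(json.dumps(check, indent=2))
```

In practice an MCP client library builds these messages for you; the sketch only shows the shape of the exchange.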
Key Benefits
Why teams choose Browser Use MCP
Natural Language
Describe tasks in plain English instead of writing automation scripts. The AI figures out the clicks and typing.
AI Vision
Uses GPT-4o to see and understand web pages visually, just like a human would.
Self-Healing
No brittle CSS selectors. When page layouts change, the AI adapts with no script updates needed.
VNC Monitoring
Watch the AI work in real-time through VNC. See exactly what the agent sees and does.
MCP Compatible
The standard MCP interface works with Claude, GPT, and any other MCP-compatible AI assistant.
Async Tasks
Send tasks and poll for results. Run multiple browser automations in parallel.
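The send-then-poll pattern can be sketched as a small helper. This is an illustration under assumptions: the fetch callable stands in for a browser_get_result call, and the status field with its "running", "completed", and "failed" values is a hypothetical response shape, not the documented one.

```python
import time
from typing import Callable, Dict


def poll_until_done(
    fetch: Callable[[], Dict],
    interval: float = 2.0,
    timeout: float = 120.0,
) -> Dict:
    """Call fetch() repeatedly until the task reports a final status.

    Raises TimeoutError if no final status arrives before the timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        # Assumed status values; adjust to the server's real response.
        if result.get("status") in ("completed", "failed"):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("task did not finish before the timeout")
        time.sleep(interval)


# Canned responses standing in for repeated browser_get_result calls.
responses = iter([
    {"status": "running"},
    {"status": "running"},
    {"status": "completed", "result": "Form submitted"},
])
final = poll_until_done(lambda: next(responses), interval=0.01)
print(final["status"], final["result"])
```

Because each task is tracked separately, several tasks can be polled side by side, for example one helper call per thread.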
Features
Everything you need to build with Browser Use MCP
Task Execution
Send a URL and natural language instruction. The AI completes the task autonomously.
Result Polling
Async task model: send a task, get a task ID, then poll for completion.
Patient Mode
With PATIENT=true, the call waits synchronously for the task to complete instead of returning a task ID.
VNC Access
Port 5900 provides VNC access to watch the browser in real-time.
Step Control
Configure MAX_AGENT_STEPS to limit how many actions the AI takes per task.
SSE Transport
MCP server on port 8000 with Server-Sent Events transport.
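The settings above can be combined into a single container launch. The sketch below only assembles a docker run command. The image name is a placeholder, and passing the key as OPENAI_API_KEY is an assumption based on the usual OpenAI convention. The ports and the PATIENT and MAX_AGENT_STEPS variables come from the feature list.

```python
import shlex
from typing import List, Optional


def docker_run_args(
    image: str,
    openai_api_key: str,
    patient: bool = False,
    max_agent_steps: Optional[int] = None,
) -> List[str]:
    """Assemble a `docker run` command line for the server container."""
    args = [
        "docker", "run", "--rm",
        "-p", "8000:8000",  # MCP server (SSE transport)
        "-p", "5900:5900",  # VNC access to watch the browser
        "-e", f"OPENAI_API_KEY={openai_api_key}",  # assumed variable name
    ]
    if patient:
        args += ["-e", "PATIENT=true"]  # wait for completion synchronously
    if max_agent_steps is not None:
        args += ["-e", f"MAX_AGENT_STEPS={max_agent_steps}"]  # cap actions per task
    args.append(image)
    return args


# "example/browser-use-mcp:latest" is a hypothetical image name.
cmd = docker_run_args(
    "example/browser-use-mcp:latest",
    "sk-your-key",
    patient=True,
    max_agent_steps=15,
)
print(shlex.join(cmd))
```

For real deployments, consider supplying the API key through an env file or a secret store rather than the command line, where it can show up in shell history.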
Use Cases
What you can build with Browser Use MCP
Technology Stack
Ready to deploy Browser Use MCP?
Get started in minutes. Deploy on your own infrastructure at actual cloud cost. No markup, no vendor lock-in.