SiteMCP
Fetch an entire site and use it as an MCP Server
https://github.com/user-attachments/assets/ebe2d7c6-4ddc-4a37-8e1e-d80fac49d8ae
Demo in Japanese
https://github.com/user-attachments/assets/24288140-be2a-416c-9e7c-c49be056a373
Install
One-off usage (choose one of the following):
bunx sitemcp
npx sitemcp
pnpx sitemcp
Install globally (choose one of the following):
bun i -g sitemcp
npm i -g sitemcp
pnpm i -g sitemcp
Usage
sitemcp https://daisyui.com
# or with higher concurrency
sitemcp https://daisyui.com --concurrency 10
Tool Name Strategy
Use -t, --tool-name-strategy to choose how tool names are derived from the site URL (default: domain).
The derived name is also used as the MCP server name.
sitemcp https://vite.dev -t domain # indexOfVite / getDocumentOfVite
sitemcp https://react-tweet.vercel.app/ -t subdomain # indexOfReactTweet / getDocumentOfReactTweet
sitemcp https://ryoppippi.github.io/vite-plugin-favicons/ -t pathname # indexOfVitePluginFavicons / getDocumentOfVitePluginFavicons
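The naming pattern in the examples above can be sketched roughly as follows. This is only an illustration that reproduces the three documented examples; the function names (`pascalCase`, `toolNames`) and the exact rules for picking the name segment are assumptions, not sitemcp's actual implementation.

```typescript
type Strategy = "domain" | "subdomain" | "pathname";

// Convert text such as "react-tweet" into PascalCase ("ReactTweet").
function pascalCase(text: string): string {
  return text
    .split(/[^a-zA-Z0-9]+/)
    .filter((part) => part.length > 0)
    .map((part) => part.charAt(0).toUpperCase() + part.slice(1))
    .join("");
}

// Hypothetical helper: derive tool names from a site URL.
// Assumed rules (chosen only to match the documented examples):
//   domain    -> the label just before the top-level domain
//   subdomain -> the first hostname label
//   pathname  -> the first path segment
function toolNames(siteUrl: string, strategy: Strategy) {
  const url = new URL(siteUrl);
  const labels = url.hostname.split(".");
  let base: string;
  if (strategy === "subdomain") {
    base = labels[0] ?? "";
  } else if (strategy === "pathname") {
    base = url.pathname.split("/").filter(Boolean)[0] ?? "";
  } else {
    base = labels[labels.length - 2] ?? labels[0] ?? "";
  }
  const suffix = pascalCase(base);
  return { index: `indexOf${suffix}`, getDocument: `getDocumentOf${suffix}` };
}

console.log(toolNames("https://vite.dev", "domain"));
console.log(toolNames("https://react-tweet.vercel.app/", "subdomain"));
console.log(toolNames("https://ryoppippi.github.io/vite-plugin-favicons/", "pathname"));
```

Picking a strategy that yields a distinct name matters when several sites share a host (for example, multiple projects under github.io), since otherwise their tools would collide in the MCP client.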
Max Length of Content
Use -l, --max-length to set the maximum content length in characters (default: 2000).
This is useful for sites with long content, such as blogs or documentation.
The acceptable content length depends on the MCP client you are using, so please check the documentation of your MCP client for more details.
Feel free to open an issue if you have any questions.
sitemcp https://vite.dev -l 10000
Match specific pages
Use the -m, --match flag to specify the pages you want to fetch:
sitemcp https://vite.dev -m "/blog/**" -m "/guide/**"
Match patterns are tested against the pathname of each target page using micromatch; see the micromatch documentation for all supported matching features.
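To make "tested against the pathname" concrete, here is a minimal sketch of the idea. It deliberately does not use micromatch: `globToRegExp` is a simplified stand-in that understands only `*` and `**`, and both it and `shouldFetch` are hypothetical names. Real micromatch supports many more features and may differ on edge cases.

```typescript
// Simplified stand-in for micromatch: supports only "*" (within one path
// segment) and "**" (across segments). Not the library's real algorithm.
function globToRegExp(pattern: string): RegExp {
  // Escape regex metacharacters other than "*".
  const escaped = pattern.replace(/[.+^${}()|[\]\\?]/g, "\\$&");
  const source = escaped
    .replace(/\*\*/g, "\u0000") // temporarily mark "**"
    .replace(/\*/g, "[^/]*") // "*" stays inside one segment
    .replace(/\u0000/g, ".*"); // "**" may span segments
  return new RegExp(`^${source}$`);
}

// Only the pathname is compared; the host and query string are ignored.
// Assumption: with no patterns, every page is considered a match.
function shouldFetch(pageUrl: string, patterns: string[]): boolean {
  const { pathname } = new URL(pageUrl);
  if (patterns.length === 0) return true;
  return patterns.some((pattern) => globToRegExp(pattern).test(pathname));
}

const patterns = ["/blog/**", "/guide/**"];
console.log(shouldFetch("https://vite.dev/blog/announcing-vite6", patterns));
console.log(shouldFetch("https://vite.dev/config/shared-options", patterns));
```

Because matching is done on the pathname alone, patterns should start with `/` and should not include the scheme or domain.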
Content selector
We use mozilla/readability to extract readable content from each web page. On some pages it may return irrelevant content; in that case, you can specify a CSS selector so we know where to find the readable content:
sitemcp https://vite.dev --content-selector ".content"
How to configure with MCP Client
You can launch the server from your MCP client (e.g. Claude Desktop).
Below is an example configuration for Claude Desktop:
{
  "mcpServers": {
    "daisy-ui": {
      "command": "npx",
      "args": [
        "-y",
        "sitemcp",
        "https://daisyui.com",
        "-m",
        "/components/**"
      ]
    }
  }
}
Tips
- Some sites have a lot of pages. It is better to run sitemcp before registering the server with the MCP client. sitemcp caches the pages in ~/.cache/sitemcp by default. You can disable caching with the --no-cache flag.
License
MIT.