Supadata MCP Server
A Model Context Protocol (MCP) server that integrates with Supadata for video transcript extraction, web scraping, crawling, and site discovery.
Features
- Video transcript extraction from YouTube, TikTok, Instagram, Twitter, and file URLs
- Web scraping, crawling, and URL discovery
- Automatic retries and rate limiting
Installation
Connect your AI assistant to Supadata's MCP server to enable transcript extraction and web scraping capabilities directly in your workflow.
Claude Code
claude mcp add --transport http supadata https://api.supadata.ai/mcp \
  --header "x-api-token: YOUR_SUPADATA_API_TOKEN"
Claude Desktop
Add to your config file:
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "supadata": {
      "url": "https://api.supadata.ai/mcp",
      "headers": {
        "x-api-token": "YOUR_SUPADATA_API_TOKEN"
      }
    }
  }
}
Cursor
Add to .cursor/mcp.json in your project root (or global config):
{
  "mcpServers": {
    "supadata": {
      "url": "https://api.supadata.ai/mcp",
      "headers": {
        "x-api-token": "YOUR_SUPADATA_API_TOKEN"
      }
    }
  }
}
Windsurf
Add to ~/.codeium/windsurf/mcp_config.json:
{
  "mcpServers": {
    "supadata": {
      "serverUrl": "https://api.supadata.ai/mcp",
      "headers": {
        "x-api-token": "YOUR_SUPADATA_API_TOKEN"
      }
    }
  }
}
VS Code + Copilot
Add to your VS Code settings.json:
{
  "mcp": {
    "servers": {
      "supadata": {
        "url": "https://api.supadata.ai/mcp",
        "headers": {
          "x-api-token": "YOUR_SUPADATA_API_TOKEN"
        }
      }
    }
  }
}
Cline (VS Code Extension)
Open Cline settings and add to the MCP Servers configuration:
{
  "supadata": {
    "url": "https://api.supadata.ai/mcp",
    "headers": {
      "x-api-token": "YOUR_SUPADATA_API_TOKEN"
    }
  }
}
Replace YOUR_SUPADATA_API_TOKEN with your API token from supadata.ai.
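The client snippets above differ only in their wrapper key and in whether the endpoint field is named "url" or "serverUrl". As a rough sketch of those differences, the helper below generates each variant from one token. The names buildServerEntry, buildConfig, and ClientName are hypothetical and exist only for this illustration; the endpoint URL and header name are taken from the snippets above.

```typescript
// Hypothetical helper names; only the URL, header name, and key layout
// come from the client snippets in this README.
type ClientName = "claude-desktop" | "cursor" | "windsurf" | "vscode" | "cline";

const SUPADATA_MCP_URL = "https://api.supadata.ai/mcp";

interface ServerEntry {
  url?: string;
  serverUrl?: string;
  headers: { "x-api-token": string };
}

// Build the inner "supadata" entry for one client.
function buildServerEntry(client: ClientName, apiToken: string): ServerEntry {
  const headers = { "x-api-token": apiToken };
  // The Windsurf snippet uses "serverUrl"; every other snippet uses "url".
  return client === "windsurf"
    ? { serverUrl: SUPADATA_MCP_URL, headers }
    : { url: SUPADATA_MCP_URL, headers };
}

// Wrap the entry in the top-level key each client's snippet shows.
function buildConfig(client: ClientName, apiToken: string): object {
  const entry = { supadata: buildServerEntry(client, apiToken) };
  switch (client) {
    case "vscode":
      return { mcp: { servers: entry } };
    case "cline":
      return entry;
    default:
      return { mcpServers: entry };
  }
}

console.log(JSON.stringify(buildConfig("windsurf", "YOUR_SUPADATA_API_TOKEN"), null, 2));
```

Keeping the token in one place and generating the per-client JSON avoids copy-paste drift when the token is rotated.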
Configuration
Environment Variables
SUPADATA_API_KEY: Your Supadata API key
System Configuration
The server includes configurable retry and rate limiting parameters:
const CONFIG = {
  retry: {
    maxAttempts: 3,     // Number of retry attempts
    initialDelay: 1000, // Initial delay (milliseconds)
    maxDelay: 10000,    // Maximum delay between retries (milliseconds)
    backoffFactor: 2    // Exponential backoff multiplier
  }
};
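To make the retry parameters concrete, here is a minimal sketch of how an exponential backoff delay is commonly derived from values like these. The server's actual formula is not shown in this README, so the function retryDelay and the exact formula (no jitter, 1-based attempts) are assumptions for illustration only.

```typescript
interface RetryConfig {
  maxAttempts: number;
  initialDelay: number; // milliseconds
  maxDelay: number; // milliseconds
  backoffFactor: number;
}

// Values copied from the CONFIG block above.
const RETRY: RetryConfig = {
  maxAttempts: 3,
  initialDelay: 1000,
  maxDelay: 10000,
  backoffFactor: 2,
};

// Assumed formula: initialDelay * backoffFactor^(attempt - 1), capped at maxDelay.
// `attempt` is 1-based. This is a common scheme, not confirmed server behavior.
function retryDelay(attempt: number, cfg: RetryConfig = RETRY): number {
  const raw = cfg.initialDelay * Math.pow(cfg.backoffFactor, attempt - 1);
  return Math.min(raw, cfg.maxDelay);
}

const delays: number[] = [];
for (let attempt = 1; attempt <= RETRY.maxAttempts; attempt++) {
  delays.push(retryDelay(attempt));
}
console.log(delays); // delays holds 1000, 2000, 4000
```

Under this assumed formula the cap only matters from the fifth attempt onward (16000 ms would be reduced to 10000 ms), so with maxAttempts set to 3 it is never reached.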
How to Choose a Tool
Select the right tool based on your needs:
- Transcript: Extract video transcripts from platforms and file URLs
- Scrape: Extract content from a single page when you know the exact URL
- Map: Discover all available URLs on a website
- Crawl: Extract content from multiple related pages comprehensively
| Tool | Best for | Returns |
|---|---|---|
| transcript | Video transcript extraction | text/markdown |
| scrape | Single page content | markdown/html |
| map | URL discovery on a site | URL[] |
| crawl | Multi-page extraction | markdown/html[] |
Available Tools
Transcript (supadata_transcript)
Extract transcripts from supported video platforms (YouTube, TikTok, Instagram, Twitter) and file URLs.
Usage:
supadata_transcript --url "https://youtube.com/watch?v=example" --lang "en"
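The usage lines in this section are written in a flag style, but an MCP client sends a tool invocation as a JSON-RPC 2.0 tools/call request. The sketch below shows what the transcript call above might look like on the wire. It assumes the argument names (url, lang) mirror the flags shown; the authoritative input schema is whatever the server reports through tools/list. The helper name toolCall is hypothetical.

```typescript
// JSON-RPC 2.0 request shape that MCP uses for tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wrap a tool name and its arguments in a tools/call request.
function toolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: id,
    method: "tools/call",
    params: { name: name, arguments: args },
  };
}

// Assumption: argument names match the --url and --lang flags in the usage line.
const transcriptRequest = toolCall(1, "supadata_transcript", {
  url: "https://youtube.com/watch?v=example",
  lang: "en",
});
console.log(JSON.stringify(transcriptRequest, null, 2));
```

The same wrapper applies to the other tools by changing the name and arguments, for example supadata_crawl with url and limit.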
Check Transcript Status (supadata_check_transcript_status)
Check the progress of a transcript extraction job using the job ID.
Usage:
supadata_check_transcript_status --id "550e8400-e29b-41d4-a716-446655440000"
Scrape (supadata_scrape)
Extract content from a single URL with advanced options.
Usage:
supadata_scrape --url "https://example.com" --lang "en"
Map (supadata_map)
Discover all indexed URLs on a website to find relevant pages before scraping.
Usage:
supadata_map --url "https://example.com"
Crawl (supadata_crawl)
Start an asynchronous crawl job to extract content from multiple pages on a site.
Usage:
supadata_crawl --url "https://example.com/blog" --limit 100
Check Crawl Status (supadata_check_crawl_status)
Check the progress of a crawl job using the job ID.
Usage:
supadata_check_crawl_status --id "550e8400-e29b-41d4-a716-446655440000"
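Because crawl jobs (and transcript jobs that return a job ID) finish asynchronously, a caller typically polls the matching status tool until the job is done. The sketch below shows one way to structure that loop. Everything in it is hypothetical: pollJob, checkStatus, isFinished, and the option names are illustrative, and the shape of the status response is left to the caller because it is not documented in this README.

```typescript
interface PollOptions {
  intervalMs: number; // wait between polls
  maxPolls: number; // give up after this many status checks
  sleep?: (ms: number) => Promise<void>; // injectable for testing
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Call `checkStatus` repeatedly until `isFinished` accepts the result.
// `checkStatus` would wrap a call to supadata_check_crawl_status or
// supadata_check_transcript_status with the job ID (not implemented here).
async function pollJob<T>(
  checkStatus: () => Promise<T>,
  isFinished: (result: T) => boolean,
  opts: PollOptions
): Promise<T> {
  const sleep = opts.sleep ?? defaultSleep;
  for (let poll = 1; poll <= opts.maxPolls; poll++) {
    const result = await checkStatus();
    if (isFinished(result)) return result;
    // Skip the wait after the final unsuccessful poll.
    if (poll < opts.maxPolls) await sleep(opts.intervalMs);
  }
  throw new Error(`Job not finished after ${opts.maxPolls} polls`);
}
```

Passing the status check and the completion test as functions keeps the loop independent of the response format, so the same helper can serve both job types.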
Development
# Install dependencies
npm install
# Build
npm run build
# Run tests
npm test
Contributing
- Fork the repository
- Create your feature branch
- Run tests: npm test
- Submit a pull request
License
MIT License - see LICENSE file for details