Site Crawler MCP

A Model Context Protocol (MCP) server for website analysis with 12 extraction modes. It can run several modes in a single crawl pass, with pattern matching and rate limiting.

12 Extraction Modes

Choose from comprehensive extraction modes tailored for different use cases

🖼️

Images

Extract all images with metadata, file sizes, formats, and alt-text analysis

📊

SEO Metadata

Complete SEO analysis including title tags, meta descriptions, and structured data

🏢

Brand Information

Extract brand assets, logos, company information, and brand guidelines

⚡

Performance Metrics

Analyze loading times, resource sizes, and performance optimization opportunities

🔒

Security Headers

Comprehensive security analysis including SSL, HTTPS, and security headers

⚖️

Legal Compliance

Detect privacy policies, GDPR compliance, terms of service, and legal pages

🏗️

Infrastructure

Technical infrastructure details, hosting information, and technology stack

💼

Career Information

Extract job postings, career pages, and hiring information

📞

Contact Details

Find contact information, addresses, phone numbers, and social media links

👥

Client References

Extract client testimonials, case studies, and business references

🛒

E-commerce Assets

Product images, pricing information, and e-commerce platform analysis

🔍

Content Analysis

Text analysis, keyword density, content structure, and readability metrics

Usage Examples

See how to use Site Crawler MCP in different scenarios

Basic Image Extraction

{
  "modes": ["images"],
  "depth": 2
}

Extract all images from a website up to 2 levels deep with metadata
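How a depth limit bounds a crawl can be illustrated with a breadth-first traversal. This is only a sketch under assumptions: the crawler's actual traversal strategy is not documented here, the start page is counted as depth 0, and `crawl_order` and the `SITE` link graph are hypothetical stand-ins for real pages.

```python
from collections import deque

def crawl_order(start, links, max_depth):
    """Return (url, depth) pairs reachable from `start` within `max_depth` link hops."""
    seen = {start}
    queue = deque([(start, 0)])
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append((url, depth))
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for nxt in links.get(url, []):
            if nxt not in seen:  # skip pages that are already queued or visited
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return visited

# Hypothetical in-memory link graph standing in for a real website.
SITE = {
    "/": ["/about", "/products"],
    "/about": ["/team"],
    "/products": ["/products/a"],
    "/team": ["/team/bio"],
}

for url, depth in crawl_order("/", SITE, max_depth=2):
    print(depth, url)
```

With `max_depth=2`, the page `/team/bio` is three hops from the start page and is therefore never visited.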

SEO + Images Combined

{
  "modes": ["images", "meta"],
  "depth": 1
}

Get both image assets and SEO metadata in a single crawl session

Complete Site Analysis

{
  "modes": ["images", "meta", "security", "brand"],
  "depth": 3
}

Broader analysis combining images, SEO metadata, security headers, and brand information in one crawl
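The configurations above can also be assembled programmatically. The sketch below builds the same `modes` and `depth` object and applies some basic checks first. The helper name `build_crawl_args` and its validation rules are assumptions for illustration, not part of the server's documented behavior.

```python
import json

def build_crawl_args(modes, depth):
    """Build a crawl configuration with the `modes` and `depth` keys shown above."""
    if not modes:
        raise ValueError("at least one mode is required")
    if not isinstance(depth, int) or depth < 1:
        # Assumed rule: the examples above only use positive integer depths.
        raise ValueError("depth must be a positive integer")
    unique_modes = list(dict.fromkeys(modes))  # drop duplicates, keep order
    return {"modes": unique_modes, "depth": depth}

payload = build_crawl_args(["images", "meta", "security", "brand"], depth=3)
print(json.dumps(payload, indent=2))
```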

Technical Specifications

Built with modern Python technologies for optimal performance

🐍

Python 3.10+

Modern Python with async/await support for concurrent operations

⚡

Async HTTP

Built with aiohttp for high-performance asynchronous web crawling

🔧

BeautifulSoup4

Advanced HTML parsing and extraction with CSS selectors

🚦

Rate Limiting

Delays of 1-2 seconds between requests, with at most 5 concurrent requests
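One common way to get this behavior in asyncio code is a semaphore combined with a randomized sleep. The sketch below is not the project's implementation: `fetch_all` is a hypothetical helper, and `fetch` is any coroutine function supplied by the caller (for example a wrapper around an aiohttp request).

```python
import asyncio
import random

async def fetch_all(urls, fetch, max_concurrent=5, min_delay=1.0, max_delay=2.0):
    """Run `fetch(url)` for every URL with bounded concurrency and a polite delay."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def fetch_one(url):
        async with semaphore:  # at most `max_concurrent` tasks pass this point at once
            # Wait a random 1-2 seconds (by default) before issuing the request.
            await asyncio.sleep(random.uniform(min_delay, max_delay))
            return await fetch(url)

    # gather() returns results in the same order as `urls`.
    return await asyncio.gather(*(fetch_one(u) for u in urls))
```

A caller would run it as `asyncio.run(fetch_all(urls, my_fetch))`, where `my_fetch` is the caller's own coroutine function.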

🔄

Retry Logic

Automatic retry with exponential backoff for failed requests
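Exponential backoff means the wait doubles after each failed attempt. A minimal sketch follows, assuming three attempts and a one-second base delay. These values and the helper name `retry_with_backoff` are illustrative, since the crawler's actual retry settings are not stated here.

```python
import asyncio

async def retry_with_backoff(operation, max_attempts=3, base_delay=1.0):
    """Await `operation()` and retry on failure, doubling the wait each time."""
    for attempt in range(max_attempts):
        try:
            return await operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # Wait base_delay, then 2x, then 4x, ... before the next attempt.
            await asyncio.sleep(base_delay * 2 ** attempt)
```

In practice it is usually better to catch only transient errors (timeouts, connection resets) rather than every exception. The broad `except` here keeps the sketch short.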

📱

MCP Protocol

Native Model Context Protocol integration for AI assistants
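MCP clients invoke server tools with a JSON-RPC 2.0 `tools/call` request. The sketch below shows the general shape of such a message carrying a crawl configuration. The tool name `crawl_site` is a placeholder assumption, because this page does not state the name the server registers.

```python
import json

def make_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# "crawl_site" is a hypothetical tool name used only for illustration.
request = make_tool_call(1, "crawl_site", {"modes": ["images"], "depth": 2})
print(json.dumps(request, indent=2))
```

AI assistants with MCP support normally build this message themselves, so it rarely needs to be written by hand.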

Installation Methods

Multiple installation options to suit your development environment

PyPI Installation (Coming Soon)

pip install site-crawler-mcp

Simple installation from Python Package Index when available

Source Installation with uv

git clone https://github.com/AndacGuven/site-crawler-mcp.git
cd site-crawler-mcp
uv pip install -e .

Development installation using the modern uv package manager

Virtual Environment Setup

python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows
pip install -e .

Recommended setup with isolated Python environment