Domain Crawling

Our Domain Crawling API crawls a website and automatically scrapes every discovered URL. Perfect for crawling a whole domain in a single request.

What is Domain Crawling?

Domain Crawling actively browses a website by following links from the homepage and scrapes each page it finds.

Live scraper — discovered URLs are scraped according to your url_pattern and scraper_config, and the content is returned. No separate scraping step is needed.

How It Works

  1. Starts from the domain homepage
  2. Follows links up to specified depth
  3. Scrapes every discovered URL
  4. Returns both the URLs and scraped content

Parameters Overview

| Parameter | Type | Required | Description |
|---|---|---|---|
| domain | string | Yes | Domain to crawl (e.g., “example.com”) |
| max_urls | integer | Yes | Maximum URLs to discover and scrape (1-10,000) |
| depth | integer | No | How deep to follow links. Default: 1 |
| url_pattern | string | No | Regex pattern to filter which URLs to scrape |
| scraper_config | object | No | Custom scraper settings. Default: raw HTML |
| async | boolean | No | Process in background. Default: false |
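The table above can be turned into a request body programmatically. A minimal Python sketch (the field names come from the table; the helper name and the range check mirroring the documented 1-10,000 limit are illustrative, not part of the API):

```python
import json

def build_crawl_payload(domain, max_urls, depth=1, url_pattern=None,
                        scraper_config=None, use_async=False):
    """Assemble the JSON body for /api/v1/scraper/crawl.

    Only domain and max_urls are required; optional fields are omitted
    when left at their documented defaults.
    """
    if not 1 <= max_urls <= 10_000:
        raise ValueError("max_urls must be between 1 and 10,000")
    payload = {"domain": domain, "max_urls": max_urls}
    if depth != 1:
        payload["depth"] = depth
    if url_pattern is not None:
        payload["url_pattern"] = url_pattern
    if scraper_config is not None:
        payload["scraper_config"] = scraper_config
    if use_async:
        payload["async"] = True  # "async" is a Python keyword, hence use_async
    return json.dumps(payload)

# Example: the Quick Start request body
print(build_crawl_payload("example.com", 10))
```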

Scraper Config

When you provide a scraper_config, it overrides the default settings:

curl -X POST "https://scrape.evomi.com/api/v1/scraper/crawl" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "domain": "example.com",
    "max_urls": 10,
    "scraper_config": {
      "mode": "request",
     
    }
  }'

Default behavior: If no scraper_config is provided, each URL is scraped with raw HTML output (the default scraper mode).

Pricing

Domain Crawling uses your scraper credits. Each scraped page costs credits according to Scraper API pricing.

Quick Start

Crawl a domain and scrape all discovered URLs:

curl -X POST "https://scrape.evomi.com/api/v1/scraper/crawl" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "domain": "example.com",
    "max_urls": 10
  }'

Response:

{
  "success": true,
  "domain": "example.com",
  "discovered_count": 10,
  "scraper_tasks_submitted": 10,
  "results": [
    {
      "url": "https://example.com/",
      "source": "crawl",
      "scrape_task_id": "abc-123",
      "scrape_check_url": "/api/v1/scraper/tasks/abc-123"
    },
    {
      "url": "https://example.com/about",
      "source": "crawl",
      "scrape_task_id": "def-456",
      "scrape_check_url": "/api/v1/scraper/tasks/def-456"
    }
  ],
  "credits_used": 20.0,
  "credits_remaining": 980.0
}

Each result includes a scrape_task_id to check individual scraping status.
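In practice you collect those ids from the crawl response and poll their check URLs. A hypothetical helper that turns a crawl response into absolute status URLs (the base URL and response shape are taken from this page; the function name is illustrative):

```python
def task_status_urls(crawl_response, base_url="https://scrape.evomi.com"):
    """Map each result's scrape_check_url to an absolute URL, keyed by task id."""
    return {
        result["scrape_task_id"]: base_url + result["scrape_check_url"]
        for result in crawl_response["results"]
    }

# Using the sample response above:
sample = {
    "results": [
        {"url": "https://example.com/", "scrape_task_id": "abc-123",
         "scrape_check_url": "/api/v1/scraper/tasks/abc-123"},
        {"url": "https://example.com/about", "scrape_task_id": "def-456",
         "scrape_check_url": "/api/v1/scraper/tasks/def-456"},
    ]
}
print(task_status_urls(sample))
```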

Filter with URL Patterns

Only scrape URLs matching a pattern:

curl -X POST "https://scrape.evomi.com/api/v1/scraper/crawl" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "domain": "example.com",
    "url_pattern": "/blog/.*",
    "max_urls": 20
  }'
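To preview which URLs a pattern would keep, you can test it locally with Python's re module. This sketch assumes url_pattern is applied as an unanchored regex search against each discovered URL, which is consistent with the "/blog/.*" example above but worth verifying against your own crawl results:

```python
import re

pattern = re.compile(r"/blog/.*")
discovered = [
    "https://example.com/blog/first-post",
    "https://example.com/about",
    "https://example.com/blog/archive/2024",
]
# Keep only URLs the pattern matches anywhere in the string
kept = [url for url in discovered if pattern.search(url)]
print(kept)
```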

Control Crawl Depth

Default depth is 1 (pages linked from homepage). Increase for deeper discovery:

Depth 1 — Homepage and direct links:

curl -X POST "https://scrape.evomi.com/api/v1/scraper/crawl" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "domain": "example.com",
    "depth": 1,
    "max_urls": 50
  }'

Depth 2 — Homepage, direct links, and their links:

curl -X POST "https://scrape.evomi.com/api/v1/scraper/crawl" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "domain": "example.com",
    "depth": 2,
    "max_urls": 100
  }'
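Conceptually, depth bounds a breadth-first walk of the link graph: depth 1 stops at pages linked from the homepage, depth 2 also follows their links. A simplified local model of that idea (not the crawler itself; it ignores max_urls, same-domain rules, and whatever deduplication the real service applies):

```python
from collections import deque

def discover(links, start, depth):
    """Breadth-first discovery over a link graph, up to `depth` hops from start."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, dist = queue.popleft()
        if dist == depth:
            continue  # don't follow links from pages at the depth limit
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, dist + 1))
    return seen

links = {
    "/": ["/about", "/blog"],
    "/blog": ["/blog/post-1"],
}
print(sorted(discover(links, "/", 1)))  # homepage + direct links
print(sorted(discover(links, "/", 2)))  # also reaches /blog/post-1
```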

Custom Scraper Settings

Override default scraping behavior:

curl -X POST "https://scrape.evomi.com/api/v1/scraper/crawl" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "domain": "example.com",
    "max_urls": 10,
    "scraper_config": {
      "mode": "request",
     
    }
  }'

Async Mode

For large crawls, use async mode:

curl -X POST "https://scrape.evomi.com/api/v1/scraper/crawl" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "domain": "example.com",
    "max_urls": 100,
    "async": true
  }'

Response (202 Accepted):

{
  "success": true,
  "task_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "processing",
  "message": "Crawl task submitted for background processing",
  "check_url": "/api/v1/scraper/crawl/tasks/550e8400-e29b-41d4-a716-446655440000"
}

Check status:

curl "https://scrape.evomi.com/api/v1/scraper/crawl/tasks/550e8400-e29b-41d4-a716-446655440000" \
  -H "x-api-key: YOUR_API_KEY"

Common Use Cases

Complete Site Scraping: Crawl an entire website and get all content in one request.

Product Discovery: Find all product pages and scrape them for pricing, inventory, or details.

Content Aggregation: Build a content library by crawling and scraping multiple pages.

Competitive Intelligence: Automatically gather competitor content across many pages.

Base URL

https://scrape.evomi.com

All API requests use this base URL with the /api/v1/scraper/crawl endpoint.

Error Handling

| Status | Meaning |
|---|---|
| 200 | Success |
| 202 | Async task processing |
| 400 | Bad request |
| 401 | Unauthorized |
| 402 | Insufficient credits |
| 404 | Task not found |
| 429 | Rate limit exceeded |
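A client can branch on these statuses: 429 is worth retrying after a backoff, while 400, 401, and 402 need a fix on your side first. A sketch of that decision (the categorization is our reading of the table above, not an official retry policy):

```python
def should_retry(status_code):
    """Return True for statuses worth retrying after a backoff."""
    if status_code == 429:  # rate limited: back off and retry
        return True
    if status_code in (200, 202):  # success / accepted: nothing to retry
        return False
    if status_code in (400, 401, 402, 404):  # fix the request, key, or balance first
        return False
    return status_code >= 500  # assume transient server errors are retryable

retryable = [code for code in (200, 202, 400, 401, 402, 404, 429, 503)
             if should_retry(code)]
print(retryable)
```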

Next Steps

⚠️ Each discovered URL is scraped, so costs can add up quickly. Start with small max_urls values to understand your costs before scaling.