Usage Examples
Practical examples showing request payloads and response formats for common Domain Crawling scenarios. All examples use the endpoint:
POST https://scrape.evomi.com/api/v1/scraper/crawl
x-api-key: YOUR_API_KEY
Basic URL Discovery
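The request payloads on this page can be sent with any HTTP client. The following is a minimal Python sketch using only the standard library. The helper names `build_crawl_request` and `send_crawl_request`, the `Content-Type` header, and the timeout value are assumptions for illustration and are not part of the documented API.

```python
import json
import urllib.request

CRAWL_ENDPOINT = "https://scrape.evomi.com/api/v1/scraper/crawl"


def build_crawl_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the crawl endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        CRAWL_ENDPOINT,
        data=body,
        method="POST",
        headers={
            "x-api-key": api_key,
            # Assumption: the API expects a JSON content type.
            "Content-Type": "application/json",
        },
    )


def send_crawl_request(api_key: str, payload: dict, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_crawl_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For instance, `send_crawl_request("YOUR_API_KEY", {"domain": "example.com", "sources": ["sitemap"], "max_urls": 100, "check_if_live": False})` would submit the payload from Example 1.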
Example 1: Discover URLs from Sitemap
Find all URLs from a domain’s sitemap without validation.
Request:
{
"domain": "example.com",
"sources": ["sitemap"],
"max_urls": 100,
"check_if_live": false
}
Response:
{
"success": true,
"domain": "example.com",
"discovered_count": 87,
"results": [
{
"url": "https://example.com/",
"source": "sitemap"
},
{
"url": "https://example.com/about",
"source": "sitemap"
},
{
"url": "https://example.com/products",
"source": "sitemap"
},
{
"url": "https://example.com/blog/post-1",
"source": "sitemap"
}
],
"credits_used": 2.0,
"credits_remaining": 998.0
}
Example 2: Discover from Both Sources
Use both sitemap and Common Crawl for maximum coverage.
Request:
{
"domain": "example.com",
"sources": ["sitemap", "commoncrawl"],
"max_urls": 500,
"check_if_live": false
}
Response:
{
"success": true,
"domain": "example.com",
"discovered_count": 463,
"results": [
{
"url": "https://example.com/",
"source": "sitemap"
},
{
"url": "https://example.com/old-page",
"source": "commoncrawl"
},
{
"url": "https://example.com/archived-content",
"source": "commoncrawl"
}
],
"credits_used": 4.0,
"credits_remaining": 996.0
}
URL Validation
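Validation is where most of the cost in these examples comes from. The credits_used figures on this page are consistent with a flat 2.0 credits per source plus 0.5 credits per URL when check_if_live is true. This page does not state that pricing rule, so the sketch below is an inference from the example figures only, and the function name `estimate_credits` is hypothetical. Treat it as a rough budgeting aid, not as the billing formula.

```python
# Assumption: rates inferred from the example responses on this page,
# not from documented pricing.
CREDITS_PER_SOURCE = 2.0
CREDITS_PER_VALIDATED_URL = 0.5


def estimate_credits(sources: list, url_count: int, check_if_live: bool) -> float:
    """Estimate crawl credits for a given number of URLs."""
    total = CREDITS_PER_SOURCE * len(sources)
    if check_if_live:
        total += CREDITS_PER_VALIDATED_URL * url_count
    return total


# Sitemap only, 94 URLs validated: matches the 49.0 shown in Example 3.
example_3 = estimate_credits(["sitemap"], 94, True)

# Both sources, max_urls of 5000: matches the 2504.0 reserved in Example 6.
example_6_reserved = estimate_credits(["sitemap", "commoncrawl"], 5000, True)
```

Passing max_urls as the URL count gives an upper bound before sending a request; passing discovered_count reproduces the final figure in the examples.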
Example 3: Discover and Validate URLs
Check that discovered URLs are still live and accessible.
Request:
{
"domain": "example.com",
"sources": ["sitemap"],
"max_urls": 100,
"check_if_live": true
}
Response:
{
"success": true,
"domain": "example.com",
"discovered_count": 94,
"results": [
{
"url": "https://example.com/",
"source": "sitemap"
},
{
"url": "https://example.com/about",
"source": "sitemap"
},
{
"url": "https://example.com/products",
"source": "sitemap"
}
],
"credits_used": 49.0,
"credits_remaining": 951.0
}
Pattern Filtering
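Because a pattern that is too loose or too strict changes what gets discovered and validated, it can help to try a url_pattern locally before sending it. The Python sketch below runs the patterns from this section against sample URLs. It assumes the API matches the pattern anywhere in the URL (search semantics) and that its regex dialect behaves like Python's `re` module for these patterns; neither is confirmed on this page. The helper name `filter_urls` is hypothetical.

```python
import re


def filter_urls(urls: list, url_pattern: str) -> list:
    """Return the URLs in which the pattern matches anywhere."""
    pattern = re.compile(url_pattern)
    return [url for url in urls if pattern.search(url)]


blog_samples = [
    "https://example.com/blog/introduction-to-apis",
    "https://example.com/blog/",
    "https://example.com/blog/2024/archive/post",
    "https://example.com/about",
]
# Anchored with $, so only single-segment paths under /blog/ match.
blog_posts = filter_urls(blog_samples, r"/blog/[^/]+/?$")

product_samples = [
    "https://shop.example.com/product/laptop-stand",
    "https://shop.example.com/products/wireless-mouse",
    "https://shop.example.com/products/wireless-mouse/reviews",
    "https://shop.example.com/cart",
]
# Not anchored, so nested paths such as /reviews also match.
product_pages = filter_urls(product_samples, r"/products?/[a-z0-9-]+")
```

Here the blog pattern keeps only the first sample URL, while the product pattern keeps the first three. Adding `$` to the product pattern, as Example 12 does, would exclude the nested reviews path.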
Example 4: Filter for Blog Posts Only
Use a regular expression to return only blog post URLs.
Request:
{
"domain": "example.com",
"sources": ["sitemap"],
"url_pattern": "/blog/[^/]+/?$",
"max_urls": 50,
"check_if_live": true
}
Response:
{
"success": true,
"domain": "example.com",
"discovered_count": 42,
"results": [
{
"url": "https://example.com/blog/introduction-to-apis",
"source": "sitemap"
},
{
"url": "https://example.com/blog/advanced-scraping-techniques",
"source": "sitemap"
},
{
"url": "https://example.com/blog/proxy-best-practices",
"source": "sitemap"
}
],
"credits_used": 23.0,
"credits_remaining": 977.0
}
Example 5: Filter for Product Pages
Target product URLs with specific patterns.
Request:
{
"domain": "shop.example.com",
"sources": ["sitemap", "commoncrawl"],
"url_pattern": "/products?/[a-z0-9-]+",
"max_urls": 200,
"check_if_live": true
}
Response:
{
"success": true,
"domain": "shop.example.com",
"discovered_count": 187,
"results": [
{
"url": "https://shop.example.com/product/laptop-stand",
"source": "sitemap"
},
{
"url": "https://shop.example.com/products/wireless-mouse",
"source": "sitemap"
},
{
"url": "https://shop.example.com/product/usb-hub",
"source": "commoncrawl"
}
],
"credits_used": 97.5,
"credits_remaining": 902.5
}
Asynchronous Processing
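Async requests return a check_url that you poll until the task finishes. In the examples in this section, a pending task reports "status": "processing", while a finished task returns the normal crawl result. The Python sketch below polls on that basis. The helper names, the polling interval, the attempt limit, and the use of the x-api-key header on the status request are assumptions for illustration.

```python
import json
import time
import urllib.request

BASE_URL = "https://scrape.evomi.com"


def fetch_json(url: str, api_key: str) -> dict:
    """GET a URL and decode the JSON body."""
    # Assumption: status requests use the same x-api-key header.
    request = urllib.request.Request(url, headers={"x-api-key": api_key})
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_for_crawl(check_url, api_key, fetch=fetch_json,
                   interval=5.0, max_attempts=120, sleep=time.sleep) -> dict:
    """Poll check_url until the task stops reporting status 'processing'."""
    url = BASE_URL + check_url
    for _ in range(max_attempts):
        body = fetch(url, api_key)
        if body.get("status") != "processing":
            return body  # completed result (or an error body)
        sleep(interval)
    raise TimeoutError(f"Task at {url} still processing after {max_attempts} polls")
```

The `fetch` and `sleep` parameters are injectable so the loop can be exercised without network access or real delays.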
Example 6: Large Domain (Async Mode)
Process large domains in the background and poll for results.
Request:
{
"domain": "largecorp.com",
"sources": ["sitemap", "commoncrawl"],
"max_urls": 5000,
"check_if_live": true,
"async": true
}
Response (202 Accepted):
{
"success": true,
"task_id": "550e8400-e29b-41d4-a716-446655440000",
"status": "processing",
"message": "Crawl task submitted for background processing",
"check_url": "/api/v1/scraper/crawl/tasks/550e8400-e29b-41d4-a716-446655440000",
"credits_reserved": 2504.0
}
Check Status:
GET https://scrape.evomi.com/api/v1/scraper/crawl/tasks/550e8400-e29b-41d4-a716-446655440000
Status Response (Still Processing):
{
"success": true,
"task_id": "550e8400-e29b-41d4-a716-446655440000",
"status": "processing",
"message": "Task is still processing"
}
Status Response (Completed):
{
"success": true,
"domain": "largecorp.com",
"discovered_count": 4732,
"results": [
{
"url": "https://largecorp.com/",
"source": "sitemap"
}
],
"credits_used": 2370.0,
"credits_remaining": 7630.0
}
Integrated Scraping
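When scraper_config is set, each result in the examples in this section carries a scrape_task_id and a relative scrape_check_url. The Python sketch below collects those into absolute URLs for later polling. The helper name `scrape_task_urls` is hypothetical, and it assumes scrape tasks are served from the same host as the crawl endpoint.

```python
# Assumption: scrape task URLs share the crawl endpoint's host.
BASE_URL = "https://scrape.evomi.com"


def scrape_task_urls(crawl_response: dict) -> dict:
    """Map each discovered URL to the absolute status URL of its scrape task."""
    tasks = {}
    for item in crawl_response.get("results", []):
        check_url = item.get("scrape_check_url")
        if check_url:  # results without a scrape task are skipped
            tasks[item["url"]] = BASE_URL + check_url
    return tasks


sample_response = {
    "success": True,
    "scraper_tasks_submitted": 2,
    "results": [
        {
            "url": "https://blog.example.com/post-1",
            "source": "sitemap",
            "scrape_task_id": "a1b2c3d4-e5f6-4789-0abc-def123456789",
            "scrape_check_url": "/api/v1/scraper/tasks/a1b2c3d4-e5f6-4789-0abc-def123456789",
        },
        {"url": "https://blog.example.com/post-2", "source": "sitemap"},
    ],
}
pending = scrape_task_urls(sample_response)
```

Each value in `pending` can then be fetched on its own schedule to retrieve the scraped content for that page.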
Example 7: Discover and Scrape Automatically
Combine URL discovery with automatic scraping in one request.
Request:
{
"domain": "blog.example.com",
"sources": ["sitemap"],
"max_urls": 20,
"check_if_live": true,
"scraper_config": {
"mode": "auto",
"content": "markdown",
"proxy_type": "residential",
"proxy_country": "US"
}
}
Response:
{
"success": true,
"domain": "blog.example.com",
"discovered_count": 18,
"scraper_tasks_submitted": 18,
"results": [
{
"url": "https://blog.example.com/post-1",
"source": "sitemap",
"scrape_task_id": "a1b2c3d4-e5f6-4789-0abc-def123456789",
"scrape_check_url": "/api/v1/scraper/tasks/a1b2c3d4-e5f6-4789-0abc-def123456789"
},
{
"url": "https://blog.example.com/post-2",
"source": "sitemap",
"scrape_task_id": "b2c3d4e5-f6a7-89b0-1cde-f234567890ab",
"scrape_check_url": "/api/v1/scraper/tasks/b2c3d4e5-f6a7-89b0-1cde-f234567890ab"
}
],
"credits_used": 11.0,
"credits_remaining": 989.0
}
Example 8: Scrape with AI Enhancement
Discover URLs and automatically extract structured data using AI.
Request:
{
"domain": "news.example.com",
"sources": ["sitemap"],
"url_pattern": "/articles/",
"max_urls": 10,
"check_if_live": true,
"async": true,
"scraper_config": {
"mode": "auto",
"content": "markdown",
"ai_enhance": true,
"ai_source": "markdown",
"ai_prompt": "Extract: headline, author, publish_date, summary (2 sentences), tags. Return as JSON."
}
}
Response (202 Accepted):
{
"success": true,
"task_id": "c3d4e5f6-a7b8-90c1-2def-3456789abcde",
"status": "processing",
"message": "Crawl task submitted for background processing",
"check_url": "/api/v1/scraper/crawl/tasks/c3d4e5f6-a7b8-90c1-2def-3456789abcde",
"credits_reserved": 7.0
}
Completed Response:
{
"success": true,
"domain": "news.example.com",
"discovered_count": 8,
"scraper_tasks_submitted": 8,
"results": [
{
"url": "https://news.example.com/articles/tech-news-2026",
"source": "sitemap",
"scrape_task_id": "d4e5f6a7-b8c9-01d2-3ef4-56789abcdef0",
"scrape_check_url": "/api/v1/scraper/tasks/d4e5f6a7-b8c9-01d2-3ef4-56789abcdef0"
}
],
"credits_used": 6.0,
"credits_remaining": 994.0
}
Real-World Scenarios
Example 9: Content Audit - Find All Site Pages
Comprehensive site audit for SEO and content management.
Request:
{
"domain": "mycompany.com",
"sources": ["sitemap", "commoncrawl"],
"max_urls": 1000,
"check_if_live": true,
"async": true
}
Response (202 Accepted):
{
"success": true,
"task_id": "e5f6a7b8-c9d0-12e3-4f56-789abcdef012",
"status": "processing",
"message": "Crawl task submitted for background processing",
"check_url": "/api/v1/scraper/crawl/tasks/e5f6a7b8-c9d0-12e3-4f56-789abcdef012",
"credits_reserved": 504.0
}
Example 10: Competitor Analysis - Product Pages
Discover all product pages from a competitor’s site.
Request:
{
"domain": "competitor.com",
"sources": ["sitemap"],
"url_pattern": "/(products?|shop|store)/",
"max_urls": 500,
"check_if_live": true
}
Response:
{
"success": true,
"domain": "competitor.com",
"discovered_count": 387,
"results": [
{
"url": "https://competitor.com/products/item-1",
"source": "sitemap"
},
{
"url": "https://competitor.com/shop/category/electronics",
"source": "sitemap"
}
],
"credits_used": 195.5,
"credits_remaining": 804.5
}
Example 11: Archive Research - Historical URLs
Find historical URLs no longer in active sitemaps.
Request:
{
"domain": "oldsite.com",
"sources": ["commoncrawl"],
"max_urls": 1000,
"check_if_live": false
}
Response:
{
"success": true,
"domain": "oldsite.com",
"discovered_count": 856,
"results": [
{
"url": "https://oldsite.com/archived-page-2020",
"source": "commoncrawl"
},
{
"url": "https://oldsite.com/removed-content",
"source": "commoncrawl"
}
],
"credits_used": 2.0,
"credits_remaining": 998.0
}
Use check_if_live: false to avoid validation costs when researching archives.
Example 12: Bulk Scraping Workflow
Complete workflow: discover, validate, then scrape product data.
Request:
{
"domain": "ecommerce.example.com",
"sources": ["sitemap"],
"url_pattern": "/products/[a-z0-9-]+$",
"max_urls": 100,
"check_if_live": true,
"async": true,
"scraper_config": {
"mode": "auto",
"proxy_type": "residential",
"proxy_country": "US",
"extract_scheme": [
{
"label": "product_details",
"selector": ".product-info",
"type": "nest",
"fields": [
{
"label": "title",
"selector": "h1.product-title",
"type": "content"
},
{
"label": "price",
"selector": ".price",
"type": "content"
},
{
"label": "in_stock",
"selector": "button.add-to-cart",
"type": "exists"
}
]
}
]
}
}
Response (202 Accepted):
{
"success": true,
"task_id": "f6a7b8c9-d0e1-23f4-5678-9abcdef01234",
"status": "processing",
"message": "Crawl task submitted for background processing",
"check_url": "/api/v1/scraper/crawl/tasks/f6a7b8c9-d0e1-23f4-5678-9abcdef01234",
"credits_reserved": 52.0
}
Error Responses
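The error bodies in this section share the fields success, error, error_code, and message, so a client can handle them in one place. The Python sketch below turns such a body into an exception. The names `CrawlApiError` and `ensure_success` are hypothetical, and treating any status of 400 or above as an error is an assumption.

```python
class CrawlApiError(Exception):
    """Raised when the API reports a failed request."""

    def __init__(self, status_code: int, body: dict):
        self.status_code = status_code
        self.error_code = body.get("error_code")
        # Present only on INSUFFICIENT_CREDITS responses in the examples.
        self.credits_required = body.get("credits_required")
        self.credits_available = body.get("credits_available")
        super().__init__(body.get("message") or body.get("error") or "Unknown error")


def ensure_success(status_code: int, body: dict) -> dict:
    """Return the body unchanged, or raise CrawlApiError on failure."""
    if status_code >= 400 or body.get("success") is False:
        raise CrawlApiError(status_code, body)
    return body
```

A caller might catch `CrawlApiError`, check `error_code == "INSUFFICIENT_CREDITS"`, and retry with a smaller max_urls based on `credits_available`.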
Insufficient Credits
Response (402 Payment Required):
{
"success": false,
"error": "Insufficient credits",
"error_code": "INSUFFICIENT_CREDITS",
"message": "Your account has insufficient credits. Required: 254.0, Available: 100.0",
"credits_required": 254.0,
"credits_available": 100.0
}
Invalid Domain
Response (400 Bad Request):
{
"success": false,
"error": "Invalid request parameters",
"error_code": "VALIDATION_ERROR",
"message": "domain is required and must be a valid domain name"
}
Invalid Pattern
Response (400 Bad Request):
{
"success": false,
"error": "Invalid regex pattern",
"error_code": "VALIDATION_ERROR",
"message": "url_pattern contains invalid regex syntax"
}
Best Practices
1. Start Small
Test with small max_urls values to understand costs:
{
"domain": "example.com",
"sources": ["sitemap"],
"max_urls": 10,
"check_if_live": false
}
2. Selective Validation
Discover first without validation, then validate only the URLs you need:
{
"domain": "example.com",
"sources": ["sitemap"],
"max_urls": 1000,
"check_if_live": false
}
3. Use Patterns Efficiently
Filter during discovery, not after:
{
"domain": "blog.example.com",
"url_pattern": "/blog/[0-9]{4}/",
"max_urls": 100
}
4. Async for Large Jobs
Use async mode for 1000+ URLs or when scraping:
{
"domain": "largecorp.com",
"max_urls": 5000,
"async": true
}
5. Monitor Credits
Check response headers and JSON for credit usage:
X-Credits-Used: 52.0
X-Credits-Remaining: 948.0
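A client can read these values after each call and stop before the balance runs out. The Python sketch below prefers the JSON fields and falls back to the headers. The helper names are hypothetical, and it assumes the headers and the JSON fields report the same figures.

```python
def credit_usage(headers: dict, body: dict) -> tuple:
    """Return (credits_used, credits_remaining) as floats, or None if absent."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    def pick(json_key: str, header_key: str):
        value = body.get(json_key)
        if value is None:
            value = lowered.get(header_key)
        return None if value is None else float(value)

    return (
        pick("credits_used", "x-credits-used"),
        pick("credits_remaining", "x-credits-remaining"),
    )


def has_budget(remaining, required: float) -> bool:
    """True when the known remaining balance covers the required credits."""
    return remaining is not None and remaining >= required


used, remaining = credit_usage(
    {"X-Credits-Used": "52.0", "X-Credits-Remaining": "948.0"},
    {"success": True},
)
```

With the header values shown above, `used` is 52.0 and `remaining` is 948.0, so `has_budget(remaining, 504.0)` would allow a job that reserves 504.0 credits.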