Parameters

The Scraper API is configured entirely through request parameters. This reference lists every available option, organized by category.

Core Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| url | string | Yes | - | Target URL to scrape (must include http:// or https://) |
| mode | string | No | auto | Scraping mode: request, browser, or auto |
| delivery | string | No | raw | Response format: raw (direct content) or json (wrapped) |
| async | boolean | No | false | Submit task for background processing |

Scraping Modes

request - Fast HTTP request without JavaScript

  • Best for static sites (blogs, news, documentation)
  • Lowest cost (1× multiplier)
  • Timeout: 30 seconds

browser - Full browser with JavaScript rendering

  • Best for SPAs (React, Vue, Angular)
  • Supports screenshots and JavaScript automation
  • Higher cost (5× multiplier)
  • Timeout: 45 seconds

auto (default) - Intelligent detection

  • Tries HTTP request first, upgrades to browser if needed
  • Optimizes cost automatically
  • Charges 50% of request cost + full browser cost when upgraded
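
The auto-mode billing rule can be worked through as simple arithmetic. The sketch below assumes the multipliers map directly to credits (1 for request, 5 for browser), that a non-upgraded auto request is billed like a plain request, and that no proxy or add-on surcharges apply. The function name is illustrative and not part of the API.

```python
REQUEST_COST = 1.0  # request mode multiplier (1x), from the list above
BROWSER_COST = 5.0  # browser mode multiplier (5x), from the list above


def auto_mode_cost(upgraded: bool) -> float:
    """Estimate credits for mode=auto (hypothetical helper).

    Assumption: a request that is not upgraded is billed as a plain
    request; an upgraded one is billed 50% of the request cost plus
    the full browser cost.
    """
    if not upgraded:
        return REQUEST_COST
    return 0.5 * REQUEST_COST + BROWSER_COST


print(auto_mode_cost(upgraded=False))  # 1.0
print(auto_mode_cost(upgraded=True))   # 5.5
```

Under these assumptions an upgraded auto request costs slightly more than calling browser mode directly, so mode=browser may be the cheaper choice for sites already known to need JavaScript.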

Output Control

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| content | string | html | Output format: html, markdown, or screenshot |
| include_content | boolean | false | Include full content in JSON responses |
| no_html | boolean | false | Omit HTML content from response (saves bandwidth) |

Content Formats

  • html - Raw HTML with absolute URLs
  • markdown - Clean, readable text format (ideal for AI processing)
  • screenshot - Full-page PNG image (requires mode=browser)

Note: JSON delivery omits content by default. Set include_content=true to include it, or use delivery=raw for direct content.

Proxy Configuration

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| proxy_type | string | residential | Proxy type: residential or datacenter |
| proxy_country | string | US | Two-letter ISO country code (e.g., GB, DE, JP) |
| proxy_session_id | string | - | Session ID (6-8 characters) for IP persistence |
| proxy_overwrite | string | - | Full proxy URL to override default proxies |

Proxy Types Comparison

| Feature | Residential | Datacenter |
| --- | --- | --- |
| Cost Multiplier | | |
| Success Rate | Highest | High |
| Speed | Fast | Fastest |
| Best For | Geo-restricted content, anti-bot sites | Public APIs, high-volume scraping |

Session Management Example:

{
  "proxy_session_id": "cart123",
  "proxy_country": "US"
}

Use the same session ID across multiple requests to keep the same IP address, which is useful for login flows or shopping carts.

See Proxy Types for detailed information.
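
To illustrate session reuse, the sketch below generates a session ID that satisfies the 6-8 alphanumeric character rule and attaches it to two request payloads. The helper names and the example.com URLs are hypothetical; only the parameter names come from this reference.

```python
import secrets
import string

ALPHABET = string.ascii_lowercase + string.digits


def new_session_id(length: int = 8) -> str:
    """Return a random alphanumeric session ID of 6-8 characters."""
    if not 6 <= length <= 8:
        raise ValueError("session ID length must be between 6 and 8")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def with_session(payload: dict, session_id: str) -> dict:
    """Return a copy of the payload pinned to the given proxy session."""
    return {**payload, "proxy_session_id": session_id}


session_id = new_session_id()
login = with_session({"url": "https://example.com/login", "proxy_country": "US"}, session_id)
cart = with_session({"url": "https://example.com/cart", "proxy_country": "US"}, session_id)
# Both payloads now carry the same proxy_session_id.
```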

Browser Timing

| Parameter | Type | Default | Range | Description |
| --- | --- | --- | --- | --- |
| wait_seconds | integer | 0 | 0-30 | Seconds to wait after page load |
| wait_until | string | domcontentloaded | See below | Page ready state |

Wait Until Options

| Value | Description | Use When |
| --- | --- | --- |
| domcontentloaded | DOM is ready (recommended) | Most dynamic sites |
| load | All resources loaded | Complete page with images |
| networkidle | No network activity | Heavy AJAX/API calls |
| commit | Navigation committed (fastest) | Simple static pages |

Example:

{
  "mode": "browser",
  "wait_until": "networkidle",
  "wait_seconds": 3
}

Device Emulation

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| device | string | windows | Device to emulate: windows, macos, or android |

Device Characteristics

| Device | Viewport | User-Agent | Scrolling | Best For |
| --- | --- | --- | --- | --- |
| windows | 1920×1080 | Chrome/Windows | Enabled (5000px) | Desktop sites, dashboards |
| macos | 1920×1080 | Safari/macOS | Enabled (5000px) | macOS-specific content |
| android | 375×667 | Chrome/Android | Disabled | Mobile-first sites, apps |

Performance Optimization

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| block_resources | array/string | [] | Resource types to block for faster loading |

Blockable Resource Types

Block unnecessary resources to speed up scraping and reduce costs:

{
  "block_resources": ["image", "stylesheet", "font", "media"]
}

Available types: document, stylesheet, image, media, font, script, texttrack, xhr, fetch, eventsource, websocket, manifest

Warning: Blocking script resources will prevent JavaScript execution. Only use this with mode=request or for specific use cases.
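
A client can catch misspelled resource types before sending a request. The sketch below checks a list against the available types listed above; the helper name is hypothetical, and it only handles the array form of block_resources, since the string form is not described here.

```python
BLOCKABLE_TYPES = {
    "document", "stylesheet", "image", "media", "font", "script",
    "texttrack", "xhr", "fetch", "eventsource", "websocket", "manifest",
}


def unknown_resource_types(types: list[str]) -> list[str]:
    """Return the entries that are not in the documented list of types."""
    return [t for t in types if t not in BLOCKABLE_TYPES]


print(unknown_resource_types(["image", "stylesheet", "fonts"]))  # ['fonts']
```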

Custom Headers

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| additional_headers | object | - | Custom HTTP headers to send with the request |

Example:

{
  "additional_headers": {
    "Authorization": "Bearer token123",
    "Accept-Language": "en-US,en;q=0.9",
    "Custom-Header": "value"
  }
}

Common use cases: API authentication, language preferences, custom tracking headers.

Visual Capture

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| screenshot | boolean | false | Capture full-page screenshot (requires mode=browser) |

Screenshots are:

  • Full-page PNG images (not just viewport)
  • Stored in Cloudflare R2 for 30 days
  • Accessible via screenshot_uri in response
  • Cost: +1 credit

AI Enhancement

| Parameter | Type | Default | Required When | Description |
| --- | --- | --- | --- | --- |
| ai_enhance | boolean | false | - | Enable AI processing |
| ai_source | string | - | ai_enhance=true | Source: markdown or screenshot |
| ai_prompt | string | - | No | Custom prompt for AI processing |
| ai_force_json | boolean | true | No | Force JSON output format |

AI Model: Google Gemini 2.0 Flash
Cost: +10 credits (additive)

Example:

{
  "ai_enhance": true,
  "ai_source": "markdown",
  "ai_prompt": "Extract product name, price, and availability as JSON"
}

See AI Enhancement for detailed examples.

JavaScript Automation

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| js_instructions | array | - | Structured actions to perform (click, fill, wait) |
| execute_js | string | - | Raw JavaScript code to execute |

Supported Actions:

  • {"click": "selector"} - Click an element
  • {"wait": milliseconds} - Wait for duration
  • {"fill": ["selector", "value"]} - Fill form field
  • {"wait_for": "selector"} - Wait for element to appear

Example:

{
  "js_instructions": [
    {"wait_for": "#search-input"},
    {"fill": ["#search-input", "web scraping"]},
    {"click": "#search-button"},
    {"wait": 2000}
  ]
}

See JavaScript Automation for comprehensive examples.
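
When instructions are assembled in code, small builder functions help keep the action shapes consistent with the supported actions listed above. This is a minimal sketch: the builder names are hypothetical conveniences, and the payload uses mode=browser because JavaScript automation is listed as a browser-mode feature.

```python
import json


def click(selector: str) -> dict:
    return {"click": selector}


def fill(selector: str, value: str) -> dict:
    return {"fill": [selector, value]}


def wait(milliseconds: int) -> dict:
    return {"wait": milliseconds}


def wait_for(selector: str) -> dict:
    return {"wait_for": selector}


payload = {
    "url": "https://example.com",
    "mode": "browser",
    "js_instructions": [
        wait_for("#search-input"),
        fill("#search-input", "web scraping"),
        click("#search-button"),
        wait(2000),
    ],
}

# Serialize to the JSON body shown in the example above.
print(json.dumps(payload, indent=2))
```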

Parameter Combinations

Fast Static Scraping (1 credit)

{
  "url": "https://example.com",
  "mode": "request",
  "proxy_type": "datacenter"
}

JavaScript Site with Screenshot (6 credits)

{
  "url": "https://spa-website.com",
  "mode": "browser",
  "screenshot": true,
  "proxy_type": "datacenter",
  "wait_until": "networkidle"
}

AI-Powered Content Extraction (15 credits)

{
  "url": "https://news.site.com/article",
  "mode": "browser",
  "content": "markdown",
  "ai_enhance": true,
  "ai_source": "markdown",
  "ai_prompt": "Summarize in 3 bullet points",
  "proxy_type": "datacenter"
}

High-Success Residential Proxy (25 credits)

{
  "url": "https://protected-site.com",
  "mode": "browser",
  "proxy_type": "residential",
  "proxy_country": "US",
  "wait_until": "networkidle",
  "block_resources": ["image", "font"]
}
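
The credit totals in the headings above can be reproduced with a small estimator. This sketch rests on several assumptions: base costs of 1 (request) and 5 (browser) credits, screenshot and AI surcharges added after any multiplication, and proxy multipliers of 1× for datacenter and 5× for residential. The proxy multipliers are inferred from the 25-credit example and are not stated in this reference. The function is hypothetical and does not cover mode=auto.

```python
MODE_CREDITS = {"request": 1, "browser": 5}
# Inferred from the examples above, not documented values.
PROXY_MULTIPLIER = {"datacenter": 1, "residential": 5}


def estimate_credits(payload: dict) -> int:
    """Estimate the credit cost of a request payload (hypothetical helper)."""
    mode = payload.get("mode", "auto")
    if mode not in MODE_CREDITS:
        raise ValueError("estimate only covers mode=request and mode=browser")
    proxy = payload.get("proxy_type", "residential")  # documented default
    credits = MODE_CREDITS[mode] * PROXY_MULTIPLIER[proxy]
    if payload.get("screenshot"):
        credits += 1   # screenshot: +1 credit
    if payload.get("ai_enhance"):
        credits += 10  # AI enhancement: +10 credits (additive)
    return credits


print(estimate_credits({"mode": "request", "proxy_type": "datacenter"}))  # 1
print(estimate_credits({"mode": "browser", "screenshot": True, "proxy_type": "datacenter"}))  # 6
print(estimate_credits({"mode": "browser", "ai_enhance": True, "proxy_type": "datacenter"}))  # 15
print(estimate_credits({"mode": "browser", "proxy_type": "residential"}))  # 25
```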

Parameter Validation

The API validates all parameters before processing. Common validation errors:

  • Invalid URL: Must include http:// or https://
  • Invalid mode: Must be request, browser, or auto
  • Invalid proxy_country: Must be a 2-letter ISO code
  • Invalid proxy_session_id: Must be 6-8 alphanumeric characters
  • Screenshot without browser: screenshot=true requires mode=browser or mode=auto
  • AI without source: ai_enhance=true requires ai_source

Validation errors return 422 Unprocessable Entity with details in the response.
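
Running the same checks on the client avoids a round trip for obviously invalid requests. The sketch below mirrors the rules listed above; it is not the server's implementation, the function name and error strings are illustrative, and the server may apply further checks.

```python
import re
from urllib.parse import urlparse


def validate(payload: dict) -> list[str]:
    """Return a list of validation problems; an empty list means none found."""
    errors = []
    if urlparse(str(payload.get("url", ""))).scheme not in ("http", "https"):
        errors.append("Invalid URL: must include http:// or https://")
    mode = payload.get("mode", "auto")
    if mode not in ("request", "browser", "auto"):
        errors.append("Invalid mode: must be request, browser, or auto")
    country = payload.get("proxy_country")
    if country is not None and not re.fullmatch(r"[A-Za-z]{2}", country):
        errors.append("Invalid proxy_country: must be a 2-letter ISO code")
    session = payload.get("proxy_session_id")
    if session is not None and not re.fullmatch(r"[A-Za-z0-9]{6,8}", session):
        errors.append("Invalid proxy_session_id: must be 6-8 alphanumeric characters")
    if payload.get("screenshot") and mode == "request":
        errors.append("Screenshot without browser: use mode=browser or mode=auto")
    if payload.get("ai_enhance") and not payload.get("ai_source"):
        errors.append("AI without source: ai_enhance=true requires ai_source")
    return errors


bad = {
    "url": "example.com",
    "mode": "request",
    "screenshot": True,
    "ai_enhance": True,
    "proxy_session_id": "abc",
}
for problem in validate(bad):
    print(problem)
```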