Core

The Scraper API is configured through request parameters. This reference lists every available option, grouped by category.

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| url | string | Yes | - | Target URL to scrape (must include http:// or https://) |
| mode | string | No | auto | Scraping mode: request, browser, or auto |
| delivery | string | No | raw | Response format: raw (direct content) or json (wrapped) |
| async | boolean | No | false | Submit task for background processing |
| content | string | No | html | Output format: html, markdown, pdf, or screenshot |
| include_content | boolean | No | false | Include full content in JSON responses |
| no_html | boolean | No | false | Omit HTML content from response (saves bandwidth) |
| excluded_tags | array/string | No | [] | HTML tag names to remove (e.g., ["script", "form"]) |
| excluded_selectors | array/string | No | [] | CSS selectors to remove (e.g., [".tracker", "#ads"]) |
| extract_scheme | object | No | - | Extraction schema using CSS, XPath, and Regex |
| js_instructions | array | No | - | Structured actions to perform (click, fill, wait) |
| execute_js | string | No | - | Raw JavaScript code to execute |
| networkCapture | array | No | - | Filters to capture browser network responses (max 10 filters) |
| block_resources | array/string | No | [] | Resource types to block. Options: document, stylesheet, image, media, font, script, texttrack, xhr, fetch, eventsource, websocket, manifest |
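
As an illustration of how the core parameters combine, the following minimal sketch assembles a request body as a Python dictionary. The helper name `build_payload` and the target URL are assumptions made for this example; only the parameter names and values come from the table above, and the sketch stops short of sending anything because the endpoint is not covered in this section.

```python
import json
from urllib.parse import urlparse


def build_payload(url, **options):
    """Assemble a request body from the core parameters (hypothetical helper)."""
    # The reference requires the URL to include http:// or https://.
    if urlparse(url).scheme not in ("http", "https"):
        raise ValueError("url must include http:// or https://")
    payload = {"url": url}
    # Drop unset options so the documented server-side defaults apply.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload


payload = build_payload(
    "https://example.com/products",  # placeholder target
    mode="browser",
    delivery="json",
    content="markdown",
    excluded_tags=["script", "form"],
    excluded_selectors=[".tracker", "#ads"],
    block_resources=["image", "font"],
    proxy_country=None,  # left unset, so it is omitted from the body
)
print(json.dumps(payload, indent=2))
```

Because `async` is a reserved word in Python, it cannot be written as a keyword argument directly; passing it as `build_payload(url, **{"async": True})` works.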

Proxy Configuration

| Parameter | Type | Default | Description |
|---|---|---|---|
| proxy_type | string | residential | Proxy type: residential or datacenter |
| proxy_country | string | US | Two-letter ISO country code (e.g., GB, DE, JP) |
| proxy_session_id | string | - | Session ID (6-8 characters) for IP persistence |
| proxy_overwrite | string | - | Full proxy URL to override default proxies |
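
A session ID has to stay within the 6-8 character limit, so it can be convenient to generate one rather than type it by hand. The sketch below is one way to do that; `new_session_id` and `proxy_options` are hypothetical helper names, and the choice of letters and digits as the alphabet is an assumption for the example.

```python
import secrets
import string

ALPHANUMERIC = string.ascii_letters + string.digits


def new_session_id(length=8):
    """Return a random alphanumeric ID within the documented 6-8 character range."""
    if not 6 <= length <= 8:
        raise ValueError("proxy_session_id must be 6-8 characters long")
    return "".join(secrets.choice(ALPHANUMERIC) for _ in range(length))


def proxy_options(country="US", proxy_type="residential", sticky=False):
    """Build the proxy part of a request body (hypothetical helper)."""
    if proxy_type not in ("residential", "datacenter"):
        raise ValueError("proxy_type must be residential or datacenter")
    options = {"proxy_type": proxy_type, "proxy_country": country.upper()}
    if sticky:
        # Send the same ID on later requests for IP persistence.
        options["proxy_session_id"] = new_session_id()
    return options


sticky_proxy = proxy_options(country="de", sticky=True)
print(sticky_proxy)
```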

Browser Timing

| Parameter | Type | Default | Range | Description |
|---|---|---|---|---|
| wait_seconds | integer | 0 | 0-30 | Seconds to wait after page load |
| wait_until | string | domcontentloaded | - | Page ready state to wait for |

wait_until options:

| State | Description | Best For |
|---|---|---|
| domcontentloaded | DOM is ready (recommended) | Most dynamic sites |
| load | All resources loaded | Complete page with images |
| networkidle | No network activity | Heavy AJAX/API calls |
| commit | Navigation committed (fastest) | Simple static pages |

Example:

{
  "mode": "browser",
  "wait_until": "networkidle",
  "wait_seconds": 3
}
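
The same timing settings can be checked on the client before a request is sent. This is a minimal sketch: `timing_options` is a hypothetical helper, and setting `mode` to `browser` follows the example above rather than a stated requirement.

```python
WAIT_UNTIL_STATES = ("domcontentloaded", "load", "networkidle", "commit")


def timing_options(wait_until="domcontentloaded", wait_seconds=0):
    """Return browser timing parameters after checking the documented limits."""
    if wait_until not in WAIT_UNTIL_STATES:
        raise ValueError(f"wait_until must be one of {WAIT_UNTIL_STATES}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(wait_seconds, bool) or not isinstance(wait_seconds, int):
        raise TypeError("wait_seconds must be an integer")
    if not 0 <= wait_seconds <= 30:
        raise ValueError("wait_seconds must be between 0 and 30")
    return {
        "mode": "browser",  # assumption: mirrors the example above
        "wait_until": wait_until,
        "wait_seconds": wait_seconds,
    }


timing = timing_options(wait_until="networkidle", wait_seconds=3)
print(timing)
```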

Device Emulation

| Parameter | Type | Default | Description |
|---|---|---|---|
| device | string | windows | Device to emulate: windows, macos, or android |

Device profiles:

| Device | Viewport | User-Agent | Scrolling | Best For |
|---|---|---|---|---|
| windows | 1920×1080 | Chrome/Windows | Enabled (5000px) | Desktop sites, dashboards |
| macos | 1920×1080 | Safari/macOS | Enabled (5000px) | macOS-specific content |
| android | 375×667 | Chrome/Android | Disabled | Mobile-first sites, apps |

AI Enhancement

| Parameter | Type | Default | Required When | Description |
|---|---|---|---|---|
| ai_enhance | boolean | false | - | Enable AI processing |
| ai_source | string | - | ai_enhance=true | Source: markdown or screenshot |
| ai_prompt | string | - | No | Custom prompt for AI processing |
| ai_force_json | boolean | true | No | Force JSON output format |
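
Since `ai_source` becomes mandatory once `ai_enhance` is enabled, a small builder can make the dependency hard to forget. The sketch below assumes a hypothetical helper named `ai_options`; the prompt text is an invented example.

```python
def ai_options(source, prompt=None, force_json=True):
    """Build the AI enhancement parameters, always pairing ai_enhance with ai_source."""
    if source not in ("markdown", "screenshot"):
        raise ValueError("ai_source must be markdown or screenshot")
    options = {
        "ai_enhance": True,
        "ai_source": source,
        "ai_force_json": force_json,
    }
    # ai_prompt is optional, so include it only when one was given.
    if prompt is not None:
        options["ai_prompt"] = prompt
    return options


ai_params = ai_options("markdown", prompt="List every product name and price.")
print(ai_params)
```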

Custom Headers

| Parameter | Type | Default | Description |
|---|---|---|---|
| additional_headers | object | - | Custom HTTP headers to send with the request |

Example:

{
  "additional_headers": {
    "Authorization": "Bearer token123",
    "Accept-Language": "en-US,en;q=0.9",
    "Custom-Header": "value"
  }
}

Common use cases: API authentication, language preferences, custom tracking headers.

Visual Capture

| Parameter | Type | Default | Description |
|---|---|---|---|
| screenshot | boolean | false | Capture full-page screenshot (requires mode=browser) |

Screenshots are:

  • Full-page PNG images (not just viewport)
  • Stored in Cloudflare R2 for 30 days
  • Accessible via screenshot_uri in response
  • Cost: +1 credit

PDF Capture

| Parameter | Type | Default | Description |
|---|---|---|---|
| pdf | boolean | false | Capture full-page PDF document (requires mode=browser) |

PDFs are:

  • Full-page vector documents (searchable and scalable)
  • Stored in Cloudflare R2 for 30 days
  • Accessible via pdf_uri in response
  • Cost: +1 credit
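
To show how the capture flags and their result fields might be handled together, here is a rough sketch. It rests on two assumptions that the reference does not confirm: that the screenshot and PDF surcharges simply add up, and that `screenshot_uri` and `pdf_uri` appear as top-level keys of a JSON response. Both helper names and the sample response are hypothetical.

```python
def capture_extra_credits(request_body):
    """Count the +1 credit surcharges for the capture flags in a request body."""
    # Assumption: each enabled capture adds one credit independently.
    return int(bool(request_body.get("screenshot"))) + int(bool(request_body.get("pdf")))


def capture_uris(response_body):
    """Collect whichever capture URIs are present in a parsed JSON response."""
    # Assumption: the URIs are top-level keys of the response object.
    return {
        key: response_body[key]
        for key in ("screenshot_uri", "pdf_uri")
        if key in response_body
    }


request_body = {
    "url": "https://example.com",
    "mode": "browser",  # both captures require mode=browser
    "screenshot": True,
    "pdf": True,
}
extra = capture_extra_credits(request_body)

# Made-up response used only to exercise the helper; the URI is a placeholder.
sample_response = {"screenshot_uri": "https://example.com/placeholder.png"}
uris = capture_uris(sample_response)
print(extra, uris)
```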

Content Formats

html - Raw HTML with absolute URLs

markdown - Clean, readable text format (ideal for AI processing)

screenshot - Full-page PNG image (requires mode=browser)

pdf - Full-page PDF document (requires mode=browser)

Parameter Validation

The API validates all parameters before processing. Common validation errors:

  • Invalid URL: Must include http:// or https://
  • Invalid mode: Must be request, browser, or auto
  • Invalid proxy_country: Must be a 2-letter ISO code
  • Invalid proxy_session_id: Must be 6-8 alphanumeric characters
  • Capture without browser: screenshot=true or pdf=true requires mode=browser
  • AI without source: ai_enhance=true requires ai_source
  • Invalid filter: excluded_tags and excluded_selectors require content=markdown or content=html

Validation errors return 422 Unprocessable Entity with details in the response.
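
Checking a request locally can catch most of these problems before a call is spent on a 422. The following sketch mirrors the rules listed above; the function name `validate` and its return format are assumptions, and it reads "requires mode=browser" strictly, so a capture request that leaves `mode` unset is also flagged. The server remains the authority on what is valid.

```python
import re
from urllib.parse import urlparse


def validate(payload):
    """Return the names of the documented validation rules that a payload breaks."""
    errors = []
    if urlparse(str(payload.get("url", ""))).scheme not in ("http", "https"):
        errors.append("Invalid URL")
    if payload.get("mode", "auto") not in ("request", "browser", "auto"):
        errors.append("Invalid mode")
    country = payload.get("proxy_country")
    if country is not None and not re.fullmatch(r"[A-Za-z]{2}", str(country)):
        errors.append("Invalid proxy_country")
    session = payload.get("proxy_session_id")
    if session is not None and not re.fullmatch(r"[A-Za-z0-9]{6,8}", str(session)):
        errors.append("Invalid proxy_session_id")
    # Strict reading: captures need mode set to browser explicitly.
    if (payload.get("screenshot") or payload.get("pdf")) and payload.get("mode") != "browser":
        errors.append("Capture without browser")
    if payload.get("ai_enhance") and not payload.get("ai_source"):
        errors.append("AI without source")
    has_filter = payload.get("excluded_tags") or payload.get("excluded_selectors")
    if has_filter and payload.get("content", "html") not in ("markdown", "html"):
        errors.append("Invalid filter")
    return errors


good = validate({"url": "https://example.com", "mode": "browser", "pdf": True})
bad = validate({"url": "example.com", "screenshot": True, "ai_enhance": True})
print(good)
print(bad)
```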