Core
The Scraper API is configured through request parameters. This reference lists every available option, organized by category.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `url` | string | Yes | - | Target URL to scrape (must include http:// or https://) |
| `mode` | string | No | `auto` | Scraping mode: request, browser, or auto |
| `delivery` | string | No | `raw` | Response format: raw (direct content) or json (wrapped) |
| `async` | boolean | No | `false` | Submit task for background processing |
| `content` | string | No | `html` | Output format: html, markdown, pdf, or screenshot |
| `include_content` | boolean | No | `false` | Include full content in JSON responses |
| `no_html` | boolean | No | `false` | Omit HTML content from response (saves bandwidth) |
| `excluded_tags` | array/string | No | [] | HTML tag names to remove (e.g., ["script", "form"]) |
| `excluded_selectors` | array/string | No | [] | CSS selectors to remove (e.g., [".tracker", "#ads"]) |
| `extract_scheme` | object | No | - | Extraction schema using CSS, XPath, and Regex |
| `js_instructions` | array | No | - | Structured actions to perform (click, fill, wait) |
| `execute_js` | string | No | - | Raw JavaScript code to execute |
| `networkCapture` | array | No | - | Filters to capture browser network responses (max 10 filters) |
| `block_resources` | array/string | No | [] | Resource types to block. Options: document, stylesheet, image, media, font, script, texttrack, xhr, fetch, eventsource, websocket, manifest |

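A minimal sketch of assembling a request body from the core parameters above. The helper name `build_payload` is hypothetical, and the sketch assumes the parameters are sent as a JSON object, as in the examples elsewhere in this reference. It stops short of sending anything because the endpoint and authentication are not covered in this section.

```python
import json
from urllib.parse import urlparse


def build_payload(url, **options):
    """Assemble a request body from core parameters (hypothetical helper)."""
    # The reference requires an explicit http:// or https:// scheme.
    if urlparse(url).scheme not in ("http", "https"):
        raise ValueError("url must include http:// or https://")
    # Start from the documented defaults, then apply caller overrides.
    payload = {"url": url, "mode": "auto", "delivery": "raw", "content": "html"}
    payload.update(options)
    return payload


payload = build_payload(
    "https://example.com/products",
    mode="browser",
    delivery="json",
    content="markdown",
    excluded_tags=["script", "form"],
    excluded_selectors=[".tracker", "#ads"],
)
print(json.dumps(payload, indent=2))
```

Spelling out the defaults is optional; it only makes the final request easier to read when debugging.
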
Proxy Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| `proxy_type` | string | `residential` | Proxy type: residential or datacenter |
| `proxy_country` | string | `US` | Two-letter ISO country code (e.g., GB, DE, JP) |
| `proxy_session_id` | string | - | Session ID (6-8 characters) for IP persistence |
| `proxy_overwrite` | string | - | Full proxy URL to override default proxies |

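The proxy parameters can be prepared as in the sketch below, which generates a session ID once and reuses it so that consecutive requests keep the same IP. The helper names `new_session_id` and `proxy_options` are hypothetical. The alphanumeric character set follows the validation rules in this reference, and upper-casing the country code is an assumption based on the examples (GB, DE, JP).

```python
import re
import secrets
import string


def new_session_id(length=8):
    """Generate a random alphanumeric session ID (hypothetical helper)."""
    if not 6 <= length <= 8:
        raise ValueError("session ID length must be 6-8 characters")
    alphabet = string.ascii_lowercase + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def proxy_options(country="US", proxy_type="residential", session_id=None):
    """Build the proxy portion of a request body (hypothetical helper)."""
    if proxy_type not in ("residential", "datacenter"):
        raise ValueError("proxy_type must be residential or datacenter")
    if not re.fullmatch(r"[A-Za-z]{2}", country):
        raise ValueError("proxy_country must be a 2-letter ISO code")
    # Assumption: codes are sent upper-case, as in the documented examples.
    options = {"proxy_type": proxy_type, "proxy_country": country.upper()}
    if session_id is not None:
        if not re.fullmatch(r"[A-Za-z0-9]{6,8}", session_id):
            raise ValueError("proxy_session_id must be 6-8 alphanumeric characters")
        options["proxy_session_id"] = session_id
    return options


# Reuse one session ID across related requests to keep the same IP.
session = new_session_id()
first = {"url": "https://example.com/login", **proxy_options("gb", session_id=session)}
second = {"url": "https://example.com/account", **proxy_options("gb", session_id=session)}
```
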
Browser Timing
| Parameter | Type | Default | Range | Description |
|---|---|---|---|---|
| `wait_seconds` | integer | `0` | 0-30 | Seconds to wait after page load |
| `wait_until` | string | `domcontentloaded` | - | Page ready state to wait for (options below) |
| `domcontentloaded` | - | - | - | DOM is ready (recommended). Use for most dynamic sites |
| `load` | - | - | - | All resources loaded. Use for complete pages with images |
| `networkidle` | - | - | - | No network activity. Use for heavy AJAX/API calls |
| `commit` | - | - | - | Navigation committed (fastest). Use for simple static pages |

Example:
    {
      "mode": "browser",
      "wait_until": "networkidle",
      "wait_seconds": 3
    }

Device Emulation
| Parameter | Type | Default | Description |
|---|---|---|---|
| `device` | string | `windows` | Device to emulate (options below) |
| `windows` | - | - | Viewport: 1920×1080, User-Agent: Chrome/Windows, Scrolling: Enabled (5000px), Best For: Desktop sites, dashboards |
| `macos` | - | - | Viewport: 1920×1080, User-Agent: Safari/macOS, Scrolling: Enabled (5000px), Best For: macOS-specific content |
| `android` | - | - | Viewport: 375×667, User-Agent: Chrome/Android, Scrolling: Disabled, Best For: Mobile-first sites, apps |

AI Enhancement
| Parameter | Type | Default | Required When | Description |
|---|---|---|---|---|
| `ai_enhance` | boolean | `false` | - | Enable AI processing |
| `ai_source` | string | - | ai_enhance=true | Source: markdown or screenshot |
| `ai_prompt` | string | - | No | Custom prompt for AI processing |
| `ai_force_json` | boolean | `true` | No | Force JSON output format |

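Because `ai_source` is required whenever `ai_enhance` is true, it can help to build the AI options in one place. The sketch below is an illustration only: `ai_options` is a hypothetical helper, and the prompt text is made up.

```python
def ai_options(source, prompt=None, force_json=True):
    """Build the AI portion of a request body (hypothetical helper)."""
    if source not in ("markdown", "screenshot"):
        raise ValueError("ai_source must be markdown or screenshot")
    # ai_enhance and ai_source always travel together.
    options = {"ai_enhance": True, "ai_source": source, "ai_force_json": force_json}
    if prompt is not None:
        options["ai_prompt"] = prompt
    return options


request_body = {
    "url": "https://example.com/pricing",
    **ai_options("markdown", prompt="List each plan name with its monthly price."),
}
```
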
Custom Headers
| Parameter | Type | Default | Description |
|---|---|---|---|
| `additional_headers` | object | - | Custom HTTP headers to send with the request |

Example:
    {
      "additional_headers": {
        "Authorization": "Bearer token123",
        "Accept-Language": "en-US,en;q=0.9",
        "Custom-Header": "value"
      }
    }

Common use cases: API authentication, language preferences, custom tracking headers.
Visual Capture
| Parameter | Type | Default | Description |
|---|---|---|---|
| `screenshot` | boolean | `false` | Capture full-page screenshot (requires mode=browser) |

Screenshots are:
- Full-page PNG images (not just viewport)
- Stored in Cloudflare R2 for 30 days
- Accessible via `screenshot_uri` in the response
- Cost: +1 credit
PDF Capture
| Parameter | Type | Default | Description |
|---|---|---|---|
| `pdf` | boolean | `false` | Capture full-page PDF document (requires mode=browser) |

PDFs are:
- Full-page vector documents (searchable and scalable)
- Stored in Cloudflare R2 for 30 days
- Accessible via `pdf_uri` in the response
- Cost: +1 credit
Content Formats
html - Raw HTML with absolute URLs
markdown - Clean, readable text format (ideal for AI processing)
screenshot - Full-page PNG image (requires mode=browser)
pdf - Full-page PDF document (requires mode=browser)
Parameter Validation
The API validates all parameters before processing. Common validation errors:
- Invalid URL: Must include `http://` or `https://`
- Invalid mode: Must be `request`, `browser`, or `auto`
- Invalid proxy_country: Must be a 2-letter ISO code
- Invalid proxy_session_id: Must be 6-8 alphanumeric characters
- Capture without browser: `screenshot=true` or `pdf=true` requires `mode=browser`
- AI without source: `ai_enhance=true` requires `ai_source`
- Invalid filter: `excluded_tags` or `excluded_selectors` requires `content=markdown` or `content=html`
Validation errors return 422 Unprocessable Entity with details in the response.
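Checking a request locally before sending it avoids a round trip that would end in a 422. The sketch below mirrors the rules listed above; `validate` is a hypothetical client-side helper, not part of the API, and the server's own checks may be stricter.

```python
import re
from urllib.parse import urlparse


def validate(params):
    """Return a list of problems found in a request body (hypothetical helper)."""
    errors = []
    if urlparse(params.get("url", "")).scheme not in ("http", "https"):
        errors.append("url must include http:// or https://")
    if params.get("mode", "auto") not in ("request", "browser", "auto"):
        errors.append("mode must be request, browser, or auto")
    country = params.get("proxy_country")
    if country is not None and not re.fullmatch(r"[A-Za-z]{2}", country):
        errors.append("proxy_country must be a 2-letter ISO code")
    session = params.get("proxy_session_id")
    if session is not None and not re.fullmatch(r"[A-Za-z0-9]{6,8}", session):
        errors.append("proxy_session_id must be 6-8 alphanumeric characters")
    # Captures need an explicit mode=browser; the default (auto) is not enough.
    if (params.get("screenshot") or params.get("pdf")) and params.get("mode") != "browser":
        errors.append("screenshot/pdf require mode=browser")
    if params.get("ai_enhance") and not params.get("ai_source"):
        errors.append("ai_enhance requires ai_source")
    filters = params.get("excluded_tags") or params.get("excluded_selectors")
    if filters and params.get("content", "html") not in ("html", "markdown"):
        errors.append("excluded_tags/excluded_selectors require content=markdown or html")
    return errors


problems = validate({"url": "example.com", "screenshot": True, "proxy_session_id": "abc"})
for problem in problems:
    print(problem)
```
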