Output Formats

The Scraper API supports several output formats to match your use case: raw HTML for downstream processing, clean Markdown for AI applications, and full-page screenshots for visual analysis.

Content Formats

Control the output format with the content parameter:

HTML (Default)

Raw HTML with absolute URLs, perfect for parsing or archival.

Value: html

curl "https://scrape.evomi.com/api/v1/scraper/realtime?url=https://example.com&content=html&api_key=YOUR_API_KEY"

Characteristics:

Complete HTML document with <!DOCTYPE>, <html>, etc.
All relative URLs converted to absolute
Includes meta tags, scripts, and styles
Original page structure preserved

Best for:

HTML parsing with BeautifulSoup, Cheerio, etc.
Archiving complete pages
Custom processing pipelines
Extracting structured data with CSS selectors
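
Because relative URLs are already converted to absolute, links can be collected from the returned HTML without any base-URL handling. A minimal sketch using only Python's standard library; `LinkCollector` and `sample_html` are hypothetical names, and the sample string stands in for a real response body:

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


# Stand-in for a response body; a real one would come from the API.
sample_html = """<!DOCTYPE html>
<html><body>
<a href="https://example.com/about">About</a>
<a href="https://example.com/contact">Contact</a>
</body></html>"""

collector = LinkCollector()
collector.feed(sample_html)
print(collector.links)
```

For heavier extraction work, a library such as BeautifulSoup offers CSS selectors on top of the same idea.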

Markdown

Clean, readable text format ideal for AI processing and content analysis.

Value: markdown

curl "https://scrape.evomi.com/api/v1/scraper/realtime?url=https://blog.example.com/post&content=markdown&api_key=YOUR_API_KEY"

Characteristics:

Removes boilerplate (headers, footers, navigation)
Preserves structure (headings, lists, tables)
Converts links to markdown format: [text](url)
Strips scripts, styles, and unnecessary elements
Makes URLs absolute
Lightweight and token-efficient

Best for:

LLM/AI input (GPT, Claude, etc.)
Content summarization
Text analysis and NLP
Knowledge base extraction
Article archival

Example output:

# Blog Post Title

Published on January 15, 2025

This is the main content with preserved **formatting** and [links](https://example.com).

## Section Heading

- Bullet point one
- Bullet point two

More content here...
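
Since headings are preserved, Markdown output can be split into sections before it is passed to an LLM or indexed. The following is a rough sketch, not part of the API; `split_by_heading` is a hypothetical helper and the sample text is a shortened version of the example output above:

```python
import re


def split_by_heading(markdown):
    """Split Markdown into (heading, body) pairs at each ATX heading."""
    sections = []
    heading, body = None, []

    def flush():
        text = "\n".join(body).strip()
        # Skip an empty preamble that appears before the first heading.
        if heading is not None or text:
            sections.append((heading, text))

    for line in markdown.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)$", line)
        if match:
            flush()
            heading, body = match.group(2), []
        else:
            body.append(line)
    flush()
    return sections


sample = """# Blog Post Title

Published on January 15, 2025

## Section Heading

- Bullet point one
- Bullet point two
"""

sections = split_by_heading(sample)
for title, text in sections:
    print(title, "->", len(text), "characters")
```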

Screenshot

Full-page PNG image capture.

Value: screenshot

curl -X POST "https://scrape.evomi.com/api/v1/scraper/realtime" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "mode": "browser",
    "content": "screenshot"
  }' \
  -o screenshot.png

Characteristics:

PNG format
Full page (not just viewport)
Captures everything a user would see
Stored in CDN for 30 days
Available as URI and base64 data

Requirements:

Must use mode=browser or mode=auto
content=screenshot automatically sets screenshot=true

Cost: +1 credit (additive)

Best for:

Visual documentation
UI/UX testing
Page monitoring and change detection
Visual AI processing
Compliance and archival
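
The mode requirement can be checked on the client before a request is sent. A small sketch, assuming the JSON body fields shown in the curl example above; `screenshot_payload` is a hypothetical helper, not part of any official SDK:

```python
import json

# Modes that the Requirements list says can produce screenshots.
SCREENSHOT_MODES = {"browser", "auto"}


def screenshot_payload(url, mode="browser", delivery="raw"):
    """Build the JSON body for a screenshot request, rejecting other modes."""
    if mode not in SCREENSHOT_MODES:
        raise ValueError(f"screenshots need mode=browser or mode=auto, got {mode!r}")
    return {"url": url, "mode": mode, "content": "screenshot", "delivery": delivery}


payload = screenshot_payload("https://example.com")
print(json.dumps(payload, indent=2))
```

Failing early on the client avoids sending a request that cannot return a screenshot.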

Delivery Modes

Control how the response is delivered with the delivery parameter:

Raw Delivery (Default)

Returns the content directly, with the appropriate Content-Type header.

Value: raw

curl "https://scrape.evomi.com/api/v1/scraper/realtime?url=https://example.com&delivery=raw&api_key=YOUR_API_KEY"

Response:

Content-Type: text/html; charset=utf-8
X-Credits-Used: 1.0
X-Credits-Remaining: 99.0

<!DOCTYPE html>
<html>
...

Content-Type mapping:

| Content Format | Content-Type |
| --- | --- |
| `html` | `text/html; charset=utf-8` |
| `markdown` | `text/markdown; charset=utf-8` |
| `screenshot` | `image/png` |

Best for:

Direct file saving
Piping to other tools
Minimal overhead
Browser viewing
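
When saving raw responses, the Content-Type header can decide the file extension. A minimal sketch based on the mapping above; `extension_for` is a hypothetical helper, and the `.bin` fallback for unknown types is an assumption rather than documented behavior:

```python
# Mapping taken from the Content-Type table in this section.
EXTENSIONS = {
    "text/html": ".html",
    "text/markdown": ".md",
    "image/png": ".png",
}


def extension_for(content_type):
    """Return a file extension for a Content-Type header value."""
    # Drop parameters such as "; charset=utf-8" before the lookup.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return EXTENSIONS.get(media_type, ".bin")  # ".bin" fallback is an assumption


print(extension_for("text/markdown; charset=utf-8"))
```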

JSON Delivery

Returns structured JSON with metadata and optional content.

Value: json

curl "https://scrape.evomi.com/api/v1/scraper/realtime?url=https://example.com&delivery=json&api_key=YOUR_API_KEY"

Default response (without content):

{
  "success": true,
  "id": "task_abc123",
  "url": "https://example.com",
  "domain": "example.com",
  "title": "Example Domain",
  "meta": {
    "description": "Example page description",
    "keywords": "example, demo"
  },
  "status_code": 200,
  "credits_used": 1.0,
  "credits_remaining": 99.0,
  "mode_used": "auto (request)",
  "hints": ["Content omitted by default. Set include_content=true to include it."]
}

Note: By default, JSON delivery omits the full content to save bandwidth and reduce response size. Add include_content=true to include it.

With content included:

curl "https://scrape.evomi.com/api/v1/scraper/realtime?url=https://example.com&delivery=json&include_content=true&api_key=YOUR_API_KEY"

{
  "success": true,
  "url": "https://example.com",
  "domain": "example.com",
  "title": "Example Domain",
  "content": "<!DOCTYPE html>...",
  "status_code": 200,
  "credits_used": 1.0,
  "credits_remaining": 99.0,
  "mode_used": "auto (request)"
}

Best for:

Structured API responses
Metadata extraction
Credit tracking
Status code checking
Multi-field responses
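
A JSON response can be reduced to the fields a caller usually checks. This sketch assumes the field names shown in the sample responses above; `summarize` is a hypothetical helper, and the inline sample is a shortened stand-in for a real response:

```python
import json


def summarize(response_text):
    """Extract commonly used fields from a JSON delivery response."""
    data = json.loads(response_text)
    return {
        "ok": bool(data.get("success")) and data.get("status_code") == 200,
        "title": data.get("title"),
        "credits_used": data.get("credits_used"),
        # "content" is absent unless include_content=true was set.
        "has_content": "content" in data,
    }


sample = '{"success": true, "title": "Example Domain", "status_code": 200, "credits_used": 1.0}'
summary = summarize(sample)
print(summary)
```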

Format Combinations

| Content | Delivery | Result |
| --- | --- | --- |
| `html` | `raw` | Raw HTML with `text/html` Content-Type |
| `html` | `json` | JSON with HTML in `content` field (if `include_content=true`) |
| `markdown` | `raw` | Raw Markdown with `text/markdown` Content-Type |
| `markdown` | `json` | JSON with Markdown in `content` field (if `include_content=true`) |
| `screenshot` | `raw` | PNG binary with `image/png` Content-Type |
| `screenshot` | `json` | JSON with `screenshot_uri` field |
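
The combinations above map directly onto query parameters. The sketch below builds a GET URL with Python's standard library and percent-encodes the target, which matters when the target URL itself contains `?` or `&`. `build_request_url` is a hypothetical helper; the endpoint and parameter names are copied from the curl examples in this document:

```python
from urllib.parse import urlencode

# Endpoint copied from the curl examples in this document.
ENDPOINT = "https://scrape.evomi.com/api/v1/scraper/realtime"


def build_request_url(target_url, api_key, content="html", delivery="raw",
                      include_content=False):
    """Return a GET URL with every parameter percent-encoded."""
    params = {
        "url": target_url,
        "content": content,
        "delivery": delivery,
        "api_key": api_key,
    }
    # include_content only has an effect with JSON delivery.
    if delivery == "json" and include_content:
        params["include_content"] = "true"
    return f"{ENDPOINT}?{urlencode(params)}"


request_url = build_request_url(
    "https://blog.example.com/post?id=1&ref=x",
    "YOUR_API_KEY",
    content="markdown",
    delivery="json",
    include_content=True,
)
print(request_url)
```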

Use Case Examples

1. Save HTML to File

curl "https://scrape.evomi.com/api/v1/scraper/realtime?url=https://example.com&content=html&delivery=raw&api_key=YOUR_API_KEY" \
  -o page.html

2. Get Metadata Only (No Content)

Perfect for checking pages before full scraping:

curl "https://scrape.evomi.com/api/v1/scraper/realtime?url=https://example.com&delivery=json&api_key=YOUR_API_KEY"

The response includes the title, status code, and credits used but no content, which keeps the response small and saves bandwidth.

3. Extract Clean Text for AI

curl "https://scrape.evomi.com/api/v1/scraper/realtime?url=https://blog.example.com/post&content=markdown&delivery=raw&api_key=YOUR_API_KEY" \
  | python process_with_ai.py

4. Get Screenshot with Metadata

curl -X POST "https://scrape.evomi.com/api/v1/scraper/realtime" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "mode": "browser",
    "screenshot": true,
    "delivery": "json"
  }'

Response:

{
  "success": true,
  "url": "https://example.com",
  "screenshot_uri": "https://scraper-results.evomi.com/screenshots/task_abc123.png",
  "credits_used": 6.0,
  "credits_remaining": 94.0
}

5. Markdown with Full JSON Response

curl -X POST "https://scrape.evomi.com/api/v1/scraper/realtime" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://blog.example.com/post",
    "content": "markdown",
    "delivery": "json",
    "include_content": true
  }'

Response:

{
  "success": true,
  "url": "https://blog.example.com/post",
  "title": "How to Build Scalable APIs",
  "content": "# How to Build Scalable APIs\n\nPublished...",
  "meta": {
    "description": "Learn best practices...",
    "author": "John Doe"
  },
  "credits_used": 1.0
}

Special Response Fields

With Screenshots

When screenshots are captured:

{
  "screenshot_uri": "https://scraper-results.evomi.com/screenshots/task_abc123.png",
  "screenshot_data": "iVBORw0KGgoAAAANSUhEUgAA..."
}

screenshot_uri: Public CDN URL (available for 30 days)
screenshot_data: Base64 encoded PNG data (in JSON responses)
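
The base64 field can be decoded back into PNG bytes and checked before it is written to disk. A minimal sketch, assuming the response has already been parsed into a dictionary; `decode_screenshot` is a hypothetical helper, and the sample payload contains only the 8-byte PNG signature rather than a real image:

```python
import base64

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_screenshot(response):
    """Decode screenshot_data from a parsed JSON response into PNG bytes."""
    raw = base64.b64decode(response["screenshot_data"])
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data is not a PNG image")
    return raw


# Tiny stand-in payload: just the PNG signature, base64 encoded.
sample = {"screenshot_data": base64.b64encode(PNG_SIGNATURE).decode("ascii")}
png_bytes = decode_screenshot(sample)
print(len(png_bytes), "bytes decoded")
```

Writing `png_bytes` to a file opened in binary mode (`"wb"`) then yields a regular `.png` file.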

With AI Enhancement

When AI processing is enabled:

{
  "ai_response": {
    "summary": "This article explains...",
    "key_points": ["Point 1", "Point 2"],
    "headline": "Main Title"
  }
}

See AI Enhancement for details.

With JavaScript Execution

When JavaScript is executed:

{
  "js_result": "Return value from execute_js",
  "js_instructions_result": [
    {"action": "click", "selector": "#button", "status": "success"}
  ]
}
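
The per-step entries make it possible to detect instructions that did not succeed. A rough sketch, assuming each entry carries a `status` field as in the sample above; `failed_steps` is a hypothetical helper, and the `"failed"` status value is an assumption used only for illustration:

```python
def failed_steps(response):
    """Return the js_instructions_result entries whose status is not 'success'."""
    steps = response.get("js_instructions_result", [])
    return [step for step in steps if step.get("status") != "success"]


sample = {
    "js_result": "Return value from execute_js",
    "js_instructions_result": [
        {"action": "click", "selector": "#button", "status": "success"},
        # "failed" is an assumed status value, not taken from the documentation.
        {"action": "click", "selector": "#missing", "status": "failed"},
    ],
}

failures = failed_steps(sample)
print(failures)
```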

Best Practices

Choose the Right Format

Use HTML when:

Parsing with DOM libraries (BeautifulSoup, Cheerio)
Need exact page structure
Building web archives
Custom processing required

Use Markdown when:

Feeding content to AI/LLMs
Text analysis or NLP
Content summarization
Human-readable archives

Use Screenshots when:

Visual documentation needed
UI/UX testing
Compliance requirements
Visual AI processing

Optimize Bandwidth

For metadata checks:

{"delivery": "json"}

Omits content, returns only metadata.

For full responses:

{"delivery": "json", "include_content": true}

Includes everything.

For direct content:

{"delivery": "raw"}

No JSON wrapper, just content.

Save Files Correctly

HTML:

curl "..." -o page.html

Markdown:

curl "..." -o article.md

Screenshot:

curl "..." -o screenshot.png

Warning: Screenshots are stored for 30 days. Download and store them locally if you need longer retention.
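
To plan downloads around the retention window, the expiry time can be estimated from the capture time. This sketch assumes the 30 days are counted from the moment of capture, which this page does not state explicitly; `expires_at` is a hypothetical helper:

```python
from datetime import datetime, timedelta, timezone

# Retention period stated in this document.
RETENTION = timedelta(days=30)


def expires_at(captured_at):
    """Estimate when a stored screenshot stops being available."""
    # Assumption: the retention window starts at capture time.
    return captured_at + RETENTION


captured = datetime(2025, 1, 15, tzinfo=timezone.utc)
print(expires_at(captured).date())
```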
