Output Formats

The Scraper API supports three output formats: raw HTML for downstream processing, clean Markdown for AI applications, and full-page screenshots for visual analysis.

Content Formats

Control the output format with the content parameter:

HTML (Default)

Raw HTML with absolute URLs, perfect for parsing or archival.

Value: html

curl "https://scrape.evomi.com/api/v1/scraper/realtime?url=https://example.com&content=html&api_key=YOUR_API_KEY"
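The curl example passes the target URL unencoded, which works for a simple address like https://example.com. A target URL that carries its own query string (for example ?q=shoes&page=2) would have its & and = characters read as part of the outer request, so it is safer to percent-encode every parameter. The following is a minimal sketch using only the Python standard library; the helper name build_request_url is hypothetical, while the endpoint and parameter names are taken from the example above.

```python
from urllib.parse import urlencode

# Endpoint copied from the curl examples in this document.
ENDPOINT = "https://scrape.evomi.com/api/v1/scraper/realtime"


def build_request_url(target_url: str, api_key: str, content: str = "html") -> str:
    """Return a GET URL with every parameter percent-encoded (hypothetical helper)."""
    params = {"url": target_url, "content": content, "api_key": api_key}
    # urlencode escapes ':', '/', '?', '=' and '&' inside values, so the
    # target URL's own query string cannot leak into the outer one.
    return f"{ENDPOINT}?{urlencode(params)}"


print(build_request_url("https://example.com/search?q=shoes&page=2", "YOUR_API_KEY"))
```

The printed URL can be passed to curl or any HTTP client in place of the hand-written one.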

Characteristics:

  • Complete HTML document with <!DOCTYPE>, <html>, etc.
  • All relative URLs converted to absolute
  • Includes meta tags, scripts, and styles
  • Original page structure preserved

Best for:

  • HTML parsing with BeautifulSoup, Cheerio, etc.
  • Archiving complete pages
  • Custom processing pipelines
  • Extracting structured data with CSS selectors
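Because relative URLs are already converted to absolute, links pulled from the returned HTML can be used without a further join against the page address. As a rough illustration, the sketch below collects href values with the standard library's html.parser rather than BeautifulSoup or Cheerio; the sample string is a stand-in for a real response body, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:  # skip anchors that have no href attribute
                self.links.append(href)


def extract_links(html: str) -> list[str]:
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Stand-in for a response body; a real one would come from the API.
sample = (
    '<!DOCTYPE html><html><body>'
    '<a href="https://example.com/about">About</a>'
    '<a name="top">no href</a>'
    '</body></html>'
)
print(extract_links(sample))
```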

Markdown

Clean, readable text format ideal for AI processing and content analysis.

Value: markdown

curl "https://scrape.evomi.com/api/v1/scraper/realtime?url=https://blog.example.com/post&content=markdown&api_key=YOUR_API_KEY"

Characteristics:

  • Removes boilerplate (headers, footers, navigation)
  • Preserves structure (headings, lists, tables)
  • Converts links to markdown format: [text](url)
  • Strips scripts, styles, and unnecessary elements
  • Makes URLs absolute
  • Lightweight and token-efficient

Best for:

  • LLM/AI input (GPT, Claude, etc.)
  • Content summarization
  • Text analysis and NLP
  • Knowledge base extraction
  • Article archival

Example output:

# Blog Post Title

Published on January 15, 2025

This is the main content with preserved **formatting** and [links](https://example.com).

## Section Heading

- Bullet point one
- Bullet point two

More content here...
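Since headings are preserved, Markdown output can be split into sections before it is passed to an LLM or indexed in a knowledge base. The sketch below is one possible approach, not part of the API: it splits on ATX headings (# through ######) and would misread a line starting with # inside a fenced code block as a heading. The function name split_sections is hypothetical.

```python
import re

# Matches ATX headings such as "# Title" or "## Section".
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_sections(markdown: str) -> list[tuple[str, str]]:
    """Split Markdown into (heading, body) pairs.

    Text that appears before the first heading gets an empty heading.
    """
    sections: list[tuple[str, str]] = []
    heading, body = "", []
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            # Close the previous section if it has a heading or any text.
            if heading or any(part.strip() for part in body):
                sections.append((heading, "\n".join(body).strip()))
            heading, body = match.group(2).strip(), []
        else:
            body.append(line)
    if heading or any(part.strip() for part in body):
        sections.append((heading, "\n".join(body).strip()))
    return sections


sample = """# Blog Post Title

Published on January 15, 2025

## Section Heading

- Bullet point one
- Bullet point two
"""
for title, text in split_sections(sample):
    print(repr(title), repr(text))
```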

Screenshot

Full-page PNG image capture.

Value: screenshot

curl -X POST "https://scrape.evomi.com/api/v1/scraper/realtime" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "mode": "browser",
    "content": "screenshot"
  }' \
  -o screenshot.png

Characteristics:

  • PNG format
  • Full page (not just viewport)
  • Captures everything a user would see
  • Stored in CDN for 30 days
  • Available as URI and base64 data

Requirements:

  • Must use mode=browser or mode=auto
  • Setting content=screenshot automatically sets screenshot=true
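A client can check the mode requirement before sending the request. The following sketch builds the same JSON body as the curl example and rejects any mode other than browser or auto; the helper name screenshot_payload is hypothetical, and the check mirrors only what is stated in the requirements above.

```python
import json

# Modes this document lists as able to capture screenshots.
SCREENSHOT_MODES = {"browser", "auto"}


def screenshot_payload(url: str, mode: str = "browser") -> str:
    """Return the JSON body for a screenshot request (hypothetical helper)."""
    if mode not in SCREENSHOT_MODES:
        raise ValueError(f"screenshots need mode 'browser' or 'auto', got {mode!r}")
    return json.dumps({"url": url, "mode": mode, "content": "screenshot"})


print(screenshot_payload("https://example.com"))
```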

Cost: +1 credit, added on top of the base cost of the request

Best for:

  • Visual documentation
  • UI/UX testing
  • Page monitoring and change detection
  • Visual AI processing
  • Compliance and archival

Delivery Modes

Control how the response is delivered with the delivery parameter:

Raw Delivery (Default)

Returns the content directly with the appropriate Content-Type header.

Value: raw

curl "https://scrape.evomi.com/api/v1/scraper/realtime?url=https://example.com&delivery=raw&api_key=YOUR_API_KEY"

Response:

Content-Type: text/html; charset=utf-8
X-Credits-Used: 1.0
X-Credits-Remaining: 99.0

<!DOCTYPE html>
<html>
...

Content-Type mapping:

| Content Format | Content-Type |
|---|---|
| html | text/html; charset=utf-8 |
| markdown | text/markdown; charset=utf-8 |
| screenshot | image/png |
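When saving raw responses, the Content-Type header can decide the file extension. The sketch below encodes the mapping above as a lookup; the function name extension_for and the .bin fallback for unlisted types are assumptions made for illustration.

```python
# Mapping taken from the Content-Type table in this document.
EXTENSIONS = {
    "text/html": ".html",
    "text/markdown": ".md",
    "image/png": ".png",
}


def extension_for(content_type: str) -> str:
    """Map a Content-Type header value to a file extension."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    # ".bin" is an arbitrary fallback for types not listed above.
    return EXTENSIONS.get(media_type, ".bin")


print(extension_for("text/markdown; charset=utf-8"))
```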

Best for:

  • Direct file saving
  • Piping to other tools
  • Minimal overhead
  • Browser viewing
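With raw delivery, credit usage is only available in the X-Credits-Used and X-Credits-Remaining headers shown in the response above. A small sketch of reading them from a header mapping follows; the function name is hypothetical, and the header dictionary stands in for whatever the HTTP client returns.

```python
def credits_from_headers(headers: dict[str, str]) -> tuple[float, float]:
    """Return (used, remaining) from the X-Credits-* response headers."""
    # HTTP header names are case-insensitive, so compare in lower case.
    lowered = {name.lower(): value for name, value in headers.items()}
    return float(lowered["x-credits-used"]), float(lowered["x-credits-remaining"])


sample_headers = {
    "Content-Type": "text/html; charset=utf-8",
    "X-Credits-Used": "1.0",
    "X-Credits-Remaining": "99.0",
}
print(credits_from_headers(sample_headers))
```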

JSON Delivery

Returns structured JSON with metadata and optional content.

Value: json

curl "https://scrape.evomi.com/api/v1/scraper/realtime?url=https://example.com&delivery=json&api_key=YOUR_API_KEY"

Default response (without content):

{
  "success": true,
  "id": "task_abc123",
  "url": "https://example.com",
  "domain": "example.com",
  "title": "Example Domain",
  "meta": {
    "description": "Example page description",
    "keywords": "example, demo"
  },
  "status_code": 200,
  "credits_used": 1.0,
  "credits_remaining": 99.0,
  "mode_used": "auto (request)",
  "hints": ["Content omitted by default. Set include_content=true to include it."]
}

Note: By default, JSON delivery omits the full content to save bandwidth. Add include_content=true to include it.

With content included:

curl "https://scrape.evomi.com/api/v1/scraper/realtime?url=https://example.com&delivery=json&include_content=true&api_key=YOUR_API_KEY"

Response:

{
  "success": true,
  "url": "https://example.com",
  "domain": "example.com",
  "title": "Example Domain",
  "content": "<!DOCTYPE html>...",
  "status_code": 200,
  "credits_used": 1.0,
  "credits_remaining": 99.0,
  "mode_used": "auto (request)"
}
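Client code usually has to handle both shapes of the JSON response: one with a content field and one without. The sketch below returns None when the content was omitted. It assumes that a failed scrape reports "success": false, which this document does not show, and the function name content_from_json is hypothetical.

```python
import json
from typing import Optional


def content_from_json(body: str) -> Optional[str]:
    """Return the scraped content, or None when the response omitted it."""
    data = json.loads(body)
    # Assumption: a failed scrape reports "success": false.
    if not data.get("success"):
        raise RuntimeError(f"scrape failed with status {data.get('status_code')}")
    # "content" is only present when include_content=true was sent.
    return data.get("content")


# Trimmed versions of the two responses shown above.
without = '{"success": true, "status_code": 200, "hints": ["Content omitted by default."]}'
with_content = '{"success": true, "status_code": 200, "content": "<!DOCTYPE html>..."}'
print(content_from_json(without))
print(content_from_json(with_content))
```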

Best for:

  • Structured API responses
  • Metadata extraction
  • Credit tracking
  • Status code checking
  • Multi-field responses

Format Combinations

| Content | Delivery | Result |
|---|---|---|
| html | raw | Raw HTML with text/html Content-Type |
| html | json | JSON with HTML in content field (if include_content=true) |
| markdown | raw | Raw Markdown with text/markdown Content-Type |
| markdown | json | JSON with Markdown in content field (if include_content=true) |
| screenshot | raw | PNG binary with image/png Content-Type |
| screenshot | json | JSON with screenshot_uri field |
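The combinations above reduce to a simple rule: raw delivery puts the payload in the response body, while JSON delivery puts it in a named field. A sketch of that rule as a lookup function follows; the function name and the short return labels ("body", "content", "screenshot_uri") are chosen here for illustration.

```python
def payload_location(content: str, delivery: str) -> str:
    """Say where the scraped payload ends up for a content/delivery pair."""
    if content not in ("html", "markdown", "screenshot"):
        raise ValueError(f"unknown content format: {content!r}")
    if delivery == "raw":
        return "body"  # the HTTP response body itself
    if delivery == "json":
        # Screenshots are referenced by URI; text formats go in "content"
        # (and only when include_content=true was sent).
        return "screenshot_uri" if content == "screenshot" else "content"
    raise ValueError(f"unknown delivery mode: {delivery!r}")


print(payload_location("markdown", "json"))
```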

Use Case Examples

1. Save HTML to File

curl "https://scrape.evomi.com/api/v1/scraper/realtime?url=https://example.com&content=html&delivery=raw&api_key=YOUR_API_KEY" \
  -o page.html

2. Get Metadata Only (No Content)

Perfect for checking pages before full scraping:

curl "https://scrape.evomi.com/api/v1/scraper/realtime?url=https://example.com&delivery=json&api_key=YOUR_API_KEY"

Returns the title, status code, and credits used, but no content, which saves bandwidth and money.

3. Extract Clean Text for AI

curl "https://scrape.evomi.com/api/v1/scraper/realtime?url=https://blog.example.com/post&content=markdown&delivery=raw&api_key=YOUR_API_KEY" \
  | python process_with_ai.py
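The contents of process_with_ai.py are not specified here. As one possible starting point, the sketch below reads Markdown from a text stream and keeps whole paragraphs up to a character budget, which is a common step before sending text to a model. The function name, the 8000-character default, and the paragraph-based trimming are all assumptions; in the pipeline above the stream would be sys.stdin.

```python
import io
from typing import TextIO


def read_markdown(stream: TextIO, max_chars: int = 8000) -> str:
    """Read Markdown from a stream and keep whole paragraphs up to max_chars."""
    paragraphs = stream.read().split("\n\n")
    kept: list[str] = []
    used = 0
    for paragraph in paragraphs:
        # Count the blank-line separator for every paragraph after the first.
        cost = len(paragraph) + (2 if kept else 0)
        if used + cost > max_chars:
            break
        kept.append(paragraph)
        used += cost
    return "\n\n".join(kept)


# In the real pipeline this would be read_markdown(sys.stdin).
demo = io.StringIO("# Title\n\nFirst paragraph.\n\nSecond paragraph.")
print(read_markdown(demo, max_chars=30))
```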

4. Get Screenshot with Metadata

curl -X POST "https://scrape.evomi.com/api/v1/scraper/realtime" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "mode": "browser",
    "screenshot": true,
    "delivery": "json"
  }'

Response:

{
  "success": true,
  "url": "https://example.com",
  "screenshot_uri": "https://scraper-results.evomi.com/screenshots/task_abc123.png",
  "credits_used": 6.0,
  "credits_remaining": 94.0
}

5. Markdown with Full JSON Response

curl -X POST "https://scrape.evomi.com/api/v1/scraper/realtime" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://blog.example.com/post",
    "content": "markdown",
    "delivery": "json",
    "include_content": true
  }'

Response:

{
  "success": true,
  "url": "https://blog.example.com/post",
  "title": "How to Build Scalable APIs",
  "content": "# How to Build Scalable APIs\n\nPublished...",
  "meta": {
    "description": "Learn best practices...",
    "author": "John Doe"
  },
  "credits_used": 1.0
}

Special Response Fields

With Screenshots

When screenshots are captured:

{
  "screenshot_uri": "https://scraper-results.evomi.com/screenshots/task_abc123.png",
  "screenshot_data": "iVBORw0KGgoAAAANSUhEUgAA..."
}

  • screenshot_uri: Public CDN URL (available for 30 days)
  • screenshot_data: Base64-encoded PNG data (in JSON responses)
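To turn screenshot_data back into an image file, decode the base64 string and write the bytes to disk. The sketch below also checks the eight-byte PNG signature so that a truncated or non-image value is caught early. The function name decode_screenshot is hypothetical, and the sample response is built locally rather than taken from the API.

```python
import base64

# First eight bytes of every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_screenshot(response: dict) -> bytes:
    """Decode screenshot_data and confirm the result looks like a PNG."""
    raw = base64.b64decode(response["screenshot_data"])
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("screenshot_data did not decode to a PNG")
    return raw


# Locally built stand-in for a JSON response that includes screenshot_data.
fake_png = PNG_SIGNATURE + b"placeholder image bytes"
sample = {"screenshot_data": base64.b64encode(fake_png).decode("ascii")}
print(len(decode_screenshot(sample)))
```

The returned bytes can be written with pathlib.Path("screenshot.png").write_bytes(...).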

With AI Enhancement

When AI processing is enabled:

{
  "ai_response": {
    "summary": "This article explains...",
    "key_points": ["Point 1", "Point 2"],
    "headline": "Main Title"
  }
}

See AI Enhancement for details.

With JavaScript Execution

When JavaScript is executed:

{
  "js_result": "Return value from execute_js",
  "js_instructions_result": [
    {"action": "click", "selector": "#button", "status": "success"}
  ]
}
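Each entry in js_instructions_result carries a status, so a client can check that every step ran before trusting the scraped content. The sketch below treats any status other than "success" as a failure. Only "success" appears in this document, so the "error" value in the sample and the function name failed_steps are hypothetical.

```python
def failed_steps(response: dict) -> list[dict]:
    """Return the js_instructions_result entries whose status is not 'success'."""
    steps = response.get("js_instructions_result", [])
    return [step for step in steps if step.get("status") != "success"]


sample = {
    "js_result": "Return value from execute_js",
    "js_instructions_result": [
        {"action": "click", "selector": "#button", "status": "success"},
        # Hypothetical failure value; the real API may use a different one.
        {"action": "click", "selector": "#missing", "status": "error"},
    ],
}
print(failed_steps(sample))
```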

Best Practices

Choose the Right Format

Use HTML when:

  • Parsing with DOM libraries (BeautifulSoup, Cheerio)
  • Need exact page structure
  • Building web archives
  • Custom processing required

Use Markdown when:

  • Feeding content to AI/LLMs
  • Text analysis or NLP
  • Content summarization
  • Human-readable archives

Use Screenshots when:

  • Visual documentation needed
  • UI/UX testing
  • Compliance requirements
  • Visual AI processing

Optimize Bandwidth

For metadata checks:

{"delivery": "json"}

Omits content, returns only metadata.

For full responses:

{"delivery": "json", "include_content": true}

Includes everything.

For direct content:

{"delivery": "raw"}

No JSON wrapper, just content.

Save Files Correctly

HTML:

curl "..." -o page.html

Markdown:

curl "..." -o article.md

Screenshot:

curl "..." -o screenshot.png

Warning: Screenshots are stored for 30 days. Download and store them locally if you need longer retention.
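To plan downloads around the retention window, a client can compute when a stored screenshot is expected to expire. The sketch below assumes the 30 days are counted from the capture time, which this document does not state explicitly; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Retention period stated in this document.
RETENTION = timedelta(days=30)


def expires_at(captured_at: datetime) -> datetime:
    """Return the expected expiry, assuming retention starts at capture time."""
    return captured_at + RETENTION


def days_left(captured_at: datetime, now: datetime) -> int:
    """Whole days remaining before expiry; negative once it has passed."""
    return (expires_at(captured_at) - now).days


captured = datetime(2025, 1, 15, tzinfo=timezone.utc)
print(expires_at(captured).date())
```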