AI Enhancement

Transform raw web content into structured insights with built-in AI processing. Extract summaries, generate structured data, and analyze content, all in a single API call.

AI Model: Google Gemini 2.0 Flash
Cost: +30 credits (additive to base cost)

Quick Start

Enable AI processing by setting ai_enhance=true and specifying a source:

curl -X POST "https://scrape.evomi.com/api/v1/scraper/realtime" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://news.example.com/article",
    "ai_enhance": true,
    "ai_source": "markdown",
    "ai_prompt": "Extract headline, author, date, and create a 2-sentence summary as JSON"
  }'
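
For callers working in Python, the curl call above could be mirrored roughly as follows. This is a minimal sketch using only the standard library; the helper name build_ai_request and the 30-second timeout are illustrative assumptions, not part of the API. The request is built but not sent, so the snippet runs without network access.

```python
import json
import urllib.request

API_URL = "https://scrape.evomi.com/api/v1/scraper/realtime"


def build_ai_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


payload = {
    "url": "https://news.example.com/article",
    "ai_enhance": True,
    "ai_source": "markdown",
    "ai_prompt": "Extract headline, author, date, and create a 2-sentence summary as JSON",
}
request = build_ai_request("YOUR_API_KEY", payload)

# To actually send it (the timeout value is an assumption):
# with urllib.request.urlopen(request, timeout=30) as response:
#     result = json.load(response)
```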

Parameters

Parameter	Required	Default	Description
`ai_enhance`	Yes	`false`	Enable AI processing
`ai_source`	Yes*	-	Source for AI: `markdown` or `screenshot`
`ai_prompt`	No	-	Custom instructions for AI
`ai_force_json`	No	`true`	Force JSON output format

*Required when ai_enhance=true
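
The rules in the table can be checked on the client before a request is sent, which avoids spending a round trip on an invalid payload. The sketch below is one possible pre-flight check; check_ai_params is a hypothetical helper, and the server remains the authority on what is accepted.

```python
VALID_AI_SOURCES = {"markdown", "screenshot"}


def check_ai_params(payload: dict) -> list:
    """Return a list of problems found in the AI-related fields of a payload."""
    problems = []
    if not payload.get("ai_enhance", False):
        return problems  # ai_enhance defaults to false, so nothing to check
    source = payload.get("ai_source")
    if source is None:
        problems.append("ai_source is required when ai_enhance is true")
    elif source not in VALID_AI_SOURCES:
        problems.append("ai_source must be 'markdown' or 'screenshot'")
    # ai_force_json is optional and defaults to true
    if not isinstance(payload.get("ai_force_json", True), bool):
        problems.append("ai_force_json must be a boolean")
    return problems


print(check_ai_params({"url": "https://example.com", "ai_enhance": True}))
```

An empty list means the AI fields look consistent with the table above.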

AI Sources

Markdown Source

Process the text content of the page.

Value: markdown

{
  "url": "https://blog.example.com/post",
  "ai_enhance": true,
  "ai_source": "markdown"
}

Best for:

Article summarization
Data extraction from text
Content classification
Information extraction
Structured data generation
Q&A from documentation

Characteristics:

Processes clean text (no HTML boilerplate)
Token-efficient
Fast processing
Ideal for text-heavy content

Screenshot Source

Process the visual appearance of the page using vision AI.

Value: screenshot

Requirements:

Must set screenshot=true
Requires mode=browser or mode=auto

{
  "url": "https://example.com",
  "mode": "browser",
  "screenshot": true,
  "ai_enhance": true,
  "ai_source": "screenshot"
}
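
Because the screenshot source has two prerequisites, it can help to build the payload through a small function that sets them together. The helper below is a hypothetical sketch (screenshot_ai_payload is not an API name); it only assembles the same fields shown in the example above.

```python
from typing import Optional


def screenshot_ai_payload(url: str, prompt: Optional[str] = None,
                          mode: str = "browser") -> dict:
    """Assemble a payload that satisfies the screenshot-source requirements."""
    if mode not in ("browser", "auto"):
        raise ValueError("screenshot source requires mode 'browser' or 'auto'")
    payload = {
        "url": url,
        "mode": mode,
        "screenshot": True,   # required for ai_source=screenshot
        "ai_enhance": True,
        "ai_source": "screenshot",
    }
    if prompt is not None:
        payload["ai_prompt"] = prompt
    return payload


payload = screenshot_ai_payload("https://example.com")
```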

Best for:

Layout analysis
Visual content extraction
Image-based content
UI element detection
Design analysis
Charts and graphs

Characteristics:

Processes visual representation
Can extract text from images
Understands layout and design
Higher processing cost

Custom Prompts

Guide AI processing with custom instructions via ai_prompt:

Structured Data Extraction

{
  "ai_prompt": "Extract the article headline, author, publication date, and create a 2-sentence summary. Return as JSON with keys: headline, author, date, summary."
}

Example Response:

{
  "ai_response": {
    "headline": "The Future of Web Scraping",
    "author": "Jane Doe",
    "date": "2025-01-15",
    "summary": "Web scraping is evolving with AI integration. Modern APIs now offer built-in intelligence for data extraction."
  }
}
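
When the prompt names the expected keys, the same key list can be reused to check the response. The following sketch assumes two hypothetical helpers, extraction_prompt and missing_keys; neither is part of the API.

```python
def extraction_prompt(task: str, keys: list) -> str:
    """Append an explicit key list to a task description."""
    return f"{task.rstrip('. ')}. Return as JSON with keys: {', '.join(keys)}."


def missing_keys(ai_response: dict, keys: list) -> list:
    """List the requested keys that the AI response does not contain."""
    return [key for key in keys if key not in ai_response]


KEYS = ["headline", "author", "date", "summary"]
prompt = extraction_prompt(
    "Extract the article headline, author, publication date, "
    "and create a 2-sentence summary",
    KEYS,
)

# Response shaped like the example above
ai_response = {
    "headline": "The Future of Web Scraping",
    "author": "Jane Doe",
    "date": "2025-01-15",
    "summary": "Web scraping is evolving with AI integration.",
}
print(missing_keys(ai_response, KEYS))
```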

Bullet Point Summary

{
  "ai_prompt": "Summarize the main points in 5 concise bullet points."
}

Example Response:

{
  "ai_response": {
    "summary": [
      "Web scraping is becoming more intelligent",
      "AI helps extract structured data automatically",
      "Modern APIs combine scraping and AI processing",
      "Costs are predictable with transparent pricing",
      "Integration is simple with REST APIs"
    ]
  }
}

Contact Information

{
  "ai_prompt": "Extract all contact information including email, phone, address, and social media links. Return as JSON."
}

Product Data

{
  "ai_prompt": "Extract all product names, prices, and availability status. Return as a JSON array with fields: name, price, available."
}

Visual Analysis (Screenshot Source)

{
  "ai_source": "screenshot",
  "ai_prompt": "Describe the page layout, identify main sections, and list all call-to-action buttons with their text and position."
}

Default Behavior

Without a custom prompt, the AI returns a default structure that depends on the source:

Default for Markdown

{
  "title": "Page Title",
  "description": "Brief description",
  "summary": "Concise content summary",
  "key_points": [
    "Main point 1",
    "Main point 2",
    "Main point 3"
  ]
}

Default for Screenshot

{
  "layout_description": "Visual layout description",
  "visible_text": ["Text element 1", "Text element 2"],
  "ui_elements": ["Button", "Form", "Navigation"],
  "call_to_action": ["Sign Up", "Learn More"]
}

Response Formats

With delivery=raw (Default)

Returns ONLY the AI response:

curl -X POST "https://scrape.evomi.com/api/v1/scraper/realtime" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "ai_enhance": true,
    "ai_source": "markdown",
    "delivery": "raw"
  }'

Response:

{
  "title": "Example Domain",
  "summary": "This domain is for use in illustrative examples...",
  "key_points": ["Used for examples", "No prior coordination needed"]
}

Headers:

Content-Type: application/json; charset=utf-8
X-Response-Source: ai-raw-mode
X-Credits-Used: 15.0

With delivery=json

Returns AI response in ai_response field:

curl -X POST "https://scrape.evomi.com/api/v1/scraper/realtime" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "ai_enhance": true,
    "ai_source": "markdown",
    "delivery": "json"
  }'

Response:

{
  "success": true,
  "url": "https://example.com",
  "ai_response": {
    "title": "Example Domain",
    "summary": "This domain is for use in illustrative examples...",
    "key_points": ["Used for examples", "No prior coordination needed"]
  },
  "credits_used": 15.0,
  "credits_remaining": 85.0,
  "mode_used": "browser"
}
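
Since the two delivery modes wrap the AI output differently, client code may want one accessor that handles both. This is a sketch based only on the two response shapes shown above; extract_ai_result is a hypothetical helper name.

```python
def extract_ai_result(body: dict, delivery: str = "raw") -> dict:
    """Return the AI output from a parsed response body.

    delivery="raw":  the body itself is the AI response.
    delivery="json": the AI response sits under the "ai_response" key.
    """
    if delivery == "json":
        if not body.get("success", False):
            raise ValueError(body.get("message", "request was not successful"))
        return body["ai_response"]
    return body


raw_body = {"title": "Example Domain", "key_points": ["Used for examples"]}
json_body = {
    "success": True,
    "url": "https://example.com",
    "ai_response": {"title": "Example Domain", "key_points": ["Used for examples"]},
    "credits_used": 15.0,
}

raw_result = extract_ai_result(raw_body, delivery="raw")
json_result = extract_ai_result(json_body, delivery="json")
```

Both calls yield the same dictionary, so downstream code does not need to know which delivery mode was used.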

Force JSON Output

By default, ai_force_json=true instructs AI to return valid JSON.

For free-form text responses:

{
  "ai_enhance": true,
  "ai_source": "markdown",
  "ai_force_json": false,
  "ai_prompt": "Write a creative short story inspired by this content."
}

Use Cases

1. News Article Extraction

{
  "url": "https://news.example.com/article/123",
  "ai_enhance": true,
  "ai_source": "markdown",
  "ai_prompt": "Extract: headline, author, date, category, tags, and 2-sentence summary. Return as JSON."
}

2. E-commerce Product Scraping

{
  "url": "https://shop.example.com/product/456",
  "ai_enhance": true,
  "ai_source": "markdown",
  "ai_prompt": "Extract product name, price, description, specifications, and availability. Return as structured JSON."
}

3. Job Posting Analysis

{
  "url": "https://jobs.example.com/posting/789",
  "ai_enhance": true,
  "ai_source": "markdown",
  "ai_prompt": "Extract: job title, company, location, salary range, requirements, and responsibilities as JSON."
}

4. Business Directory

{
  "url": "https://directory.example.com/business/abc",
  "ai_enhance": true,
  "ai_source": "markdown",
  "ai_prompt": "Extract business name, address, phone, email, website, hours, and services as JSON."
}

5. Documentation Summary

{
  "url": "https://docs.example.com/guide",
  "ai_enhance": true,
  "ai_source": "markdown",
  "ai_prompt": "Create a concise summary of this documentation page with 5 key takeaways."
}

6. Visual Layout Analysis

{
  "url": "https://example.com",
  "mode": "browser",
  "screenshot": true,
  "ai_enhance": true,
  "ai_source": "screenshot",
  "ai_prompt": "Analyze the page layout. Identify header, navigation, main content area, sidebar, and footer. List all visible call-to-action buttons."
}

Cost Calculation

AI enhancement adds 30 credits to your base cost:

Configuration	Base Cost	AI Cost	Total
Request + Datacenter + AI	1	+10	11 credits
Request + Residential + AI	2	+10	12 credits
Browser + Residential + AI	5	+10	15 credits
Auto + Residential + AI (request succeeds)	2	+10	12 credits
Auto + Residential + AI (upgraded to browser)	6	+10	16 credits

With Screenshot:

Configuration	Base + Screenshot	AI Cost	Total
Browser + Residential + Screenshot + AI	5 + 1	+10	16 credits

⚠️ Mode Restriction: Browser and auto modes require premium (residential) proxies. Datacenter proxies can only be used with mode="request".
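
The arithmetic in the tables can be sketched as a small function. The base costs and the screenshot surcharge below are copied from the table rows; the AI surcharge is passed in as an argument rather than hard-coded, and the example calls use the +10 shown in those rows. Auto mode is left out because its base cost depends on whether the request is upgraded to browser. The function name total_credits is a hypothetical helper.

```python
# Base costs taken from the table rows above: (mode, proxy) -> credits
BASE_COST = {
    ("request", "datacenter"): 1,
    ("request", "residential"): 2,
    ("browser", "residential"): 5,
}
SCREENSHOT_COST = 1


def total_credits(mode: str, proxy: str, ai_cost: int, screenshot: bool = False) -> int:
    """Add the AI surcharge (and optional screenshot cost) to the base cost."""
    if proxy == "datacenter" and mode != "request":
        raise ValueError("datacenter proxies can only be used with mode='request'")
    base = BASE_COST.get((mode, proxy))
    if base is None:
        raise ValueError(f"no base cost listed for mode={mode!r}, proxy={proxy!r}")
    return base + ai_cost + (SCREENSHOT_COST if screenshot else 0)


print(total_credits("browser", "residential", ai_cost=10))                    # 15
print(total_credits("browser", "residential", ai_cost=10, screenshot=True))   # 16
```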

Best Practices

1. Be Specific in Prompts

Bad:

{"ai_prompt": "Extract data"}

Good:

{"ai_prompt": "Extract product name, price in USD, and availability status as JSON with keys: name, price, available"}

2. Request Structured Output

Always specify the output format you want:

{"ai_prompt": "Extract contact info and return as JSON with fields: email, phone, address"}

3. Use Markdown for Text Content

For articles, blogs, and text-heavy pages:

{
  "content": "markdown",
  "ai_source": "markdown"
}

This gives AI clean, token-efficient text.

4. Use Screenshots for Visual Content

For image-heavy pages, dashboards, or UI analysis:

{
  "screenshot": true,
  "ai_source": "screenshot"
}

5. Combine with Other Features

With JavaScript automation:

{
  "mode": "browser",
  "js_instructions": [
    {"click": ".show-full-content"}
  ],
  "ai_enhance": true,
  "ai_source": "markdown"
}

With session management:

{
  "proxy_session_id": "sess123",
  "ai_enhance": true,
  "ai_source": "markdown"
}

6. Test Prompts Iteratively

Start simple and refine:

// Test 1
{"ai_prompt": "Summarize this article"}

// Test 2
{"ai_prompt": "Summarize in 3 bullet points"}

// Test 3
{"ai_prompt": "Summarize in 3 bullet points as JSON array"}

7. Monitor Token Usage

Longer content takes longer to process. For very long pages, consider extracting specific sections first.

Limitations

Content Length

Very long pages may be truncated. For optimal results, keep source content under 50,000 characters.

Language Support

Gemini 2.0 Flash supports multiple languages, but English prompts generally yield the best results.

JSON Forcing

While ai_force_json=true encourages JSON output, the AI may occasionally return plain text if that is more appropriate for the prompt.
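
Given that JSON output is not guaranteed, a client can parse defensively and fall back to the text. The sketch below assumes the response body is available as a string; parse_ai_output is a hypothetical helper.

```python
import json


def parse_ai_output(raw_text: str):
    """Return parsed JSON when the text is valid JSON, otherwise the text itself."""
    try:
        return json.loads(raw_text)
    except json.JSONDecodeError:
        return raw_text


structured = parse_ai_output('{"summary": ["point one", "point two"]}')
free_form = parse_ai_output("Once upon a time, a crawler set out across the web.")
```

Callers can then branch on the result type, for example treating a str as free-form text and a dict or list as structured data.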

Processing Time

AI processing adds 2-5 seconds to request time. Factor this into timeout expectations.

Error Handling

AI Processing Failed

{
  "success": false,
  "error": "AI processing failed",
  "message": "Unable to process content with AI"
}

Solutions:

Try a simpler prompt
Check that the content source has meaningful data
Verify that the source (markdown vs. screenshot) is appropriate

Invalid AI Source

{
  "success": false,
  "error": "Invalid parameters",
  "message": "ai_source is required when ai_enhance is true"
}

Solution: Add ai_source=markdown or ai_source=screenshot to the request.
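
Both error bodies above share the same three fields, so they can be turned into a single exception type on the client. This is a sketch under that assumption; ScraperAIError and raise_for_ai_error are hypothetical names, and other error shapes may exist.

```python
class ScraperAIError(Exception):
    """Raised when a response body reports success=false."""

    def __init__(self, error: str, message: str):
        super().__init__(f"{error}: {message}")
        self.error = error
        self.message = message
        # Only the AI failure is worth retrying with a simpler prompt;
        # a parameter error needs a corrected request instead.
        self.retry_with_simpler_prompt = error == "AI processing failed"


def raise_for_ai_error(body: dict) -> dict:
    """Return the body unchanged unless it explicitly reports a failure."""
    # Raw-delivery successes carry no "success" key, so test for False explicitly.
    if body.get("success") is False:
        raise ScraperAIError(body.get("error", "Unknown error"), body.get("message", ""))
    return body


try:
    raise_for_ai_error({
        "success": False,
        "error": "AI processing failed",
        "message": "Unable to process content with AI",
    })
except ScraperAIError as exc:
    caught = exc
```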

⚠️ AI responses are generated by Google Gemini and may occasionally be inconsistent or incomplete. Always validate critical extracted data.
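
As one way to validate extracted data, the sketch below checks the values of an article record like the one in the Structured Data Extraction example. The field names come from that example; the rules themselves (non-empty strings, an ISO 8601 date) and the helper name validate_article are illustrative assumptions to adapt per use case.

```python
from datetime import date


def validate_article(record: dict) -> list:
    """Return a list of problems found in an extracted article record."""
    problems = []
    for key in ("headline", "author", "summary"):
        value = record.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{key} is missing or empty")
    try:
        date.fromisoformat(str(record.get("date")))
    except ValueError:
        problems.append("date is not a valid ISO 8601 date")
    return problems


good = {
    "headline": "The Future of Web Scraping",
    "author": "Jane Doe",
    "date": "2025-01-15",
    "summary": "Web scraping is evolving with AI integration.",
}
bad = {"headline": "", "author": "Jane Doe", "date": "Jan 15", "summary": "ok"}
```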
