AI Enhancement

Transform raw web content into structured insights with built-in AI processing. Extract summaries, generate structured data, and analyze content—all in a single API call.

AI Model: Google Gemini 2.0 Flash
Cost: +10 credits (additive to base cost)

Quick Start

Enable AI processing by setting ai_enhance=true and specifying a source:

curl -X POST "https://scrape.evomi.com/api/v1/scraper/realtime" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://news.example.com/article",
    "ai_enhance": true,
    "ai_source": "markdown",
    "ai_prompt": "Extract headline, author, date, and create a 2-sentence summary as JSON"
  }'
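
For callers working from Python, the same request can be sketched with the standard library alone. The helper names build_request and send are illustrative and not part of any official client; the endpoint, header names, and body fields are copied from the curl example above, and the 60-second timeout is an assumption.

```python
import json
import urllib.request

API_URL = "https://scrape.evomi.com/api/v1/scraper/realtime"


def build_request(api_key: str, url: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = {
        "url": url,
        "ai_enhance": True,
        "ai_source": "markdown",
        "ai_prompt": prompt,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> bytes:
    """Send the request and return the raw response body.

    The default timeout is an assumed value, not one taken from the API docs.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Keeping construction separate from sending makes the request body easy to inspect or log before any credits are spent.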

Parameters

Parameter      Required  Default  Description
ai_enhance     Yes       false    Enable AI processing
ai_source      Yes*      -        Source for AI: markdown or screenshot
ai_prompt      No        -        Custom instructions for AI
ai_force_json  No        true     Force JSON output format

*Required when ai_enhance=true
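
The dependency between ai_enhance and ai_source can be checked on the client before a request is sent. The following is a minimal sketch; validate_ai_params is a hypothetical helper, and the rules it encodes are only those stated in the table above.

```python
VALID_AI_SOURCES = {"markdown", "screenshot"}


def validate_ai_params(body: dict) -> list[str]:
    """Return a list of problems found in the AI-related fields of a request body."""
    problems = []
    if not body.get("ai_enhance", False):
        return problems  # AI processing is off, so ai_source is not required
    source = body.get("ai_source")
    if source is None:
        problems.append("ai_source is required when ai_enhance is true")
    elif source not in VALID_AI_SOURCES:
        problems.append(f"ai_source must be one of {sorted(VALID_AI_SOURCES)}")
    return problems
```

An empty list means the body passes these checks; it does not guarantee that the API will accept the request.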

AI Sources

Markdown Source

Process the text content of the page.

Value: markdown

{
  "url": "https://blog.example.com/post",
  "ai_enhance": true,
  "ai_source": "markdown"
}

Best for:

  • Article summarization
  • Data extraction from text
  • Content classification
  • Information extraction
  • Structured data generation
  • Q&A from documentation

Characteristics:

  • Processes clean text (no HTML boilerplate)
  • Token-efficient
  • Fast processing
  • Ideal for text-heavy content

Screenshot Source

Process the visual appearance of the page using vision AI.

Value: screenshot

Requirements:

  • Must set screenshot=true
  • Requires mode=browser or mode=auto

{
  "url": "https://example.com",
  "mode": "browser",
  "screenshot": true,
  "ai_enhance": true,
  "ai_source": "screenshot"
}
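
The two screenshot requirements lend themselves to a pre-flight check. This sketch uses a hypothetical helper name, and it treats a missing mode as a problem because the default mode is not stated in this section; relax that check if the API's default turns out to be browser or auto.

```python
def screenshot_source_problems(body: dict) -> list[str]:
    """Check the documented requirements for ai_source=screenshot."""
    problems = []
    if body.get("ai_source") != "screenshot":
        return problems  # the requirements only apply to the screenshot source
    if body.get("screenshot") is not True:
        problems.append("screenshot must be true")
    # Assumption: an absent mode is flagged, since its default is not documented here.
    if body.get("mode") not in ("browser", "auto"):
        problems.append("mode must be browser or auto")
    return problems
```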

Best for:

  • Layout analysis
  • Visual content extraction
  • Image-based content
  • UI element detection
  • Design analysis
  • Charts and graphs

Characteristics:

  • Processes visual representation
  • Can extract text from images
  • Understands layout and design
  • Higher processing cost

Custom Prompts

Guide AI processing with custom instructions via ai_prompt:

Structured Data Extraction

{
  "ai_prompt": "Extract the article headline, author, publication date, and create a 2-sentence summary. Return as JSON with keys: headline, author, date, summary."
}

Example Response:

{
  "ai_response": {
    "headline": "The Future of Web Scraping",
    "author": "Jane Doe",
    "date": "2025-01-15",
    "summary": "Web scraping is evolving with AI integration. Modern APIs now offer built-in intelligence for data extraction."
  }
}

Bullet Point Summary

{
  "ai_prompt": "Summarize the main points in 5 concise bullet points."
}

Example Response:

{
  "ai_response": {
    "summary": [
      "Web scraping is becoming more intelligent",
      "AI helps extract structured data automatically",
      "Modern APIs combine scraping and AI processing",
      "Costs are predictable with transparent pricing",
      "Integration is simple with REST APIs"
    ]
  }
}

Contact Information

{
  "ai_prompt": "Extract all contact information including email, phone, address, and social media links. Return as JSON."
}

Product Data

{
  "ai_prompt": "Extract all product names, prices, and availability status. Return as a JSON array with fields: name, price, available."
}

Visual Analysis (Screenshot Source)

{
  "ai_source": "screenshot",
  "ai_prompt": "Describe the page layout, identify main sections, and list all call-to-action buttons with their text and position."
}

Default Behavior

Without a custom prompt, the AI returns a default structure that depends on the source:

Default for Markdown

{
  "title": "Page Title",
  "description": "Brief description",
  "summary": "Concise content summary",
  "key_points": [
    "Main point 1",
    "Main point 2",
    "Main point 3"
  ]
}

Default for Screenshot

{
  "layout_description": "Visual layout description",
  "visible_text": ["Text element 1", "Text element 2"],
  "ui_elements": ["Button", "Form", "Navigation"],
  "call_to_action": ["Sign Up", "Learn More"]
}

Response Formats

With delivery=raw (Default)

Returns ONLY the AI response:

curl -X POST "https://scrape.evomi.com/api/v1/scraper/realtime" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "ai_enhance": true,
    "ai_source": "markdown",
    "delivery": "raw"
  }'

Response:

{
  "title": "Example Domain",
  "summary": "This domain is for use in illustrative examples...",
  "key_points": ["Used for examples", "No prior coordination needed"]
}

Headers:

Content-Type: application/json; charset=utf-8
X-Response-Source: ai-raw-mode
X-Credits-Used: 15.0

With delivery=json

Returns the AI response in the ai_response field:

curl -X POST "https://scrape.evomi.com/api/v1/scraper/realtime" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "ai_enhance": true,
    "ai_source": "markdown",
    "delivery": "json"
  }'

Response:

{
  "success": true,
  "url": "https://example.com",
  "ai_response": {
    "title": "Example Domain",
    "summary": "This domain is for use in illustrative examples...",
    "key_points": ["Used for examples", "No prior coordination needed"]
  },
  "credits_used": 15.0,
  "credits_remaining": 85.0,
  "mode_used": "browser"
}
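
Because the two delivery modes place the AI output in different locations, client code may want a single accessor. The sketch below assumes the body is JSON (the ai_force_json=true case); extract_ai_response is a hypothetical name.

```python
import json


def extract_ai_response(body: bytes, delivery: str):
    """Return the AI output from a response body for the given delivery mode."""
    parsed = json.loads(body)
    if delivery == "raw":
        return parsed  # with delivery=raw the whole body is the AI response
    if delivery == "json":
        return parsed["ai_response"]  # with delivery=json it is nested in the envelope
    raise ValueError(f"unsupported delivery mode: {delivery!r}")
```

With delivery=json the envelope also carries credits_used and mode_used, so that mode is the easier one to log and audit.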

Force JSON Output

By default, ai_force_json=true instructs AI to return valid JSON.

For free-form text responses:

{
  "ai_enhance": true,
  "ai_source": "markdown",
  "ai_force_json": false,
  "ai_prompt": "Write a creative short story inspired by this content."
}

Use Cases

1. News Article Extraction

{
  "url": "https://news.example.com/article/123",
  "ai_enhance": true,
  "ai_source": "markdown",
  "ai_prompt": "Extract: headline, author, date, category, tags, and 2-sentence summary. Return as JSON."
}

2. E-commerce Product Scraping

{
  "url": "https://shop.example.com/product/456",
  "ai_enhance": true,
  "ai_source": "markdown",
  "ai_prompt": "Extract product name, price, description, specifications, and availability. Return as structured JSON."
}

3. Job Posting Analysis

{
  "url": "https://jobs.example.com/posting/789",
  "ai_enhance": true,
  "ai_source": "markdown",
  "ai_prompt": "Extract: job title, company, location, salary range, requirements, and responsibilities as JSON."
}

4. Business Directory

{
  "url": "https://directory.example.com/business/abc",
  "ai_enhance": true,
  "ai_source": "markdown",
  "ai_prompt": "Extract business name, address, phone, email, website, hours, and services as JSON."
}

5. Documentation Summary

{
  "url": "https://docs.example.com/guide",
  "ai_enhance": true,
  "ai_source": "markdown",
  "ai_prompt": "Create a concise summary of this documentation page with 5 key takeaways."
}

6. Visual Layout Analysis

{
  "url": "https://example.com",
  "mode": "browser",
  "screenshot": true,
  "ai_enhance": true,
  "ai_source": "screenshot",
  "ai_prompt": "Analyze the page layout. Identify header, navigation, main content area, sidebar, and footer. List all visible call-to-action buttons."
}

Cost Calculation

AI enhancement adds 10 credits to your base cost:

Configuration                          Calculation       Total
Request + Datacenter + AI              (1 × 1 × 1) + 10  11 credits
Request + Residential + AI             (1 × 1 × 5) + 10  15 credits
Browser + Datacenter + AI              (1 × 5 × 1) + 10  15 credits
Browser + Residential + AI             (1 × 5 × 5) + 10  35 credits
Auto + Datacenter + AI (no browser)    1 + 10            11 credits
Auto + Datacenter + AI (with browser)  5.5 + 10          15.5 credits

With Screenshot:

Configuration                            Calculation  Total
Browser + Datacenter + Screenshot + AI   5 + 1 + 10   16 credits
Browser + Residential + Screenshot + AI  25 + 1 + 10  36 credits
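
The arithmetic in these tables can be sketched as a small estimator. The multipliers below are inferred from the rows above rather than taken from an official pricing formula, and auto mode is left out because its 5.5 figure cannot be derived from the other rows.

```python
# Multipliers inferred from the cost tables; treat them as assumptions.
MODE_MULTIPLIER = {"request": 1, "browser": 5}
PROXY_MULTIPLIER = {"datacenter": 1, "residential": 5}
AI_CREDITS = 10
SCREENSHOT_CREDITS = 1


def estimate_credits(mode: str, proxy: str, ai: bool = True, screenshot: bool = False) -> int:
    """Estimate credits as base (1 x mode x proxy) plus optional add-ons."""
    total = 1 * MODE_MULTIPLIER[mode] * PROXY_MULTIPLIER[proxy]
    if screenshot:
        total += SCREENSHOT_CREDITS
    if ai:
        total += AI_CREDITS
    return total
```

For example, estimate_credits("browser", "residential", screenshot=True) gives 36, matching the last row of the screenshot table.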

Best Practices

1. Be Specific in Prompts

Bad:

{"ai_prompt": "Extract data"}

Good:

{"ai_prompt": "Extract product name, price in USD, and availability status as JSON with keys: name, price, available"}

2. Request Structured Output

Always specify the output format you want:

{"ai_prompt": "Extract contact info and return as JSON with fields: email, phone, address"}

3. Use Markdown for Text Content

For articles, blogs, and text-heavy pages:

{
  "content": "markdown",
  "ai_source": "markdown"
}

This gives AI clean, token-efficient text.

4. Use Screenshots for Visual Content

For image-heavy pages, dashboards, or UI analysis:

{
  "screenshot": true,
  "ai_source": "screenshot"
}

5. Combine with Other Features

With JavaScript automation:

{
  "mode": "browser",
  "js_instructions": [
    {"click": ".show-full-content"}
  ],
  "ai_enhance": true,
  "ai_source": "markdown"
}

With session management:

{
  "proxy_session_id": "sess123",
  "ai_enhance": true,
  "ai_source": "markdown"
}

6. Test Prompts Iteratively

Start simple and refine:

// Test 1
{"ai_prompt": "Summarize this article"}

// Test 2
{"ai_prompt": "Summarize in 3 bullet points"}

// Test 3
{"ai_prompt": "Summarize in 3 bullet points as JSON array"}

7. Monitor Token Usage

Longer content takes longer to process. For very long pages, consider extracting specific sections first.

Limitations

Content Length

Very long pages may be truncated. For optimal results, keep source content under 50,000 characters.

Language Support

Gemini 2.0 Flash supports multiple languages, but English prompts generally yield best results.

JSON Forcing

While ai_force_json=true encourages JSON output, the AI may occasionally return plain text if that suits the prompt better.
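
Since JSON output is not guaranteed, it is safer to parse defensively than to assume the response decodes. A minimal sketch, with parse_ai_output as a hypothetical helper name:

```python
import json


def parse_ai_output(text: str):
    """Return (value, is_json): decoded JSON when possible, otherwise the original text."""
    try:
        return json.loads(text), True
    except json.JSONDecodeError:
        return text, False  # fall back to plain text instead of raising
```

The boolean lets the caller decide whether to retry with a stricter prompt or accept the free-form text.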

Processing Time

AI processing adds 2-5 seconds to request time. Factor this into timeout expectations.

Error Handling

AI Processing Failed

{
  "success": false,
  "error": "AI processing failed",
  "message": "Unable to process content with AI"
}

Solutions:

  • Try a simpler prompt
  • Check that the content source has meaningful data
  • Verify that the source (markdown or screenshot) suits the page

Invalid AI Source

{
  "success": false,
  "error": "Invalid parameters",
  "message": "ai_source is required when ai_enhance is true"
}

Solution: Set ai_source to markdown or screenshot in the request body.
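
Both error bodies share the success, error, and message fields, so they can be surfaced through one exception type. This is a sketch under that assumption; ScrapeError and raise_for_error are hypothetical names, and other error shapes may exist that are not shown here.

```python
class ScrapeError(Exception):
    """Raised when a response body reports success=false."""

    def __init__(self, error: str, message: str):
        super().__init__(f"{error}: {message}")
        self.error = error
        self.message = message


def raise_for_error(payload: dict) -> dict:
    """Return the payload unchanged, or raise ScrapeError if it reports a failure."""
    if payload.get("success") is False:
        raise ScrapeError(
            payload.get("error", "unknown error"),
            payload.get("message", ""),
        )
    return payload
```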

⚠️ AI responses are generated by Google Gemini and may occasionally be inconsistent or incomplete. Always validate critical extracted data.
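
One way to act on this is to check each AI response against the fields the prompt asked for. The sketch below validates the article shape from the Structured Data Extraction example; the required keys and the ISO date check are assumptions tied to that example, not a schema enforced by the API.

```python
from datetime import date

# Keys requested by the example prompt; adjust to match your own prompt.
REQUIRED_KEYS = ("headline", "author", "date", "summary")


def validate_article(ai_response: dict) -> list[str]:
    """Return a list of problems with an extracted article record."""
    problems = [
        f"missing or empty field: {key}"
        for key in REQUIRED_KEYS
        if not ai_response.get(key)
    ]
    raw_date = ai_response.get("date")
    if raw_date:
        try:
            date.fromisoformat(raw_date)  # accepts ISO dates such as 2025-01-15
        except (TypeError, ValueError):
            problems.append(f"date is not a valid ISO date: {raw_date!r}")
    return problems
```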