# Result Polling

When scraping tasks take longer than the timeout limit, or when you submit async requests, you need to poll for results. This guide covers how to retrieve results from background tasks.
## When Polling Is Needed

Polling is required in these scenarios:

- **Explicit async requests**: you set `async=true` in your request
- **Request timeouts**: a synchronous request exceeds the timeout limit and auto-converts to an async task
## Timeout Handling

Different modes have different timeout limits:

| Mode | Timeout | What Happens After |
|---|---|---|
| `request` | 30 seconds | Auto-converts to async task |
| `browser` | 45 seconds | Auto-converts to async task |
| `auto` | 30–45 seconds | Depends on the mode used |
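In client code, this auto-conversion surfaces as a `202 Accepted` response instead of the final payload, so it helps to branch on the status code. A minimal sketch using the `requests` library (endpoint and response fields as documented on this page; the helper name is illustrative):

```python
import requests

BASE = "https://scrape.evomi.com/api/v1/scraper"

def scrape_or_task(url, api_key):
    """Return the result directly, or the task_id if the request auto-converted."""
    resp = requests.post(
        f"{BASE}/realtime",
        headers={"x-api-key": api_key},
        json={"url": url},
    )
    if resp.status_code == 202:
        # Timed out server-side; the body carries a task_id to poll.
        return {"pending": True, "task_id": resp.json()["task_id"]}
    return resp.json()
```

Callers can then hand the returned `task_id` to the polling helpers shown later in this guide.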
### Timeout Response

When a synchronous request times out, you receive a `202 Accepted` response:

```json
{
  "success": true,
  "task_id": "task_abc123",
  "status": "processing",
  "message": "Task is taking longer than expected. Use task_id to check status.",
  "check_url": "/api/v1/scraper/tasks/task_abc123"
}
```

## Submitting Async Requests
For long-running or batch jobs, submit async requests from the start:

**Request**

```shell
curl -X POST "https://scrape.evomi.com/api/v1/scraper/realtime" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "async": true
  }'
```

**Response**
```json
{
  "task_id": "task_abc123",
  "status": "processing",
  "check_url": "/api/v1/scraper/tasks/task_abc123"
}
```

## Checking Task Status
**Request**

```shell
curl "https://scrape.evomi.com/api/v1/scraper/tasks/task_abc123?api_key=YOUR_API_KEY"
```

**Response (Processing)**
```json
{
  "task_id": "task_abc123",
  "status": "processing",
  "created_at": "2025-01-15T10:30:00Z",
  "elapsed_seconds": 12
}
```

**Response (Completed)**
```json
{
  "task_id": "task_abc123",
  "status": "completed",
  "success": true,
  "url": "https://example.com",
  "domain": "example.com",
  "title": "Example Domain",
  "content": "<!DOCTYPE html>...",
  "status_code": 200,
  "credits_used": 1.0,
  "credits_remaining": 99.0,
  "mode_used": "auto (request)"
}
```

**Response (Failed)**
```json
{
  "task_id": "task_abc123",
  "status": "failed",
  "success": false,
  "error": "Connection timeout after 30 seconds",
  "credits_used": 0.5
}
```

## Polling for Results
### Python Example

```python
import time
import requests

def wait_for_task(task_id, api_key, max_wait=120, poll_interval=2):
    """Poll task status until completion."""
    url = f"https://scrape.evomi.com/api/v1/scraper/tasks/{task_id}"
    headers = {"x-api-key": api_key}
    start_time = time.time()

    while time.time() - start_time < max_wait:
        response = requests.get(url, headers=headers)
        data = response.json()
        status = data.get("status")

        if status == "completed":
            return data
        elif status == "failed":
            raise Exception(f"Task failed: {data.get('error')}")

        # Still processing
        time.sleep(poll_interval)

    raise TimeoutError(f"Task {task_id} did not complete in {max_wait} seconds")

# Usage
result = wait_for_task("task_abc123", api_key)
print(result["content"])
```

### JavaScript Example
```javascript
async function waitForTask(taskId, apiKey, maxWait = 120000, pollInterval = 2000) {
  const url = `https://scrape.evomi.com/api/v1/scraper/tasks/${taskId}`;
  const headers = { 'x-api-key': apiKey };
  const startTime = Date.now();

  while (Date.now() - startTime < maxWait) {
    const response = await fetch(url, { headers });
    const data = await response.json();

    if (data.status === 'completed') {
      return data;
    } else if (data.status === 'failed') {
      throw new Error(`Task failed: ${data.error}`);
    }

    await new Promise(resolve => setTimeout(resolve, pollInterval));
  }

  throw new Error(`Task ${taskId} did not complete in time`);
}

// Usage
const result = await waitForTask('task_abc123', apiKey);
console.log(result.content);
```

### Go Example
```go
import (
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

func WaitForTask(taskID, apiKey string, maxWait time.Duration) (map[string]interface{}, error) {
	url := fmt.Sprintf("https://scrape.evomi.com/api/v1/scraper/tasks/%s?api_key=%s", taskID, apiKey)
	startTime := time.Now()
	pollInterval := 2 * time.Second

	for time.Since(startTime) < maxWait {
		resp, err := http.Get(url)
		if err != nil {
			return nil, err
		}

		var data map[string]interface{}
		err = json.NewDecoder(resp.Body).Decode(&data)
		resp.Body.Close()
		if err != nil {
			return nil, err
		}

		// Type-assert safely; an unexpected payload yields an empty status.
		status, _ := data["status"].(string)
		if status == "completed" {
			return data, nil
		} else if status == "failed" {
			return nil, fmt.Errorf("task failed: %v", data["error"])
		}

		time.Sleep(pollInterval)
	}

	return nil, fmt.Errorf("task did not complete in %v", maxWait)
}
```

## Raw vs JSON Responses
When polling for results, the response format depends on the `delivery` parameter you set in your original request.
### Raw Delivery (Default)

If you used `delivery=raw` (or didn't specify `delivery`), the polled result returns the raw content directly:

```shell
curl "https://scrape.evomi.com/api/v1/scraper/tasks/task_abc123?api_key=YOUR_API_KEY"
```

Response:

```
Content-Type: text/html; charset=utf-8
X-Credits-Used: 1.0
X-Credits-Remaining: 99.0

<!DOCTYPE html>
<html>
...
```

### JSON Delivery
If you used `delivery=json`, the polled result returns a structured JSON response with metadata:

```shell
curl "https://scrape.evomi.com/api/v1/scraper/tasks/task_abc123?api_key=YOUR_API_KEY"
```

Response:

```json
{
  "task_id": "task_abc123",
  "status": "completed",
  "success": true,
  "url": "https://example.com",
  "title": "Example Domain",
  "status_code": 200,
  "credits_used": 1.0
}
```

> **Note:** Set `include_content=true` in your original request to receive the scraped content. Without this parameter, only metadata is returned: no HTML, Markdown, or other content.

## Using `delivery` and `include_content`
When using JSON delivery mode, you must set `include_content=true` to receive the scraped content in the response.
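The same request can be made from Python; a sketch with the `requests` library, passing `delivery` and `include_content` as query parameters (the helper name is illustrative; the boolean is rendered as the lowercase string the query expects):

```python
import requests

def scrape_json(url, api_key, include_content=True):
    """Fetch a page with JSON delivery; include_content controls whether content is returned."""
    params = {
        "url": url,
        "delivery": "json",
        "include_content": str(include_content).lower(),  # -> "true" / "false"
        "api_key": api_key,
    }
    resp = requests.get("https://scrape.evomi.com/api/v1/scraper/realtime", params=params)
    return resp.json()
```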
### Without `include_content`

```shell
curl "https://scrape.evomi.com/api/v1/scraper/realtime?url=https://example.com&delivery=json&api_key=YOUR_API_KEY"
```

Response omits content:

```json
{
  "success": true,
  "url": "https://example.com",
  "title": "Example Domain",
  "status_code": 200,
  "credits_used": 1.0,
  "hints": ["Content omitted by default. Set include_content=true to include it."]
}
```

### With `include_content`
```shell
curl "https://scrape.evomi.com/api/v1/scraper/realtime?url=https://example.com&delivery=json&include_content=true&api_key=YOUR_API_KEY"
```

Response includes content:

```json
{
  "success": true,
  "url": "https://example.com",
  "title": "Example Domain",
  "content": "<!DOCTYPE html>...",
  "status_code": 200,
  "credits_used": 1.0
}
```

> **Tip:** With `delivery=json`, always set `include_content=true` if you need the scraped content (HTML, Markdown, etc.) in the response. Without this parameter, only metadata is returned to save bandwidth.

## Batch Processing with Async
For large batches, submit all tasks first, then poll for results:

```python
import asyncio
import aiohttp

async def submit_and_wait(urls, api_key):
    async with aiohttp.ClientSession() as session:
        # Submit all tasks
        task_ids = []
        for url in urls:
            async with session.post(
                "https://scrape.evomi.com/api/v1/scraper/realtime",
                headers={"x-api-key": api_key},
                json={"url": url, "async": True}
            ) as resp:
                data = await resp.json()
                task_ids.append(data["task_id"])

        # Poll for all results
        results = []
        for task_id in task_ids:
            result = await poll_task(session, task_id, api_key)
            results.append(result)

        return results

async def poll_task(session, task_id, api_key, max_wait=120):
    url = f"https://scrape.evomi.com/api/v1/scraper/tasks/{task_id}"
    headers = {"x-api-key": api_key}
    start = asyncio.get_running_loop().time()

    while asyncio.get_running_loop().time() - start < max_wait:
        async with session.get(url, headers=headers) as resp:
            data = await resp.json()

        if data["status"] == "completed":
            return data
        elif data["status"] == "failed":
            raise Exception(f"Task failed: {data.get('error')}")

        await asyncio.sleep(2)

    raise TimeoutError(f"Task {task_id} timed out")

# Usage
urls = ["https://example1.com", "https://example2.com", "https://example3.com"]
results = asyncio.run(submit_and_wait(urls, api_key))
```

## Webhook Alternative
Instead of polling, you can use webhooks to receive notifications when tasks complete. Include a `webhook` object in your request; see Webhooks for details.

```json
{
  "url": "https://example.com",
  "async": true,
  "webhook": {
    "url": "https://your-server.com/webhook",
    "webhook_type": "custom",
    "events": ["completed", "failed"]
  }
}
```

## Best Practices
- **Use appropriate poll intervals**: poll every 2–3 seconds, not faster
- **Set reasonable timeouts**: most tasks complete within 60 seconds
- **Handle failures gracefully**: check the `error` field in failed responses
- **Retrieve results promptly**: results expire after 10 minutes
- **Use webhooks for batches**: avoid polling hundreds of tasks simultaneously
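For the first two practices, the fixed 2-second interval from the earlier examples can be replaced with a gentle exponential backoff to cut request volume on longer jobs. A sketch under that assumption; the schedule (2s doubling to a 10s cap) is an illustrative choice, not an API requirement:

```python
import time
import requests

def wait_with_backoff(task_id, api_key, max_wait=300, start=2.0, cap=10.0):
    """Poll task status, doubling the interval between checks up to `cap` seconds."""
    url = f"https://scrape.evomi.com/api/v1/scraper/tasks/{task_id}"
    interval = start
    deadline = time.time() + max_wait

    while time.time() < deadline:
        data = requests.get(url, headers={"x-api-key": api_key}).json()
        if data.get("status") == "completed":
            return data
        if data.get("status") == "failed":
            raise RuntimeError(f"Task failed: {data.get('error')}")

        time.sleep(interval)
        interval = min(interval * 2, cap)  # 2, 4, 8, 10, 10, ...

    raise TimeoutError(f"Task {task_id} did not complete in {max_wait}s")
```

Keep `max_wait` well under the 10-minute result expiry noted above so a completed result is never fetched after it has expired.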