Scrapy

This guide explains how to use Evomi proxies with Scrapy, a Python framework for web crawling and scraping. The integration uses a custom downloader middleware to inject proxy settings into every request.

Prerequisites

  • Python 3.8+ installed
  • A Scrapy project set up
  • Your Evomi proxy credentials (username and password)

Installation

pip install scrapy

Configuration

Step 1: Add Proxy Settings to settings.py

Open your project’s settings.py and add the Evomi proxy configuration:

# Evomi Proxy Configuration
PROXY_HOST = "rp.evomi.com"
PROXY_PORT = "1000"
PROXY_USER = "your_username"
PROXY_PASS = "your_password_session-anychars_mode-speed"
PROXY_URL = f"http://{PROXY_USER}:{PROXY_PASS}@{PROXY_HOST}:{PROXY_PORT}"

USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36"
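Before running a crawl, you can sanity-check the assembled URL with a standalone snippet that uses only the standard library (the credentials below are the same placeholders as in settings.py):

```python
from urllib.parse import urlsplit

# Same assembly as in settings.py (placeholder credentials)
PROXY_HOST = "rp.evomi.com"
PROXY_PORT = "1000"
PROXY_USER = "your_username"
PROXY_PASS = "your_password_session-anychars_mode-speed"
PROXY_URL = f"http://{PROXY_USER}:{PROXY_PASS}@{PROXY_HOST}:{PROXY_PORT}"

# Parse it back apart to confirm every component landed where expected
parts = urlsplit(PROXY_URL)
print(parts.scheme)    # http
print(parts.username)  # your_username
print(parts.hostname)  # rp.evomi.com
print(parts.port)      # 1000
```

If `username` or `password` comes back mangled (for example because your real password contains `@` or `:`), percent-encode those characters with `urllib.parse.quote` before building the URL.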

Step 2: Create a Proxy Middleware

Add the following class to your project’s middlewares.py:

from scrapy.exceptions import NotConfigured

class EvomiProxyMiddleware:
    """Attaches the Evomi proxy URL to every outgoing request."""

    def __init__(self, proxy_url):
        self.proxy_url = proxy_url

    @classmethod
    def from_crawler(cls, crawler):
        proxy_url = crawler.settings.get("PROXY_URL")
        if not proxy_url:
            raise NotConfigured("PROXY_URL not set in settings.py")
        return cls(proxy_url)

    def process_request(self, request, spider):
        # Respect a per-request proxy override if the spider already set one
        if "proxy" not in request.meta:
            request.meta["proxy"] = self.proxy_url
        return None

Scrapy’s built-in HttpProxyMiddleware reads request.meta['proxy'] and automatically extracts the credentials from the URL to create the Proxy-Authorization header.
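Under the hood this is plain HTTP Basic auth: the credentials are stripped from the URL and base64-encoded into the header value. The sketch below shows the equivalent transformation using only the standard library (this is an illustration of the mechanism, not Scrapy's actual implementation):

```python
import base64
from urllib.parse import urlsplit, urlunsplit

def split_proxy_auth(proxy_url: str):
    """Split user:pass out of a proxy URL, returning (auth_header, bare_url),
    mirroring what HttpProxyMiddleware does with request.meta['proxy']."""
    parts = urlsplit(proxy_url)
    creds = f"{parts.username}:{parts.password}".encode()
    header = b"Basic " + base64.b64encode(creds)
    netloc = parts.hostname + (f":{parts.port}" if parts.port else "")
    bare_url = urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
    return header, bare_url

header, bare = split_proxy_auth("http://user:secret@rp.evomi.com:1000")
print(header)  # b'Basic dXNlcjpzZWNyZXQ='
print(bare)    # http://rp.evomi.com:1000
```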

Step 3: Enable the Middleware in settings.py

DOWNLOADER_MIDDLEWARES = {
    # Priority 350 runs before Scrapy's built-in HttpProxyMiddleware (750)
    "your_project_name.middlewares.EvomiProxyMiddleware": 350,
}

Replace your_project_name with the actual name of your Scrapy project module.

Example Spider

Create a test spider in your spiders/ directory:

import scrapy

class IPCheckSpider(scrapy.Spider):
    name = "ipcheck"
    start_urls = ["https://ip.evomi.com/s"]

    def parse(self, response):
        self.logger.info(f"Proxy IP: {response.text.strip()}")
        yield {"ip": response.text.strip()}

Run it with:

scrapy crawl ipcheck

The output should show the Evomi proxy IP, not your real IP.
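Because the `_session-` part of the password pins a sticky session, repeated runs with the same credential string should return the same exit IP. To get a fresh IP per run, you can generate the session token dynamically. The helper below is a sketch that follows the credential format shown in this guide (the token length and the `_mode-speed` suffix are taken from the examples above; adjust to your plan's documentation):

```python
import random
import string

def build_proxy_pass(base_password: str, session_len: int = 8) -> str:
    """Append a random session token so each run gets its own sticky session."""
    token = "".join(random.choices(string.ascii_lowercase + string.digits, k=session_len))
    return f"{base_password}_session-{token}_mode-speed"

print(build_proxy_pass("your_password"))  # e.g. your_password_session-k3f9x2qa_mode-speed
```

Call this in settings.py when assigning PROXY_PASS; since settings are evaluated once at startup, the session (and thus the IP) stays stable for the whole crawl.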

Using SOCKS5

For SOCKS5 proxies, install scrapy-socks:

pip install scrapy-socks

Then update settings.py:

PROXY_HOST = "rp.evomi.com"
PROXY_PORT = "1002"
PROXY_USER = "your_username"
PROXY_PASS = "your_password_session-anychars_mode-speed"
PROXY_URL = f"socks5h://{PROXY_USER}:{PROXY_PASS}@{PROXY_HOST}:{PROXY_PORT}"

DOWNLOAD_HANDLERS = {
    "http": "scrapy_socks.handlers.http.SOCKSDownloadHandler",
    "https": "scrapy_socks.handlers.http.SOCKSDownloadHandler",
}

The socks5h:// scheme routes DNS resolution through the proxy server; with plain socks5://, hostnames are resolved locally, which can leak your DNS queries.

Evomi Proxy Endpoints

Proxy Type     HTTP                  HTTPS                 SOCKS5
Residential    rp.evomi.com:1000     rp.evomi.com:1001     rp.evomi.com:1002
Mobile         mp.evomi.com:3000     mp.evomi.com:3001     mp.evomi.com:3002
Datacenter     dcp.evomi.com:2000    dcp.evomi.com:2001    dcp.evomi.com:2002

Tips and Troubleshooting

  • Credentials: Verify PROXY_USER, PROXY_PASS, PROXY_HOST, and PROXY_PORT in settings.py. Incorrect values are the most common source of errors.
  • Password format: The password your_password_session-anychars_mode-speed includes session parameters. Replace your_password with your actual password but keep the _session- and _mode- parts intact.
  • Project name: Make sure your_project_name in DOWNLOADER_MIDDLEWARES matches your actual Scrapy project module name.
  • User-Agent: Always set a realistic USER_AGENT in settings. Scrapy’s default User-Agent is frequently blocked.
  • Debug logging: Set LOG_LEVEL = "DEBUG" in settings.py for detailed request/response output.
  • Per-request proxy override: To use a different proxy for a specific request, set request.meta["proxy"] directly in the spider’s start_requests() or callback methods.
  • Dynamic content: Scrapy fetches raw HTML and does not execute JavaScript. For JS-heavy sites, consider scrapy-playwright which integrates Playwright with Scrapy.
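Because the middleware only sets the proxy when request.meta has none, the per-request override is just a matter of filling in meta yourself. One way to organize that is a small routing helper; the sketch below is hypothetical (the domain rule and the mobile endpoint choice are illustrative, with placeholder credentials):

```python
# Hypothetical routing helper: send some domains through the mobile pool,
# everything else through residential. Endpoints are from the table above.
RESIDENTIAL = "http://user:pass@rp.evomi.com:1000"  # placeholder credentials
MOBILE = "http://user:pass@mp.evomi.com:3000"

def proxy_for(url: str) -> str:
    """Pick a proxy endpoint based on the target URL."""
    return MOBILE if "m.example.com" in url else RESIDENTIAL

# In a spider callback or start_requests():
#     yield scrapy.Request(url, meta={"proxy": proxy_for(url)})
print(proxy_for("https://m.example.com/page"))  # http://user:pass@mp.evomi.com:3000
```

Requests that carry this meta key skip the middleware's default, so both strategies coexist in one spider.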