Forage AI Careers!

Web Crawling Engineer

Technology · Full Time · Remote

Experience: 3+ years of professional experience
Qualification: Bachelor’s degree in any field
Offered Salary: Based on experience
Posted On: 11 February 2026
Valid Till: 10 March 2026

We are seeking an experienced Web Crawling Engineer to design, build, and maintain robust data extraction systems at scale. You will develop sophisticated web scraping infrastructure that handles high-volume data collection while ensuring reliability, efficiency, and compliance.

Requirements:

Experience:

3+ years of professional experience in web scraping and data extraction

Technical Skills:

  • Strong proficiency in Python with extensive experience in web scraping frameworks (Scrapy, BeautifulSoup, Selenium, or similar)
  • Deep understanding of HTML, CSS, JavaScript, and DOM manipulation for effective data extraction
  • Hands-on experience with PostgreSQL for data storage and management
  • Proficiency with Redis for caching, queue management, and session handling
  • Experience with RabbitMQ for distributed task management and message queuing
  • Solid knowledge of AWS EC2 for deploying and managing crawling infrastructure
  • Proven experience implementing and managing residential and rotating proxy solutions to handle rate limiting and geo-restrictions
  • Understanding of anti-bot mechanisms and techniques to work within website terms of service

Core Competencies:

  • Ability to analyze website structures and develop efficient extraction strategies
  • Experience handling dynamic content, AJAX requests, and JavaScript-rendered pages
  • Strong debugging skills for troubleshooting scraping issues and proxy failures
  • Knowledge of data quality validation and cleaning techniques
  • Understanding of ethical scraping practices and robots.txt compliance

Responsibilities:

  • Design and implement scalable web crawling systems using Python-based frameworks
  • Develop and maintain distributed scraping pipelines using RabbitMQ for task distribution
  • Manage proxy rotation strategies to ensure uninterrupted data collection
  • Optimize crawler performance and resource utilization on AWS EC2 instances
  • Implement data storage solutions using PostgreSQL and caching layers with Redis
  • Monitor crawler health, handle errors, and implement retry mechanisms
  • Ensure data quality through validation and normalization processes
  • Collaborate with data engineering and analytics teams to meet data requirements
  • Stay updated on changes to target websites and adapt scrapers accordingly

Other Infrastructure Requirements:

Since this is a fully remote, work-from-home position, you will also need the following:

  • High-speed internet connectivity for video calls and efficient work.
  • Capable business-grade computer (e.g., modern processor, 8 GB+ of RAM).
  • Dedicated workspace (at home) for uninterrupted and efficient work.
  • Headphones with clear audio quality.
  • Stable power connection and backups in case of internet/power failure.
  • A Windows machine is preferred.

Nice to Have:

  • Experience with containerization (Docker) and orchestration tools
  • Knowledge of additional AWS services (S3, Lambda, SQS)
  • Familiarity with API development and reverse engineering
  • Experience with cloud-based scraping services or platforms
  • Understanding of legal and ethical considerations in web scraping

Apply Now