Forage AI Careers!

Software Engineer

Technology

Full Time

Remote

Experience

3–4 years as a Software Engineer.

Qualification

Bachelor’s degree in any field

Offered Salary

Based on experience

Posted On

14 February, 2026

Valid Till

14 March, 2026

About Forage AI

Forage AI builds next-generation systems for large-scale data collection and processing — including web crawling, document parsing, enrichment pipelines, and automation.
We primarily work in Python, design cloud-native systems (AWS-first, with exposure to GCP/Azure), and increasingly integrate GenAI and agent-based workflows into our stack.

Our engineering culture emphasizes ownership, clarity, and reliability. Every developer owns their module end-to-end and collaborates closely in a high-trust, high-impact environment.

Role Overview

This role combines advanced web scraping with GenAI/LLM-driven capabilities to build intelligent, scalable data collection systems and pipelines. You will design and operate generic, reusable scrapers and use GenAI and AI agents to enhance extraction, enrichment, validation, and automation workflows.

Beyond building crawlers, you will use GenAI to help create stable, production-grade data products and services aligned with real-world business use cases, transforming raw web and document data into reliable, market-ready solutions. The work sits at the intersection of deep technical problem-solving and market orientation in a fast-moving engineering environment.

The role is highly hands-on and emphasizes clean design, reliability, scalability, and impact, with strong ownership from design through deployment and operation.

Key Responsibilities

  • Develop and maintain Python-based systems for large-scale crawling, parsing, enrichment, and processing of structured and unstructured data.
  • Build generic, reusable crawlers capable of extracting data from thousands of websites and documents using shared, configurable codebases.
  • Design and implement GenAI-assisted data extraction and enrichment workflows, including:
    – Using LLMs to interpret semi-structured or unstructured content (HTML, PDFs, text-heavy pages).
    – Applying prompt-driven logic for classification, normalization, entity extraction, and validation.
  • Implement RAG (Retrieval-Augmented Generation) patterns by combining crawled data with vector databases to improve accuracy, consistency, and explainability.
  • Integrate AI agents into data pipelines to autonomously navigate websites, understand page context, select the correct interaction paths, and extract high-value data from dynamic or evolving layouts.
  • Handle complex anti-crawling challenges, including IP rotation, retries, throttling, headers, fingerprinting, and bot-detection mechanisms.
  • Derive common patterns from semi-structured data, build resilient parsing logic, and gracefully manage edge cases and failures.
  • Build and operate end-to-end automated pipelines (crawl → process → enrich → validate → store → deliver), including AI-powered enrichment stages.
  • Design and maintain ETL/ELT workflows with strong validation, monitoring, error-handling, and auditability.
  • Work with SQL, NoSQL, and vector databases, contributing to data modeling, storage, and retrieval strategies.
  • Implement and consume APIs and microservices, including services that expose AI-powered enrichment or extraction capabilities.
  • Contribute to cloud-native system design on AWS (S3, Lambda, ECS/EKS, SQS/SNS, RDS/DynamoDB, CloudWatch).
  • Own live execution of crawlers and pipelines, managing turnaround times, exceptions, QA checks, and delivery SLAs.
  • Write unit and integration tests, debug production issues, profile performance, and participate in code reviews.
  • Implement observability (logging, metrics, tracing) and follow security best practices (secrets management, IAM, least privilege).
  • Collaborate closely with Dev, QA, and Ops teams; ship incrementally using small PRs, design docs, and measurable outcomes.

Required Qualifications

  • 3–4 years of professional experience as a Software Engineer.
  • Strong proficiency in Python, with a solid understanding of data structures, algorithms, and clean software design.
  • Hands-on experience with web crawling and scraping, including:
    – Requests, Scrapy, BeautifulSoup (BS4), Pandas, Urllib
    – Selenium / Playwright (or similar browser automation tools)
  • Proven experience scraping large-scale or complex websites, including social media platforms.
  • Strong understanding of anti-bot measures and resilient crawling strategies.
  • Working knowledge of SQL and experience with at least one RDBMS (PostgreSQL, SQL Server, etc.).
  • Exposure to AWS services and cloud-native concepts.
  • Comfortable working on Linux and using Git for version control.
  • Practical understanding of system design and distributed systems basics.

Preferred / Good to Have (Prioritized)

  • Containers & CI/CD:
    – Docker, GitHub Actions / Jenkins
    – Basic exposure to Kubernetes
  • Data Infrastructure:
    – Airflow, Spark, Kafka, or large-scale ETL systems
  • Infrastructure as Code:
    – Terraform or CloudFormation
    – Basic cloud cost and performance optimization
  • Frontend / JavaScript:
    – Basic familiarity is a nice-to-have
  • Exposure to GCP or Azure

How We Work

  • Strong ownership: design → build → deploy → operate.
  • Pragmatic engineering with small PRs and incremental delivery.
  • Emphasis on clear communication, documentation, and reliability.
  • Engineering decisions guided by scale, cost, and long-term maintainability.

Work-from-Home Requirements

  • Reliable high-speed internet for calls and collaboration.
  • A capable computer (modern CPU, 8 GB+ RAM).
  • Headphones with clear audio quality.
  • Stable power and backup arrangements.

Forage AI is an equal-opportunity employer.
We value curiosity, craftsmanship, and collaboration, and we look for engineers who enjoy solving hard problems at scale.
