Forage AI Careers!
Software Engineer
Technology
Full Time
Remote
About Forage AI
Forage AI builds next‑generation systems for data collection and processing: large‑scale web crawling, document parsing, data pipelines, and automation. We work primarily in Python, use cloud‑native designs (mainly AWS, with some GCP/Azure), and increasingly apply GenAI and AI agents across our stack. Every developer owns their module and collaborates closely with peers in a high‑ownership, high‑trust environment.
Role Overview:
You will design, build, and operate software for data collection and processing at scale. The role is hands‑on, with emphasis on clean design, reliability, and performance.
Key Responsibilities:
- Develop and maintain Python applications for crawling, parsing, enrichment, and processing of large datasets.
- Build and operate data workflows (ETL/ELT), including validation, monitoring, and error‑handling.
- Work with SQL and NoSQL (plus vector databases/data lakes) for modeling, storage, and retrieval.
- Contribute to system design using cloud‑native components on AWS (e.g., S3, Lambda, ECS/EKS, SQS/SNS, RDS/DynamoDB, CloudWatch).
- Implement and consume APIs/microservices; write clear contracts and documentation.
- Write unit/integration tests, perform debugging and profiling; contribute to code reviews and maintain high code quality.
- Implement observability (logging/metrics/tracing) and basic security practices (secrets, IAM, least privilege).
- Collaborate with Dev/QA/Ops; ship incrementally using PRs and design docs.
Required Qualifications
- 2–4 years of professional software engineering experience.
- Strong proficiency in Python; good knowledge of data structures/algorithms and software design principles.
- Hands‑on experience with SQL and at least one NoSQL store; familiarity with vector databases is a plus.
- Experience with web scraping frameworks (e.g., Scrapy, Selenium/Playwright, BeautifulSoup) and resilient crawling patterns (robots.txt compliance, rotation, retries).
- Practical understanding of system design and distributed systems basics.
- Exposure to AWS services and cloud‑native design; comfortable on Linux and with Git.
Preferred / Good to Have (Prioritized)
- GenAI & LLMs: experience with LangChain, CrewAI, LlamaIndex, prompt design, RAG patterns, and vector stores. (Candidates with this experience will be prioritized.)
- CI/CD & Containers: exposure to pipelines (GitHub Actions/Jenkins), Docker, and Kubernetes.
- Data Pipelines/Big Data: ETL/ELT, Airflow, Spark, Kafka, or similar.
- Infra as Code: Terraform/CloudFormation; basic cost‑ and performance‑optimization on cloud.
- Frontend/JS: not required; basic JS or frontend skills are a nice‑to‑have only.
- Exposure to GCP/Azure.
How We Work
- Ownership of modules end‑to‑end (design → build → deploy → operate).
- Clear communication, collaborative problem‑solving, and documentation.
- Pragmatic engineering: small PRs, incremental delivery, and measurable reliability.
Work‑from‑Home Requirements
- High‑speed internet for calls and collaboration.
- A capable, reliable computer (modern CPU, 8GB+ RAM).
- Headphones with clear audio quality.
- Stable power and backup arrangements.
Forage AI is an equal‑opportunity employer. We value curiosity, craftsmanship, and collaboration.