Forage AI Careers!

Junior Python Developer

Technology

Full Time

Remote

Experience

Some experience

Qualification

Bachelor’s degree in Computer Science/Information Technology engineering is preferred

Offered Salary

Based on experience

Posted On

31 May, 2024

Valid Till

31 July, 2024

In this role, you’ll be working with an amazingly passionate and talented team of engineers and data scientists who are working at the bleeding edge of data science and data automation.

Who are we?

As the age of AI dawns, nothing is more crucial than the underlying data that provides the basis for intelligent and powerful software. With this in mind, Forage AI was born. We are a data automation and data science pioneer built to help democratize data. Our mission is to create powerful data assets on the fly, and we believe that by accomplishing this over and over again, in different industries and varied use cases, we can achieve incredible feats. Our suite of services includes the extraction of unstructured data from websites and documents, with a particular emphasis on extremely broad, generic, and wide-scale data collection. We then process and structure this data using best-in-class ML/NLP approaches, supported by passionate and deeply committed research teams, to create top-tier datasets. Our core belief in the value of data perfection drives an extreme commitment to data precision and accuracy, which helps us stand apart from the rest and leads to remarkable outcomes.

Here’s what you’ll do:

Our web crawling team is unique in the industry. While we have many “single-site” crawlers, our core proposition and technical efforts are geared towards building “generic” bots that can crawl and parse data from thousands of websites and documents, all using the same code. This requires a whole different level of thinking, planning, and coding. You will:

  • Build, improve, and run our generic robots to extract data from both the web and documents, handling critical information across a wide variety of structures and formats without error.
  • Craft highly scalable solutions to revolutionize our web crawling strategies.
  • Derive common patterns from semi-structured data, build code to handle them, and deal with exceptions as well.
  • Be responsible for the live execution of our robots, managing turnaround times, exceptions, QA, and delivery, and building a bleeding-edge infrastructure to handle volume and scope.
  • Be responsible for end-to-end project automation using Python.

Requirements:

  • Bachelor’s degree in Computer Science/Information Technology Engineering is preferred.
  • 2-3 years of experience in web crawling using Python.
  • Must have expertise in scraping social media websites and a strong understanding of overcoming complex anti-crawling measures.
  • Must have hands-on experience with Python libraries like Requests, Scrapy, Pandas, urllib, or BeautifulSoup (BS4).
  • Experience with API development would be an added advantage.
  • Must have experience working with at least one standard RDBMS (PostgreSQL, SQL Server, etc.).
  • Must have knowledge of and exposure to AWS, Docker, and Lambda.
  • Must have created and handled fully automated end-to-end project pipelines using Python.
  • Experience with web-based automation tools (Selenium, Puppeteer, Mechanize, Render) would be an added advantage.

Other Infrastructure Requirements:

Since this is a completely work-from-home position, you will also need the following:

  • High-speed internet connectivity for video calls and efficient work.
  • Capable business-grade computer (e.g., modern processor, 8 GB+ of RAM, and no other obstacles to uninterrupted, efficient work).
  • Headphones with clear audio quality.
  • Stable power connection and backups in case of internet/power failure.
