This is a simple web crawler that crawls links and parses the results pages. It is built on the [Scrapy](https://scrapy.org/) crawling engine. It uses Extruct to parse the application/ld+json content of each page to retrieve basic content, and XPath to query the page.
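As a rough illustration of the application/ld+json step, the sketch below pulls JSON-LD blocks out of a page using only the Python standard library. It is a stand-in for what Extruct does (Extruct itself is normally called through `extruct.extract`), not the crawler's actual code. The names `JsonLdParser`, `extract_jsonld` and the sample HTML are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        # Start buffering when a JSON-LD script tag opens.
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so collect them all.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        # Decode the buffered text once the script tag closes.
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.items.append(json.loads("".join(self._buffer)))


def extract_jsonld(html):
    """Return a list with one decoded object per JSON-LD block in `html`."""
    parser = JsonLdParser()
    parser.feed(html)
    parser.close()
    return parser.items


# Made-up page used only for this example.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@type": "Event", "name": "Story Time", "startDate": "2020-01-01"}
</script>
</head><body></body></html>
"""

events = extract_jsonld(SAMPLE_HTML)
print(events[0]["name"])
```

Fields that are not present in the JSON-LD block would then be picked up with XPath queries against the same page.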
To install the dependencies and run the crawler:
pip install -r requirements.txt
cd <directory>
scrapy crawl wizard
The MongoDB collection schema is as follows:
event_name
description
age_group
location
price
link
event_link
date
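The schema above could be mirrored in Python roughly as follows. This is only a sketch: the field names come from the list above, but the `EventRecord` class, the `str` types and every sample value are assumptions made for illustration.

```python
from dataclasses import dataclass, asdict


@dataclass
class EventRecord:
    # Field names follow the collection schema; the str types are assumed.
    event_name: str
    description: str
    age_group: str
    location: str
    price: str
    link: str
    event_link: str
    date: str


# Placeholder values, not real crawled data.
record = EventRecord(
    event_name="Story Time",
    description="Weekly reading hour",
    age_group="3-5",
    location="Central Library",
    price="Free",
    link="https://example.com/results",
    event_link="https://example.com/events/story-time",
    date="2020-01-01",
)

# asdict() yields a plain dict, which is the shape a MongoDB insert expects.
document = asdict(record)
print(document["event_name"])
```

Keeping the fields in one class makes it easier to notice when a crawled item is missing a value before it is written to the collection.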
The MongoDB database is mommy and the collection is crawl.
To view the crawled data, run the following commands in the mongo shell:
> use mommy
> db.crawl.find()
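The same lookup can also be done from Python. The sketch below assumes the third-party `pymongo` package is installed and that MongoDB is listening on the default local address; neither is stated in this README. `fetch_events` and `main` are hypothetical helper names.

```python
def fetch_events(client, limit=5):
    """Return up to `limit` documents from the crawl collection in the mommy database."""
    collection = client["mommy"]["crawl"]
    return list(collection.find().limit(limit))


def main():
    # Imported here so the helper above can be used without pymongo installed.
    from pymongo import MongoClient

    # Assumed connection string: a MongoDB server on the default local port.
    client = MongoClient("mongodb://localhost:27017")
    for doc in fetch_events(client):
        print(doc)
```

Calling `main()` with MongoDB running prints the first few crawled documents, much like `db.crawl.find()` does in the shell.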