Google Maps Scraper

  • This project is a Google Maps scraper built with Python and Playwright. It consists of two main parts: the scraper and the server.
  • After a search query is entered, it scrolls to the end of the results list and scrapes every listing shown there.

Features

  • Server: Provides two API endpoints:
    • /status: Check the status of the application.
    • /request: Accepts a JSON payload to enqueue search queries.
  • Scraper: Listens to a Redis queue for search queries, opens multiple tabs with different browsers and user agents, and scrapes Google Maps.

API Details

/request Endpoint

Accepts a JSON payload with the following structure:

{
  "city": "",
  "listing_category": "",
  "listing_type": "",
  "province": "",
  "verb": ""
}

The scraper combines these fields (listing_type + verb + city + province) to create a search query and enqueues it for processing. All of the fields are used when creating the Excel file that stores the search results.
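As a minimal sketch of how the four fields might be combined: the function name `build_search_query`, the single-space separator, and the skipping of blank fields are all assumptions, and the sample values are made up for illustration. The project may format the query differently.

```python
def build_search_query(payload: dict) -> str:
    """Join listing_type, verb, city and province into one search string.

    Assumption: fields are separated by single spaces and blank fields are
    dropped. The real scraper may format the query differently.
    """
    order = ("listing_type", "verb", "city", "province")
    parts = [str(payload.get(key, "")).strip() for key in order]
    return " ".join(part for part in parts if part)


# Illustrative payload; the values are invented for this example.
example = {
    "city": "Toronto",
    "listing_category": "food",
    "listing_type": "restaurants",
    "province": "Ontario",
    "verb": "in",
}
print(build_search_query(example))  # restaurants in Toronto Ontario
```

Note that listing_category does not appear in the query string; according to the description above it is only used alongside the other fields for the Excel output.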

/status Endpoint

Returns the current status of the server.
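A client for the two endpoints could look roughly like the sketch below. The base URL, the use of POST for /request and GET for /status, and the helper names are assumptions; the README does not state the host, port, or HTTP methods. The helpers only build the requests, so nothing is sent until you pass one to `urllib.request.urlopen`.

```python
import json
import urllib.request

# Assumption: the README does not state the host or port.
BASE_URL = "http://localhost:8000"


def make_enqueue_request(payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST to /request carrying the JSON payload."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/request",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the endpoint accepts POST
    )


def make_status_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a GET to /status."""
    return urllib.request.Request(url=f"{base_url}/status", method="GET")
```

To actually send a request once the server is running, pass the returned object to `urllib.request.urlopen(...)` and read the response body.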

How It Works

  1. The server receives requests via the /request endpoint and enqueues the search queries into a Redis queue.
  2. The scraper listens to the Redis queue, dequeues search queries, and opens multiple tabs with different browser instances and user agents.
  3. The scraper runs each search query on Google Maps, scrolls to the end of the results list, and scrapes the listings.
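The hand-off between server and scraper in the steps above can be sketched as a simple producer and consumer. This assumes the queue is a Redis list used in first-in, first-out order with `rpush` and `blpop`; the key name `search_queries` and the function names are hypothetical. To keep the example runnable without a Redis server, it uses a small in-memory stand-in that mimics only those two calls.

```python
import json


class InMemoryQueue:
    """Stand-in for a Redis list so the sketch runs without a Redis server.

    It mimics only the two redis-py calls used below: rpush and blpop.
    """

    def __init__(self):
        self._lists = {}

    def rpush(self, name, *values):
        self._lists.setdefault(name, []).extend(values)
        return len(self._lists[name])

    def blpop(self, keys, timeout=0):
        # Unlike Redis, this never blocks: it returns None when the list is empty.
        items = self._lists.get(keys, [])
        if not items:
            return None
        return keys, items.pop(0)


QUEUE_NAME = "search_queries"  # hypothetical key name


def enqueue(client, payload: dict) -> None:
    """Server side: serialize the payload and append it to the queue."""
    client.rpush(QUEUE_NAME, json.dumps(payload))


def dequeue(client, timeout: int = 1):
    """Scraper side: pop the oldest payload, or return None if none arrived."""
    item = client.blpop(QUEUE_NAME, timeout=timeout)
    if item is None:
        return None
    _key, raw = item
    return json.loads(raw)


client = InMemoryQueue()
enqueue(client, {"listing_type": "cafes", "verb": "in", "city": "A", "province": "B"})
print(dequeue(client))  # the payload that was just enqueued
print(dequeue(client))  # None, because the queue is now empty
```

With a real server, a `redis.Redis()` client from redis-py should be usable in place of `InMemoryQueue`, since it exposes `rpush` and `blpop` with the same call shape. Whether the project uses a list in exactly this way is not confirmed by the README.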

Requirements

  • Python 3.12
  • Poetry for dependency management
  • Redis server
  • Playwright

Setup and Installation

  1. Clone the repository:
   git clone git@github.com:AmirEspahbodi/google-map-scraper.git
   cd google-map-scraper
  2. Install dependencies:
   poetry install
  3. Install Playwright browsers:
   poetry run playwright install
   poetry run playwright install-deps
  4. Start the application:
   poetry run python run_app.py