Crawl, scrape and persist Mobile.de car listings data in a smart & responsible way

robertciotoiu/mobile-de-car-data-collector

📊⛏ Mobile.de Car Data Scraper

Mobile.de Car Data Scraper is a responsible and ethical data scraping project that retrieves car listing data from Mobile.de. This project enforces delays between requests to avoid overloading the website's servers.
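One way to enforce a delay between requests is sketched below. This is an illustration only, not the project's actual implementation: the class name RequestThrottle, the method acquire() and the chosen delay are all hypothetical. The idea is that every request first passes through a shared throttle, which blocks the caller until a minimum interval has passed since the previous request.

```java
import java.time.Duration;

/**
 * Hypothetical helper (not taken from the project): blocks callers so that
 * consecutive requests start at least minDelay apart.
 */
final class RequestThrottle {
    private final long minDelayNanos;
    private long nextAllowedNanos; // earliest time the next request may start

    RequestThrottle(Duration minDelay) {
        this.minDelayNanos = minDelay.toNanos();
        this.nextAllowedNanos = System.nanoTime(); // first request may start immediately
    }

    /** Waits until the minimum delay has elapsed, then reserves the next slot. */
    synchronized void acquire() {
        long waitNanos = nextAllowedNanos - System.nanoTime();
        if (waitNanos > 0) {
            try {
                Thread.sleep(waitNanos / 1_000_000, (int) (waitNanos % 1_000_000));
            } catch (InterruptedException e) {
                // Restore the interrupt flag so callers can react to it.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for next request slot", e);
            }
        }
        nextAllowedNanos = System.nanoTime() + minDelayNanos;
    }
}
```

A crawler would call acquire() immediately before each HTTP request, for example with new RequestThrottle(Duration.ofSeconds(2)). Because acquire() is synchronized, several crawler threads can share one throttle and still respect the same minimum spacing.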

The project is written in Java 19 and makes use of the following technologies:

  • Spring Boot
  • Maven
  • Log4j2
  • JUnit 5
  • Docker and Kubernetes

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

You will need to have the following software installed on your machine:

  • Java 19
  • Docker and Kubernetes (optional, only if you want to deploy the project in a containerized environment)

Running the application

  1. Clone the repository

git clone https://github.com/robertciotoiu/mobile-de-car-data-collector.git

  2. Set a valid localPath that points to a location on your disk where MongoDB will persist its data

  3. Navigate to the project's infrastructure directory

cd mobile-de-car-data-collector/infrastructure

  4. Build and push the Docker images, then deploy all pods to a K8s namespace

./deploy.sh

The script builds the Docker images and pushes them to the local Docker image registry. It then creates a namespace named "rc" (this can be changed) along with the K8s resources. Each pod automatically starts to crawl, parse and save car listing data into a MongoDB instance that runs in its own pod but persists its data locally on disk.

Running the tests

To run the JUnit tests, execute the following command in the project directory:

mvn test

Deployment

To deploy the project in a containerized environment, use Docker and Kubernetes through the deploy.sh script in the infrastructure directory, as described under "Running the application".

Built With

  • Java 19
  • Spring Boot
  • Maven
  • Log4j2
  • JUnit 5
  • Docker and Kubernetes

Contributing

Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests.

Authors

See also the list of contributors who participated in this project.

License

This project is licensed under the MIT License - see the LICENSE.md file for details.