An OOP-based project leveraging the PageRank algorithm to analyze and rank Key Opinion Leaders (KOLs) on Twitter based on their influence.

fuongfotfet/TwitterAnalysis-with-Pagerank

Twitter Analysis with PageRank Algorithm

1. Introduction

This project, Twitter Analysis with PageRank Algorithm, is built using Object-Oriented Programming (OOP) principles. The system leverages Selenium and the Twitter API to collect data on Key Opinion Leaders (KOLs) on Twitter, stores the data in a PostgreSQL database, and applies the PageRank Algorithm to analyze the influence of nodes within the graph.

Key Components

  1. Data Collection: Managed by the scraper package.
  2. Data Processing: The processor package imports data from the output directory into the database and computes PageRank.
  3. Results: PageRank scores are exported to the output/PageRankPoints directory.
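To make the PageRank step concrete, here is a minimal sketch of textbook power-iteration PageRank over a directed adjacency list. It is not the code from the processor package: the class name PageRankSketch, the damping factor of 0.85, and the fixed iteration count are assumptions chosen for illustration, and the project's own (and incremental) implementation may differ.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the repository.
final class PageRankSketch {

    /**
     * Power-iteration PageRank.
     * Assumption: every node appears as a key of `graph`, mapped to the list of
     * nodes it points to (an empty list for a node with no outgoing edges).
     */
    static Map<String, Double> compute(Map<String, List<String>> graph,
                                       double damping, int iterations) {
        int n = graph.size();
        Map<String, Double> rank = new HashMap<>();
        for (String node : graph.keySet()) {
            rank.put(node, 1.0 / n); // start from a uniform distribution
        }

        for (int it = 0; it < iterations; it++) {
            Map<String, Double> incoming = new HashMap<>();
            for (String node : graph.keySet()) {
                incoming.put(node, 0.0);
            }

            double danglingMass = 0.0; // rank held by nodes without out-edges
            for (Map.Entry<String, List<String>> e : graph.entrySet()) {
                double r = rank.get(e.getKey());
                List<String> targets = e.getValue();
                if (targets.isEmpty()) {
                    danglingMass += r;
                    continue;
                }
                double share = r / targets.size();
                for (String target : targets) {
                    incoming.merge(target, share, Double::sum);
                }
            }

            Map<String, Double> next = new HashMap<>();
            for (String node : graph.keySet()) {
                // Dangling mass is spread evenly so the scores keep summing to 1.
                double received = incoming.get(node) + danglingMass / n;
                next.put(node, (1 - damping) / n + damping * received);
            }
            rank = next;
        }
        return rank;
    }
}
```

With a graph such as A → B, B → A, C → A, calling PageRankSketch.compute(graph, 0.85, 50) gives A a higher score than C, because A is pointed to by two accounts and C by none.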

2. Usage

2.1. Environment Setup

  • Ensure Java, PostgreSQL, and Selenium WebDriver are installed on your system.
  • Clone the repository:
git clone https://github.com/username/TwitterAnalysis-with-Pagerank.git
cd TwitterAnalysis-with-Pagerank
  • Create an application.properties file in the root directory of the project with the following structure:
spring.datasource.url=jdbc:postgresql://localhost:5432/DatabaseName
spring.datasource.username=YourDatabaseUsername
spring.datasource.password=YourDatabasePassword

initialize_databasePath=src/main/java/processor/dataprocessing/sql/schema.sql
queriesPath=src/main/java/processor/dataprocessing/sql/queries.sql
directedSimpleGraphAdjListPath=output/AdjList/directedSimpleGraph.json
1-wayDirectedSimpleGraphAdjListPath=output/AdjList/1-wayDirectedSimpleGraph.json
PageRankOutputPath=output/PageRankPoints/pageRankPoints.json
IncrementalPageRankOutputPath=output/PageRankPoints/IncrementalPageRankPoints.json

twitter.username=YourTwitterUsername
twitter.password=YourTwitterPassword

oauth.ConsumerKey=YourConsumerKey
oauth.Consumer_Key_Secret=YourConsumerKeySecret
oauth.Access_Token=YourAccessToken
oauth.Access_Token_Secret=YourAccessTokenSecret
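As an illustration of how these keys might be consumed, the sketch below reads the file with java.util.Properties and fails fast when a required key is missing. This is an assumption about usage, not the project's code: the class ConfigSketch and its methods are hypothetical, and the project may instead rely on Spring Boot's own property loading.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of the repository.
final class ConfigSketch {

    /** Parses key=value pairs from any Reader (a file, a string, ...). */
    static Properties load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns the trimmed value of `key`, or throws if it is absent or empty. */
    static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing property: " + key);
        }
        return value.trim();
    }
}
```

Passing a reader over application.properties to ConfigSketch.load and then calling ConfigSketch.require(props, "PageRankOutputPath") would surface a missing or misspelled key at startup rather than partway through a run.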

2.2. Data Collection

  1. Navigate to the scraper directory:
cd src/main/java/scraper
  2. Run the Main.java file to start data collection:
javac Main.java
java Main
  3. The collected data will be saved in the output directory:
  • Graph adjacency lists: Located in output/AdjList.
  • Raw data: Located in output/Data.

[Figure: scraped data example]

2.3. Data Processing and PageRank Calculation

  1. Navigate to the processor directory:
cd src/main/java/processor
  2. Run the Main.java file to process data and calculate PageRank:
javac Main.java
java Main
  3. The Main.java file performs the following tasks:
  • Initializes the database using schema.sql.
  • Imports data from the output directory into the PostgreSQL database.
  • Computes PageRank scores and exports the results to output/PageRankPoints.

[Figure: PageRank scores]
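The export step could look roughly like the sketch below, which serializes a map of scores to a JSON object string. The layout of pageRankPoints.json is not documented here, so the flat {"handle": score} shape is an assumption, and ScoreExportSketch is a hypothetical name. The sketch avoids external libraries, escapes only quotes and backslashes, and expects finite scores.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the repository.
final class ScoreExportSketch {

    /** Renders scores as a flat JSON object with keys in alphabetical order. */
    static String toJson(Map<String, Double> scores) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        // TreeMap gives a stable key order, so repeated exports are easy to diff.
        for (Map.Entry<String, Double> e : new TreeMap<>(scores).entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append('"').append(escape(e.getKey())).append("\":").append(e.getValue());
        }
        return sb.append('}').toString();
    }

    // Minimal escaping; control characters are not handled in this sketch.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

The resulting string can then be written to the path configured under PageRankOutputPath, for example with java.nio.file.Files.write.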

2.4. Results

  • PageRank scores of graph nodes are saved in the output/PageRankPoints directory:
  • pageRankPoints.json: General PageRank results.
  • IncrementalPageRankPoints.json: Incremental PageRank results.
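Once the scores are loaded into memory, ranking KOLs amounts to sorting by score. The sketch below assumes the scores are available as a Map from account handle to score; the class TopKolsSketch and the alphabetical tie-break are illustrative choices, not something the project specifies.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the repository.
final class TopKolsSketch {

    /** Returns the k handles with the highest scores, best first. */
    static List<String> topK(Map<String, Double> scores, int k) {
        return scores.entrySet().stream()
                .sorted((a, b) -> {
                    // Higher score first; ties broken alphabetically by handle.
                    int byScore = Double.compare(b.getValue(), a.getValue());
                    return byScore != 0 ? byScore : a.getKey().compareTo(b.getKey());
                })
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

For scores of alice = 0.5, bob = 0.3, and carol = 0.2, topK(scores, 2) returns alice followed by bob.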

Visualization

3. Report

Detailed package design and the overall process are documented in the report file located in report/OOP_Report.pdf.


4. Technologies Used

  • Programming Language: Java
  • Framework: Spring Boot
  • Library: Selenium WebDriver
  • Database: PostgreSQL
  • Algorithm: PageRank

5. Contributing

If you’d like to contribute to this project:

  1. Fork the repository.
  2. Create a feature branch:
git checkout -b feature-branch
  3. Commit your changes:
git commit -m "Add new feature"
  4. Push the branch:
git push origin feature-branch
  5. Open a Pull Request.

For any additional queries, feel free to open an issue in the repository. 😊
