Twitter Analysis with PageRank Algorithm

1. Introduction

This project, Twitter Analysis with PageRank Algorithm, is built using Object-Oriented Programming (OOP) principles. The system leverages Selenium and the Twitter API to collect data on Key Opinion Leaders (KOLs) on Twitter, stores the data in a PostgreSQL database, and applies the PageRank Algorithm to analyze the influence of nodes within the graph.

Key Components

Data Collection: Managed by the scraper package.
Data Processing: Import data from the output directory into the database and compute PageRank via the processor package.
Results: PageRank scores are exported to the target directory.

2. Usage

2.1. Environment Setup

Ensure Java, PostgreSQL, and Selenium WebDriver are installed on your system.
Clone the repository:

git clone /~https://github.com/username/TwitterAnalysis-with-Pagerank.git
cd TwitterAnalysis-with-Pagerank

Create an application.properties file in the root directory of the project with the following structure:

spring.datasource.url=jdbc:postgresql://localhost:5432/DatabaseName
spring.datasource.username=YourDatabaseUsername
spring.datasource.password=YourDatabasePassword

initialize_databasePath=src/main/java/processor/dataprocessing/sql/schema.sql
queriesPath=src/main/java/processor/dataprocessing/sql/queries.sql
directedSimpleGraphAdjListPath=output/AdjList/directedSimpleGraph.json
1-wayDirectedSimpleGraphAdjListPath=output/AdjList/1-wayDirectedSimpleGraph.json
PageRankOutputPath=output/PageRankPoints/pageRankPoints.json
IncrementalPageRankOutputPath=output/PageRankPoints/IncrementalPageRankPoints.json

twitter.username=YourTwitterUsername
twitter.password=YourTwitterPassword

oauth.ConsumerKey=YourConsumerKey
oauth.Consumer_Key_Secret=YourConsumerKeySecret
oauth.Access_Token=YourAccessToken
oauth.Access_Token_Secret=YourAccessTokenSecret

2.2. Data Collection

Navigate to the scraper directory:

cd src/main/java/scraper

Run the Main.java file to start data collection:

javac Main.java
java Main

The collected data will be saved in the output directory:

Graph adjacency lists: Located in output/AdjList.
Raw data: Located in output/Data.

2.3. Data Processing and PageRank Calculation

Navigate to the processor directory:

cd src/main/java/processor

Run the Main.java file to process data and calculate PageRank:

javac Main.java
java Main

The Main.java file performs the following tasks:

Initializes the database using schema.sql.
Imports data from the output directory into the PostgreSQL database.
Computes PageRank scores and exports the results to output/PageRankPoints.

2.4. Results

PageRank scores of graph nodes are saved in the output/PageRankPoints directory:
pageRankPoints.json: General PageRank results.
IncrementalPageRankPoints.json: Incremental PageRank results.

3. Report

Detailed package design and the overall process are documented in the report file located in report/OOP_Report.pdf.

4. Technologies Used

Programming Language: Java
Framework: Spring Boot
Library: Selenium WebDriver
Database: PostgreSQL
Algorithm: PageRank

5. Contributing

If you’d like to contribute to this project:

Fork the repository.
Create a feature branch:

git checkout -b feature-branch

Commit your changes:

git commit -m “Add new feature”

Push the branch:

git push origin feature-branch

Open a Pull Request.

For any additional queries, feel free to open an issue in the repository. 😊

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
Twitter_Pagerank		Twitter_Pagerank
report		report
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Twitter Analysis with PageRank Algorithm

1. Introduction

Key Components

2. Usage

2.1. Environment Setup

2.2. Data Collection

2.3. Data Processing and PageRank Calculation

2.4. Results

3. Report

4. Technologies Used

5. Contributing

About

Releases

Packages

Languages

fuongfotfet/TwitterAnalysis-with-Pagerank

Folders and files

Latest commit

History

Repository files navigation

Twitter Analysis with PageRank Algorithm

1. Introduction

Key Components

2. Usage

2.1. Environment Setup

2.2. Data Collection

2.3. Data Processing and PageRank Calculation

2.4. Results

3. Report

4. Technologies Used

5. Contributing

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages