Influencers Classification Project

Introduction

This project aims to classify social media influencers into different categories based on their posts. The goal is to provide a tool for identifying key influencers in specific areas for marketing and collaboration purposes. It is part of our end-of-year project at INSAT.

Project Description

Influencer marketing has become a critical strategy for brands to reach their target audiences. However, finding the right influencers who align with a brand’s values and target demographics can be challenging. This project leverages machine learning techniques to classify influencers into distinct categories to aid marketers in their decision-making process.

The project involves several steps:

Data Collection: Gathering data from Instagram using APIs and web scraping tools such as Selenium.
Data Processing and Analysis:
1. Data Preprocessing: Cleaning and normalizing the data to ensure consistency and accuracy.
2. Data Exploration: Conducting statistical analyses on the dataset to uncover insights such as the distribution of post languages, top categories of content, and other relevant statistics.
Modeling and Evaluation:
1. Model Training: Using NLP and Computer Vision algorithms to train models on the extracted features.
2. Model Fusion: Combining the outputs of two different models to improve classification accuracy.
3. Model Evaluation: Evaluating the performance of the models using metrics such as accuracy, precision, recall, and F1-score.
4. Visualization: Creating visualizations to represent the results and insights derived from the analysis.
RAG and GEMMA Insights: Utilizing RAG (Retrieval-Augmented Generation) and GEMMA to create a chatbot that can answer questions about Tunisian influencers. This chatbot leverages the data and insights gained from the project to provide meaningful responses to user queries.
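
The Model Fusion step above combines the outputs of two models. The README does not say how the fusion is done, so the following is only a minimal sketch of one common approach (late fusion by weighted averaging of class probabilities), assuming one model scores post text and the other scores post images. The function names, the category labels, and the `text_weight` parameter are hypothetical and are not taken from the repository.

```python
from typing import Dict


def fuse_probabilities(
    text_probs: Dict[str, float],
    image_probs: Dict[str, float],
    text_weight: float = 0.5,
) -> Dict[str, float]:
    """Weighted average of two per-category probability distributions.

    Assumption: both models score the same set of categories.
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be between 0 and 1")
    if text_probs.keys() != image_probs.keys():
        raise ValueError("both models must score the same categories")

    fused = {
        category: text_weight * text_probs[category]
        + (1.0 - text_weight) * image_probs[category]
        for category in text_probs
    }
    total = sum(fused.values())
    if total == 0:
        raise ValueError("fused scores sum to zero")
    # Renormalize so the fused scores sum to 1 again.
    return {category: score / total for category, score in fused.items()}


def predict_category(fused_probs: Dict[str, float]) -> str:
    """Return the category with the highest fused probability."""
    return max(fused_probs, key=fused_probs.get)


if __name__ == "__main__":
    # Illustrative scores only, not results from the project.
    text_scores = {"fashion": 0.6, "food": 0.3, "travel": 0.1}
    image_scores = {"fashion": 0.2, "food": 0.7, "travel": 0.1}
    fused = fuse_probabilities(text_scores, image_scores)
    print(predict_category(fused), fused)
```

In a sketch like this, `text_weight` would typically be tuned on a validation set. Setting it to 1.0 or 0.0 falls back to a single model, which makes it easy to check whether fusion helps at all.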

Features

Automated Data Collection: Scripts to gather data from social media platforms.
Data Cleaning and Preprocessing: Tools to clean and prepare the data.
Data Exploration: Statistical analysis to understand the characteristics of the dataset.
Machine Learning Models: Implementation of various classification algorithms.
Model Fusion: Technique to combine the outputs of two models for better accuracy.
Performance Evaluation: Metrics to evaluate the effectiveness of the models.
Result Visualization: Tools to visualize the classification results.
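
As an illustration of the Performance Evaluation feature, the sketch below computes accuracy and macro-averaged precision, recall, and F1-score for a multi-class classifier using only the Python standard library. The README does not state which averaging the project uses, so macro averaging is an assumption here, and the function name and labels are hypothetical.

```python
from typing import Dict, List


def classification_metrics(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall, and F1-score."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and the same length")

    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1_scores = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # A metric with an empty denominator is reported as 0.0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1_scores) / n,
    }


if __name__ == "__main__":
    # Toy labels for illustration only.
    true_labels = ["fashion", "fashion", "food", "food"]
    predicted = ["fashion", "food", "food", "food"]
    print(classification_metrics(true_labels, predicted))
```

Macro averaging weights every category equally, which matters when some influencer categories are much rarer than others. A weighted or micro average would be a reasonable alternative.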

Dataset

The dataset used in this project covers around 1,000 Instagram influencers. It includes each influencer's followers, followees, content types, bio, and posts.
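
The actual schema of the dataset is not documented here, so the following is a hedged sketch of how one record could be represented and how a simple exploration statistic (the share of each content type) might be computed. The class name, every field name, and the sample values are assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InfluencerRecord:
    """Hypothetical record layout; field names are not from the repository."""
    username: str
    followers: int
    followees: int
    bio: str
    content_types: List[str] = field(default_factory=list)
    posts: List[str] = field(default_factory=list)


def content_type_distribution(records: List[InfluencerRecord]) -> Dict[str, float]:
    """Share of each content type across all records, most common first."""
    counts = Counter(ct for record in records for ct in record.content_types)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ct: n / total for ct, n in counts.most_common()}


if __name__ == "__main__":
    # Made-up placeholder records, not real data.
    sample = [
        InfluencerRecord("user_a", 12000, 300, "Food lover", ["food", "travel"]),
        InfluencerRecord("user_b", 54000, 150, "Style tips", ["fashion"]),
        InfluencerRecord("user_c", 8000, 900, "Recipes", ["food"]),
    ]
    print(content_type_distribution(sample))
```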

Project Report

The full project report can be found here

Installation

To get started with this project, clone the repository and install the required dependencies.

git clone https://github.com/nessmahm/DeepLearning-For-Influencer-Classification
cd DeepLearning-For-Influencer-Classification
