Ponzi Classification
The folder ponzhi_scheme_detection contains all the files needed to generate the dataset and run the Ponzi scheme detection.
MAIN_DIR = ponzhi_scheme_detection/
DATA_DIR = ponzhi_scheme_detection/data/
TRANSACTIONS_DIR = ponzhi_scheme_detection/transactions/
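
The directory names above could be expressed in Python roughly as follows. This is only a sketch: the helper `ensure_dirs` is hypothetical and not part of the repository, and it assumes the paths are resolved relative to wherever the repository was checked out.

```python
from pathlib import Path

# Directory layout as listed above, relative to the checkout location.
MAIN_DIR = Path("ponzhi_scheme_detection")
DATA_DIR = MAIN_DIR / "data"
TRANSACTIONS_DIR = MAIN_DIR / "transactions"


def ensure_dirs(root: Path = Path(".")) -> list[Path]:
    """Create DATA_DIR and TRANSACTIONS_DIR under root if they are missing.

    Hypothetical helper for illustration only.
    """
    created = []
    for directory in (DATA_DIR, TRANSACTIONS_DIR):
        target = root / directory
        # exist_ok=True makes the call safe to repeat on later runs.
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created
```

Calling `ensure_dirs()` from the parent of ponzhi_scheme_detection would have the same effect as running mkdir by hand for the two subfolders.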
Steps to run the code:
cd ponzhi_scheme_detection
- Create TRANSACTIONS_DIR if it doesn't exist with mkdir transactions. This folder will store all the transactions of the Bitcoin addresses which are stored in the DATA_DIR
- The all_addresses.csv and ponzi_32.csv files are taken from https://github.com/bitcoinponzi/BitcoinPonziTool/tree/master/CSV
- Run python merge_addresses.py, located in MAIN_DIR, to generate merged_addresses.csv in the DATA_DIR. The merged_addresses file contains both the Ponzi and the non-Ponzi addresses. We will use these addresses to get their respective transactions using the block explorer API
- cd data_collection
- Run python save_transactions.py to generate all the transactions of the public addresses in JSON format in the TRANSACTIONS_DIR.
- Since running save_transactions.py takes a lot of time, it is best to change the MAX_DEGREE hyperparameter to download the transaction details of addresses with fewer transactions first. I have downloaded all the addresses with fewer than 25k transactions.
- save_transactions.py also generates a CSV in the DATA_DIR keeping track of the files that have been successfully downloaded. Using this transaction information, we will generate features to train machine learning models.
- cd ../feature_generation and run get_features.py to generate features.csv in the DATA_DIR
- cd ../ and run the notebook Features_EDA_and_data_transform.ipynb to do EDA and generate transformed features
- Run Classfication.ipynb to train models on the transformed features
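
The download step described in the list (fetching smaller addresses first, capped by MAX_DEGREE, and recording successes in a CSV) might look roughly like the sketch below. It is not the code in save_transactions.py. The function `download_pending`, the `fetch` callable, the log column names, and the reading of MAX_DEGREE as an exclusive upper bound on the transaction count are all assumptions made for illustration. The actual block explorer call is left to the caller, so no specific API is implied.

```python
import csv
import json
from pathlib import Path
from typing import Callable, Iterable

# Assumed meaning: only addresses with fewer transactions than this are fetched.
MAX_DEGREE = 25_000


def download_pending(
    addresses: Iterable[tuple[str, int]],
    fetch: Callable[[str], list],
    transactions_dir: Path,
    log_path: Path,
    max_degree: int = MAX_DEGREE,
) -> list[str]:
    """Save one JSON file per address, smallest first, and log each success.

    addresses: (address, transaction_count) pairs supplied by the caller.
    fetch:     hypothetical callable returning the transactions of an address,
               for example a wrapper around a block explorer API.
    """
    transactions_dir.mkdir(parents=True, exist_ok=True)

    # Addresses already present in the log are skipped, so a rerun resumes.
    done = set()
    if log_path.exists():
        with log_path.open(newline="") as f:
            done = {row["address"] for row in csv.DictReader(f)}

    needs_header = not log_path.exists() or log_path.stat().st_size == 0
    saved = []
    with log_path.open("a", newline="") as f:
        writer = csv.writer(f)
        if needs_header:
            writer.writerow(["address", "n_transactions"])
        # Sort by transaction count so the cheap downloads finish first.
        for address, degree in sorted(addresses, key=lambda pair: pair[1]):
            if degree >= max_degree or address in done:
                continue
            transactions = fetch(address)
            out_file = transactions_dir / f"{address}.json"
            out_file.write_text(json.dumps(transactions))
            writer.writerow([address, degree])
            f.flush()  # keep the log current if the run is interrupted
            saved.append(address)
    return saved
```

With this shape, raising `max_degree` on a later run picks up only the larger addresses that were skipped before, because everything already logged is ignored.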