Phase 1: Lexicon-Bottlenecked Pre-training

Pre-Training

mkdir -p "/path/to/save/ckpts"

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python run.py with data_root="/path/to/data" \
    num_gpus=8 num_nodes=1 task_Text_MAE_Contrastive_train per_gpu_batchsize=100 \
    beit16_base224 text_bert image_size=224 vit_randaug batch_size=800 \
    log_dir="/path/to/save/ckpts" precision=16 max_epoch=20 learning_rate=5e-5
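The batch-size flags in the command above are consistent with each other: 100 samples per GPU on 8 GPUs across 1 node gives 800, which matches batch_size. The sketch below checks that arithmetic for other hardware setups. It assumes that run.py makes up any shortfall with gradient accumulation, which is a common convention in training scripts of this style but is not confirmed by this README. The function name grad_accumulation_steps is hypothetical.

```python
def grad_accumulation_steps(batch_size, per_gpu_batchsize, num_gpus, num_nodes=1):
    """Return how many micro-batches would be accumulated per optimizer step.

    Assumption: effective batch = per_gpu_batchsize * num_gpus * num_nodes * steps.
    """
    per_step = per_gpu_batchsize * num_gpus * num_nodes
    if per_step <= 0:
        raise ValueError("per_gpu_batchsize, num_gpus and num_nodes must be positive")
    # Integer division, floored at 1 so that a step is always taken.
    return max(batch_size // per_step, 1)


# Values from the command above: 100 * 8 * 1 == 800, so no accumulation is needed.
steps_8_gpus = grad_accumulation_steps(800, 100, 8, 1)
# The same target batch size on 4 GPUs would need two micro-batches per step.
steps_4_gpus = grad_accumulation_steps(800, 100, 4, 1)
print(steps_8_gpus, steps_4_gpus)
```

If you train on fewer GPUs, lower num_gpus (and CUDA_VISIBLE_DEVICES) and check that the resulting number of steps is what you expect before launching a long run.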

In src/config.py, you can choose which datasets are used for training:

datasets = ["f30k", "gcc", "sbu", "coco"]

In src/datasets/base_dataset.py, you can limit how many rows are loaded from each dataset file by changing nrows:

df = pd.read_csv(f"{self.data_dir}/{input_filename}", sep=sep, nrows=100000)
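As a minimal sketch of what nrows does, the example below reads an in-memory tab-separated file with and without a row cap. The column names, the sample rows, and the helper load_captions are made up for illustration; the actual file layout used by base_dataset.py may differ.

```python
import io

import pandas as pd

# Hypothetical caption file: one image path and one caption per row.
SAMPLE = (
    "filepath\ttitle\n"
    "img0.jpg\ta dog\n"
    "img1.jpg\ta cat\n"
    "img2.jpg\ta bird\n"
)


def load_captions(source, sep="\t", max_rows=None):
    """Read at most max_rows data rows; None reads the whole file."""
    return pd.read_csv(source, sep=sep, nrows=max_rows)


subset = load_captions(io.StringIO(SAMPLE), max_rows=2)  # first two rows only
full = load_captions(io.StringIO(SAMPLE))                # no cap
print(len(subset), len(full))
```

Note that nrows keeps the first rows of the file rather than a random sample, so a small value may not be representative if the file is sorted.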
