A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL
JoyCaption is an image captioning Visual Language Model (VLM) being built from the ground up as a free, open, and uncensored model for the community to use in training Diffusion models.
Code for "Aligning Linguistic Words and Visual Semantic Units for Image Captioning", ACM MM 2019
CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (Findings)
Audio Captioning datasets for PyTorch.
Fully-Convolutional Point Networks for Large-Scale Point Clouds
A Tennis dataset and models for event detection & commentary generation
Python code for handling the Clotho dataset.
A Base TensorFlow Project for Medical Report Generation
What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment [CVPR 2019]
Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation. CVPR 2023
[CVPR21] Visual Semantic Role Labeling for Video Understanding (https://arxiv.org/abs/2104.00990)
A Gradio-based image captioning tool that uses the GPT-4-Vision API to generate detailed descriptions of images.
Metrics for evaluating Automated Audio Captioning systems, designed for PyTorch.
A PyTorch implementation of the Attention on Attention module (both self and guided variants) for Visual Question Answering
Using LLMs and pre-trained caption models for super-human performance on image captioning.
Audio captioning baseline system for DCASE 2020 challenge.
[CVPR 2022] X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning
CaMEL: Mean Teacher Learning for Image Captioning. ICPR 2022