- Microsoft
- Redmond
- https://soham97.github.io
- @sohamdesh_
Stars
Unified automatic quality assessment for speech, music, and sound.
Code for the paper: MACE: Leveraging Audio for Evaluating Audio Captioning Systems
Audio Entailment: Deductive Reasoning for Audio Understanding
Awesome speech/audio LLMs, representation learning, and codec models
A simple library for Fréchet Audio Distance (FAD) calculation
PAM is a no-reference audio quality metric for audio generation tasks
Repository for "Training Audio Captioning Models without Audio"
Tracking the state of the art and recent results (bibliography) on sound tasks.
Web-crawl for "Audio Retrieval with WavText5K and CLAP Training"
Learning audio concepts from natural language supervision
Code repo for "Multi-Task Learning for Interpretable Weakly Labelled Sound Event Detection"
Speech enhancement, speech separation, and sound source localization
Self-Supervised Speech Pre-training and Representation Learning Toolkit
Reading list for research topics in Sound AI