Speech synthesis (TTS) in low-resource languages by training FastPitch from scratch and fine-tuning HiFi-GAN
Updated Dec 5, 2023 - Python
Automatic transcriber built with the NVIDIA NeMo toolkit. It transcribes speech to text in real time from any audio source. Running it on a local machine requires a CUDA-capable GPU; when set up with virtual audio cables, it can transcribe audio as it is played, with no other requirements.
The simplest and most comprehensible tutorial on speaker identification with NVIDIA's `NeMo`.
Implementation of a Kazakh speech-to-text model using the NVIDIA NeMo toolkit for efficient transcription of spoken Kazakh into text.
Module for Russian speech recognition using NVIDIA NeMo.
Audio profanity detector desktop app developed with PyQt5 and NVIDIA NeMo.
PodcastProject Analytics Toolkit - A project that generates analytics from various input data. The exported data is intended for use on a PodcastProject website.