mimicking - English Pronunciation Improvement App

This is a simple Python application powered by AI and designed to help you improve your English pronunciation by mimicking native speakers. The app runs locally, so your data never leaves your machine.

The core feature is a pronunciation comparison between your speech and a reference pronunciation, taken from either an AI assistant or a source audio file. Similarity is scored with the Word Error Rate (WER): the lower the WER, the closer your pronunciation is to the reference.
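As a rough illustration of the metric, WER is conventionally the word-level edit distance (substitutions, deletions, and insertions) between a transcript and the reference, divided by the number of reference words. The sketch below is a minimal, dependency-free version. The function name `word_error_rate` and the lowercase/whitespace normalization are assumptions made for illustration, not the app's actual implementation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra word spoken)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("how are you today", "how are you")` gives 0.25: one deletion out of four reference words. Because insertions count as errors, the value can exceed 1.0 (100%) when the transcript is much longer than the reference.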

WER Scale

Range	Quality	Description
0 - 20%	Excellent	Very close to native pronunciation.
20 - 40%	Good	Some minor mispronunciations, but generally understandable.
40 - 60%	Fair	Noticeable errors; pronunciation may sound foreign but is mostly understandable.
60% and above	Poor	Significant pronunciation issues, making it difficult to understand.
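The scale above can be expressed as a small lookup. This is only a sketch: the function name `wer_quality` is hypothetical, and because the listed ranges share their endpoints (20%, 40%), it assumes each upper bound is exclusive, so exactly 20% counts as "Good".

```python
def wer_quality(wer_percent: float) -> str:
    """Map a WER percentage to the quality label from the WER scale."""
    if wer_percent < 0:
        raise ValueError("WER cannot be negative")
    # Assumption: upper bounds are exclusive, so 20.0 falls into "Good".
    if wer_percent < 20:
        return "Excellent"
    if wer_percent < 40:
        return "Good"
    if wer_percent < 60:
        return "Fair"
    return "Poor"
```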

Features

Transcription: Wav2Vec2 model.
Text-to-Speech: Coqui.AI TTS.
English Phonemes: CMU Pronouncing Dictionary.
Data: Your performance data is stored in SQLite3 databases, with separate databases for each mode. Over time, this setup will allow you to query and analyze your progress effectively.
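To give an idea of the kind of progress query this enables, here is a sketch using Python's built-in `sqlite3` module. The table name `attempts` and its columns (`phrase`, `wer`, `created_at`) are hypothetical; check the actual schema in the databases the app creates before adapting it.

```python
import sqlite3


def average_wer_by_day(conn: sqlite3.Connection) -> list[tuple[str, float]]:
    """Return (day, average WER) pairs, oldest day first."""
    rows = conn.execute(
        """
        SELECT date(created_at), AVG(wer)
        FROM attempts
        GROUP BY date(created_at)
        ORDER BY date(created_at)
        """
    ).fetchall()
    return [(day, avg) for day, avg in rows]


# Demo on an in-memory database with a hypothetical schema and sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attempts ("
    "id INTEGER PRIMARY KEY, phrase TEXT, wer REAL, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO attempts (phrase, wer, created_at) VALUES (?, ?, ?)",
    [
        ("How are you today?", 0.50, "2024-07-01 10:00:00"),
        ("How are you today?", 0.25, "2024-07-01 10:05:00"),
        ("I am learning to speak English.", 0.25, "2024-07-02 09:00:00"),
    ],
)
progress = average_wer_by_day(conn)
```

A falling daily average would suggest that your pronunciation is getting closer to the reference over time.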

Getting Started

Environment Setup

The app was developed using Python 3.11.2 on Debian GNU/Linux 12 (Bookworm). If you're using a different operating system, you might need to make some adjustments to suit your environment.

Clone this repo, then set up the app locally with the following commands:

python -m venv venv

. venv/bin/activate

pip install -r requirements.txt

To use YouTube videos, install youtube-dl. You can use these commands:

sudo curl -L https://github.com/ytdl-org/ytdl-nightly/releases/download/2024.07.07/youtube-dl -o /usr/local/bin/youtube-dl

sudo chmod a+rx /usr/local/bin/youtube-dl

Configuration

To personalize the setup, you'll need to configure a few variables in the settings.py file.

Permissions

Ensure the scripts are executable by running:

chmod +x <script>

Input Options

You can provide input in one of two ways:

Text File: A file with one phrase per line.
Audio Folder: A folder containing *.wav audio files.
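For the audio folder option, collecting the input files could look like the sketch below. The helper name `list_wav_files` is hypothetical, and sorting by file name is an assumption; the app may process files in a different order.

```python
from pathlib import Path


def list_wav_files(folder: str) -> list[Path]:
    """Return the *.wav files directly inside `folder`, sorted by name."""
    # glob("*.wav") does not descend into subfolders.
    return sorted(p for p in Path(folder).glob("*.wav") if p.is_file())
```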

Download and Split YouTube Audio:

The ./download_and_split_ytb <textfile.txt> script downloads audio from YouTube videos and splits it into 3-second segments; you can change the segment length by editing the script. The text file should contain YouTube video URLs, one per line.
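To illustrate the splitting step only, the sketch below cuts a WAV file into fixed-length segments with Python's standard `wave` module. It is not the script's own implementation, which may rely on other tools; the function name `split_wav` and the output naming scheme are assumptions.

```python
import wave


def split_wav(src_path: str, out_prefix: str, segment_seconds: float = 3.0) -> list[str]:
    """Split a WAV file into consecutive segments of `segment_seconds` each."""
    out_paths = []
    with wave.open(src_path, "rb") as src:
        frames_per_segment = int(src.getframerate() * segment_seconds)
        if frames_per_segment <= 0:
            raise ValueError("segment_seconds is too short for this sample rate")
        index = 0
        while True:
            frames = src.readframes(frames_per_segment)
            if not frames:
                break  # end of the source audio
            # Hypothetical naming scheme: <prefix>_0000.wav, <prefix>_0001.wav, ...
            out_path = f"{out_prefix}_{index:04d}.wav"
            with wave.open(out_path, "wb") as dst:
                dst.setnchannels(src.getnchannels())
                dst.setsampwidth(src.getsampwidth())
                dst.setframerate(src.getframerate())
                dst.writeframes(frames)
            out_paths.append(out_path)
            index += 1
    return out_paths
```

The last segment is shorter than the others whenever the total duration is not a multiple of the segment length.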

Text File Format

The text file should contain one phrase per line. For example:

How are you today?
I am learning to speak English.
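Reading such a file takes only a few lines. In this sketch the helper name `load_phrases` is hypothetical, and skipping blank lines and trimming surrounding whitespace are assumptions about how the input should be cleaned.

```python
from pathlib import Path


def load_phrases(path: str) -> list[str]:
    """Return one phrase per non-empty line of a UTF-8 text file."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    # Assumption: blank lines carry no phrase and are skipped.
    return [line.strip() for line in lines if line.strip()]
```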

Running the App

To run the app, use the following command in any shell:

python main.py --mode <audio | text>
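For readers curious how a flag like --mode is typically handled, here is a sketch with the standard `argparse` module. It is not taken from main.py: the function name `build_parser` is hypothetical, and treating the flag as required is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts --mode audio or --mode text."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument(
        "--mode",
        choices=["audio", "text"],
        required=True,  # assumption: the real app may define a default instead
        help="input source: a folder of *.wav files or a text file of phrases",
    )
    return parser
```

With this parser, `build_parser().parse_args(["--mode", "text"]).mode` evaluates to `"text"`, and any other value is rejected with a usage message.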

Troubleshooting

If you encounter any issues, please check the following:

Ensure the necessary dependencies are installed.
Verify your configuration in settings.py.
Make sure the scripts have the appropriate executable permissions.
