Markdown to ElevenLabs is an open-source project that converts Markdown files into high-quality voiceovers using the ElevenLabs Text-to-Speech API. The project is designed for creating natural-sounding audio from written content, ideal for podcasts, audiobooks, and more.
- Markdown Parsing: Splits Markdown files into sections, intelligently grouping paragraphs and lists.
- Text-to-Speech Conversion: Uses the ElevenLabs API to generate realistic voiceovers with customizable voice settings.
- Audio Processing: Combines generated audio sections into a single cohesive file, complete with natural pauses.
- Flexible Operation Modes:
  - Process Markdown only (`--markdown-only`).
  - Generate audio only (`--audio-only`).
  - Combine existing audio files only (`--combine-only`).
- Error Handling and Preprocessing:
  - Cleans and normalizes text for smooth audio generation.
  - Handles special characters (`—`, `…`, etc.) gracefully.
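
As an illustration of the preprocessing step, here is a minimal sketch of how special characters could be normalized before text is sent for speech generation. The replacement table and the `clean_text` name are assumptions made for this example; the project's actual rules (which rely on `unidecode`) are not shown in this README.

```python
import re

# Hypothetical replacement table, not the project's actual mapping.
REPLACEMENTS = {
    "\u2014": ", ",   # em dash -> comma, which reads as a short pause
    "\u2013": "-",    # en dash -> plain hyphen
    "\u2026": "...",  # ellipsis character -> three periods
    "\u201c": '"',    # left curly double quote
    "\u201d": '"',    # right curly double quote
    "\u2018": "'",    # left curly single quote
    "\u2019": "'",    # right curly single quote
}


def clean_text(text: str) -> str:
    """Replace typographic characters and collapse whitespace."""
    for source, target in REPLACEMENTS.items():
        text = text.replace(source, target)
    # Collapse runs of whitespace (including newlines) into single spaces.
    text = re.sub(r"\s+", " ", text)
    # Remove any space left in front of a comma by the dash replacement.
    text = re.sub(r"\s+,", ",", text)
    return text.strip()
```

For example, `clean_text("Wait\u2014really\u2026  yes")` returns `"Wait, really... yes"`.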
- Python 3.8 or newer
- ElevenLabs API Key (Get yours here)
- Dependencies:
  - `pydub` for audio manipulation.
  - `python-dotenv` for environment variable management.
  - `unidecode` for text normalization.

Install the dependencies with:
pip install -r requirements.txt
Ensure the following dependencies are listed in your `requirements.txt`:
elevenlabs
pydub
python-dotenv
unidecode
Copy `.env.example` to `.env` using the command for your shell:
- Linux/macOS: `cp .env.example .env`
- Windows Command Prompt: `copy .env.example .env`
- Windows PowerShell: `Copy-Item .env.example .env`
- Log in to your ElevenLabs account and generate an API key.
- Copy your API key and voice ID to the `.env` file:
  - `ELEVENLABS_API_KEY=your_api_key_here`
  - `ELEVENLABS_VOICE_ID=your_voice_id_here`
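
To show how these two values are typically used, the sketch below builds (but does not send) a request to the ElevenLabs text-to-speech REST endpoint using only the standard library. This is not the project's code: the project depends on the `elevenlabs` package, whereas this example uses plain HTTP for illustration. The `build_tts_request` name and the default `model_id` are assumptions, and the sketch expects the variables to already be in the environment (for instance after calling `load_dotenv()` from `python-dotenv`).

```python
import json
import os
import urllib.request

API_BASE = "https://api.elevenlabs.io/v1/text-to-speech"


def build_tts_request(text, api_key=None, voice_id=None,
                      model_id="eleven_multilingual_v2"):
    """Build a POST request for one text section (hypothetical helper).

    Explicit arguments win; otherwise the values come from the
    ELEVENLABS_API_KEY and ELEVENLABS_VOICE_ID environment variables.
    """
    api_key = api_key or os.environ["ELEVENLABS_API_KEY"]
    voice_id = voice_id or os.environ["ELEVENLABS_VOICE_ID"]
    payload = json.dumps({"text": text, "model_id": model_id}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/{voice_id}",
        data=payload,
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request; on success the response body should contain the generated audio bytes.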
Place your Markdown files in the `markdown` folder located in the project root. Before running the script, review and edit these files as needed:
- Remove any content you don't want included, such as:
  - Code blocks
  - Tables of contents
  - Superfluous headings or sections
- Ensure the text is structured logically for audio generation.
After editing, the script will process the Markdown files and split them into individual sections for audio conversion.
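
The splitting step can be pictured with a minimal sketch like the one below. It is an assumption about the general approach, not the contents of `split_markdown.py`: it starts a new section at each `#` heading and keeps the paragraphs and lists that follow together as one body. The `split_sections` name is hypothetical.

```python
import re

HEADING = re.compile(r"^#{1,6}\s+(.*)$")


def split_sections(markdown_text):
    """Return a list of (title, body) tuples, one per heading.

    Text that appears before the first heading becomes a section
    with an empty title.
    """
    sections = []
    title, lines = "", []

    def flush():
        # Only emit a section if it has a title or some non-blank text.
        if title or any(line.strip() for line in lines):
            sections.append((title, "\n".join(lines).strip()))

    for line in markdown_text.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            title, lines = match.group(1).strip(), []
        else:
            lines.append(line)
    flush()
    return sections
```

This sketch does not detect headings-like lines inside fenced code blocks, which is one more reason to remove code blocks from the input beforehand.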
python main.py [options]
| Option | Description |
|---|---|
| `--reset` | Deletes previous output files and starts fresh. |
| `--audio-only` | Skips Markdown processing and generates audio for existing Markdown sections. |
| `--markdown-only` | Processes Markdown files only, without generating audio. |
| `--combine-only` | Combines existing audio files into a single cohesive file. |
| `--voice-id` | Specify a voice ID (overrides `.env`). |
| `--api-key` | Specify an API key (overrides `.env`). |
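
One way these options could be wired up is sketched below with `argparse`. The flag names come from the table above; the function names, help strings, and the exact override logic are assumptions for illustration and may differ from what `main.py` does.

```python
import argparse
import os


def build_parser():
    """Create a parser with the flags listed in the options table."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--reset", action="store_true",
                        help="delete previous output files and start fresh")
    parser.add_argument("--audio-only", action="store_true",
                        help="generate audio for existing Markdown sections")
    parser.add_argument("--markdown-only", action="store_true",
                        help="process Markdown files without generating audio")
    parser.add_argument("--combine-only", action="store_true",
                        help="combine existing audio files into one file")
    parser.add_argument("--voice-id", default=None,
                        help="voice ID (overrides .env)")
    parser.add_argument("--api-key", default=None,
                        help="API key (overrides .env)")
    return parser


def resolve_credentials(args, env=os.environ):
    """Prefer command-line values, then fall back to environment values."""
    api_key = args.api_key or env.get("ELEVENLABS_API_KEY")
    voice_id = args.voice_id or env.get("ELEVENLABS_VOICE_ID")
    return api_key, voice_id
```

With this arrangement, `python main.py --voice-id some_voice` would use the voice ID from the command line while still reading the API key from the environment.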
python main.py --reset
python main.py --markdown-only
python main.py --audio-only
python main.py --combine-only
markdown-to-elevenlabs/
├── main.py # Main entry point for the program
├── src/
│ ├── split_markdown.py # Splits Markdown files into sections
│ ├── build_output.py # Handles audio generation and combination
├── markdown/ # Input Markdown files
├── output/
│ ├── markdown/ # Processed Markdown sections
│ ├── audio/ # Generated audio files
├── .env # Environment variables
├── requirements.txt # Python dependencies
├── README.md # Project documentation
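
Given the layout above, the effect of `--reset` can be approximated with the sketch below, which removes the `output` folder and recreates its two subfolders. The `reset_output` name and the exact behavior are assumptions based on the option's description, not the project's code.

```python
import shutil
from pathlib import Path


def reset_output(project_root):
    """Delete output/ and recreate output/markdown and output/audio."""
    output = Path(project_root) / "output"
    if output.exists():
        shutil.rmtree(output)  # removes previously generated sections and audio
    for subfolder in ("markdown", "audio"):
        (output / subfolder).mkdir(parents=True)
    return output
```

The input `markdown/` folder in the project root is left untouched, since only generated files live under `output/`.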
Contributions are welcome! To get started:
- Fork the repository.
- Create a new branch for your feature or bugfix.
- Submit a pull request with a detailed explanation.
This project is licensed under the MIT License.
- ElevenLabs for their industry-leading Text-to-Speech API.
- pydub for seamless audio processing.
- Open-source contributors for making projects like this possible!