TTS Timeout Error #2747

Open
xings19 opened this issue Feb 17, 2025 · 1 comment
Labels: in-review, pending close, python, text-to-speech


xings19 commented Feb 17, 2025

I wrote a function to convert text to audio, but I always get this error:

import os
import random
from hashlib import sha256

import azure.cognitiveservices.speech as speechsdk

# compare_en_cn_chars, ENGLISH_VOICE_NAME and CHINESE_VOICE_NAME are defined elsewhere (not shown).


def conver_text_to_audio(text, save_dir, lang=None):
    if lang is None:
        lang = compare_en_cn_chars(text)
    # Name the output file after the hash of the text so an existing result is reused.
    sha256_text = sha256(text.encode()).hexdigest()
    audio_output_path = os.path.join(save_dir, '{}.wav'.format(sha256_text))
    if os.path.exists(audio_output_path):
        return
    max_try_num = 3
    for _ in range(max_try_num):
        # Pick a random voice for each attempt.
        if lang == 'en':
            voice_name = random.choice(ENGLISH_VOICE_NAME)
        elif lang == 'zh':
            voice_name = random.choice(CHINESE_VOICE_NAME)
        else:
            raise ValueError('Unknown language code')
        speech_config = speechsdk.SpeechConfig(subscription='mykey', region='eastus')
        audio_config = speechsdk.audio.AudioOutputConfig(filename=audio_output_path)
        speech_config.speech_synthesis_voice_name = voice_name
        speech_config.enable_audio_logging()
        speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)
        # Block until synthesis finishes or is canceled.
        speech_synthesis_result = speech_synthesizer.speak_text_async(text).get()
        if speech_synthesis_result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
            return
        elif speech_synthesis_result.reason == speechsdk.ResultReason.Canceled:
            cancellation_details = speech_synthesis_result.cancellation_details
            print("Speech synthesis canceled: {}".format(cancellation_details.reason))
            if cancellation_details.reason == speechsdk.CancellationReason.Error:
                if cancellation_details.error_details:
                    print("Error details: {}".format(cancellation_details.error_details))
                    print("Did you set the speech resource key and region values?")
            print("Failed to synthesize audio, retrying...")
Speech synthesis canceled: CancellationReason.Error
Error details: Timeout while synthesizing. Current RTF: 2.59965 (threshold 2), frame interval 3013ms (threshold 3000ms). USP state: ReceivingData. Received audio size: 37088 bytes.
Did you set the speech resource key and region values?
Failed to synthesize audio, retrying...
Speech synthesis canceled: CancellationReason.Error
Error details: Timeout while synthesizing. Current RTF: 2.30968 (threshold 2), frame interval 3008ms (threshold 3000ms). USP state: ReceivingData. Received audio size: 78036 bytes.
Did you set the speech resource key and region values?
Failed to synthesize audio, retrying...
Speech synthesis canceled: CancellationReason.Error
Error details: Timeout while synthesizing. Current RTF: 2.42203 (threshold 2), frame interval 3013ms (threshold 3000ms). USP state: ReceivingData. Received audio size: 39818 bytes.
Did you set the speech resource key and region values?
Failed to synthesize audio, retrying...
Speech synthesis canceled: CancellationReason.Error
Error details: Timeout while synthesizing. Current RTF: 2.42203 (threshold 2), frame interval 3013ms (threshold 3000ms). USP state: ReceivingData. Received audio size: 39818 bytes.
Did you set the speech resource key and region values?

My subscription tier is Standard (S0).
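
To compare attempts, the figures in the error details above can be extracted programmatically. This is a minimal sketch, assuming the message keeps the exact wording seen in the log in this issue; that wording is taken from the output here and is not a documented, stable format. The names `parse_timeout_details` and `TimeoutDetails` are hypothetical helpers, not part of the Speech SDK.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Pattern built from the error text quoted in this issue. Other SDK versions
# may word the message differently, so a failed match means "unknown format".
_TIMEOUT_RE = re.compile(
    r"Current RTF: (?P<rtf>[\d.]+) \(threshold (?P<rtf_threshold>[\d.]+)\), "
    r"frame interval (?P<frame_ms>\d+)ms \(threshold (?P<frame_threshold_ms>\d+)ms\)\. "
    r"USP state: (?P<usp_state>\w+)\. "
    r"Received audio size: (?P<audio_bytes>\d+) bytes"
)


@dataclass
class TimeoutDetails:
    rtf: float
    rtf_threshold: float
    frame_ms: int
    frame_threshold_ms: int
    usp_state: str
    audio_bytes: int


def parse_timeout_details(error_details: str) -> Optional[TimeoutDetails]:
    """Return the fields of a synthesis timeout message, or None if it does not match."""
    match = _TIMEOUT_RE.search(error_details)
    if match is None:
        return None
    return TimeoutDetails(
        rtf=float(match["rtf"]),
        rtf_threshold=float(match["rtf_threshold"]),
        frame_ms=int(match["frame_ms"]),
        frame_threshold_ms=int(match["frame_threshold_ms"]),
        usp_state=match["usp_state"],
        audio_bytes=int(match["audio_bytes"]),
    )


# First error line from the log in this issue.
sample = (
    "Timeout while synthesizing. Current RTF: 2.59965 (threshold 2), "
    "frame interval 3013ms (threshold 3000ms). USP state: ReceivingData. "
    "Received audio size: 37088 bytes."
)
details = parse_timeout_details(sample)
```

In the retry loop, `cancellation_details.error_details` could be passed to this helper so that each attempt's values are recorded alongside the voice name that was chosen.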

pankopon (Contributor) commented

Hi, the description does not specify the

  • Speech SDK version
  • input text
  • voice name

or the Python version and OS platform.

I tested the code with

sentence = "The quick brown fox jumped over the lazy watchdog's head. "
conver_text_to_audio(sentence, "C:/Temp", "en")

and

long_sentence = sentence * 150
conver_text_to_audio(long_sentence, "C:/Temp", "en")

using voice_name="en-US-AriaNeural" and the latest Speech SDK 1.42.0 release, on WinPython 3.12.5, with success.

Please verify that your subscription key is indeed working, e.g. by running the quickstart:
https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/quickstart/python/text-to-speech

Note that enable_audio_logging() is only for audio logging on the service. Local Speech SDK logging is enabled differently: https://learn.microsoft.com/azure/ai-services/speech-service/how-to-use-logging

@yulin-li for follow-up and/or in case the issue is somehow 'eastus' specific.

pankopon added the in-review, text-to-speech, pending close, and python labels on Feb 19, 2025