Docker Setup for TTS (Text-to-speech) Service - Converting Text Content to Speech

Edited on 2024-10-25 in Docker. Reading time ≈ 5 mins.

A Docker-hosted TTS service that converts text to speech: it generates Chinese speech, reads an entire post aloud, and supports multiple languages.

Purpose

Convert text, including website content, to speech through speech synthesis, with Chinese speech as the main target.

How to Use

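<div>
  <button onclick="synthesizeSpeech()">Read Aloud Entire Text</button>
</div>
<audio controls id="audioPlayer">Your browser does not support the audio element.</audio>
<script>
  function synthesizeSpeech() {
    // Text to read aloud: the first element with the class "post-block".
    var inputText = document.getElementsByClassName('post-block')[0].innerText;
    // Voice code sent to the TTS server.
    var voice = "ZH";
    // Percent-encode both parameters so spaces and Chinese characters survive the request.
    var url = 'https://tts.carlzeng.com:3/speech?text=' + encodeURIComponent(inputText) + '&voice=' + encodeURIComponent(voice);
    var audioPlayer = document.getElementById('audioPlayer');
    // Point the player at the synthesized audio and start playback.
    audioPlayer.src = url;
    audioPlayer.load();
    audioPlayer.play();
  }
</script>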

Add the HTML above to your page (or template).
Change where inputText comes from so that it points to the text you want read aloud.
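
As one way to change the text source, the sketch below returns the text of the first non-empty element that matches a list of CSS selectors. The helper name `pickText` and the `article` selector in the usage note are hypothetical and not part of the original page; only `.post-block` comes from the snippet above.

```javascript
// Hypothetical helper (not from the original snippet): return the text of the
// first element that matches one of the given CSS selectors and is not empty.
function pickText(doc, selectors) {
  for (const selector of selectors) {
    const element = doc.querySelector(selector);
    // innerText is what the original snippet reads; textContent is a fallback.
    const text = element ? (element.innerText || element.textContent || '') : '';
    if (text.trim() !== '') {
      return text.trim();
    }
  }
  return ''; // nothing matched, so there is nothing to read aloud
}
```

Inside `synthesizeSpeech()`, the line that sets `inputText` could then become `var inputText = pickText(document, ['.post-block', 'article']);`, which falls back to the `article` element when no `.post-block` exists.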

Implementation Process

TTS docker, eSpeak TTS server for WebSpeech.

There are many solutions, but few are user-friendly or handle Mandarin Chinese well.

https://github.com/synesthesiam/opentts
docker run -it -p 5500:5500 synesthesiam/opentts:zh --no-espeak
Drawback: It does not support Mandarin Chinese properly, and the text cannot contain mixed-in English.

Preview the full process: Setup a Text to Speech Engine (ON YOUR COMPUTER)

New Solution:

Text-to-speech server
https://github.com/parente/espeakbox

http://192.168.6.203:8089/speech?text=排查Nginx Proxy Manager，反向代理，让网站变成HTTPS&voice=ZH
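
The request URL above contains a raw space and Chinese characters. A browser address bar will usually encode these for you, but a script should do it explicitly. The following is a minimal sketch, assuming the server reads the same `text` and `voice` query parameters that appear in that URL; the function name `buildSpeechUrl` is hypothetical.

```javascript
// Hypothetical helper: build a /speech request URL with percent-encoded parameters.
// Assumes the server reads the `text` and `voice` query parameters, as in the URL above.
function buildSpeechUrl(baseUrl, text, voice) {
  const base = baseUrl.replace(/\/+$/, ''); // drop any trailing slashes
  return base + '/speech?text=' + encodeURIComponent(text) +
    '&voice=' + encodeURIComponent(voice);
}

// Example with the LAN address used in this post.
const exampleUrl = buildSpeechUrl('http://192.168.6.203:8089', 'Nginx Proxy Manager，反向代理', 'ZH');
console.log(exampleUrl);
```

The resulting string can be assigned to `audioPlayer.src` in the same way as in the How to Use snippet.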


Docker Image Location

https://hub.docker.com/r/parente/espeakbox

Next Steps

Find a more accurate AI speech synthesis model so that Chinese text sounds more pleasant when read aloud. For instance, speech synthesis from Xunfei is much better than this…

Source of Inspiration

https://github.com/parente/espeakbox

https://github.com/kripken/speak.js

ChatGPT speech plugin
 
Azure Text to Speech API
 
TTS docker