In this guide, I will cover the best open-source text-to-speech (TTS) tools that you can run yourself, free of cost.
This post will cover various TTS technologies at a high level. I will post individual guides for each of them in the next few days and link them here.
Let’s dive in.
Mozilla TTS is an open-source text-to-speech library from Mozilla, the organization behind the Firefox browser.
It is one of the best open-source text-to-speech engines available right now.
You can use it out of the box to generate speech from text, or train it on new voice samples.
Tortoise is a text-to-speech program that supports multiple voices and produces natural-sounding prosody and intonation. The code is publicly available, so you can run it on your own hardware.
Mimic 3 by Mycroft AI
Mimic 3 is an open-source text-to-speech engine that focuses on privacy. It produces high-quality speech and can run without an internet connection on your own hardware. A cloud service is being developed for people who want a simpler option or for hardware that cannot handle the processing demands.
Coqui TTS is an open-source TTS engine released by Coqui. Coqui offers both the free, open-source engine and a paid cloud option.
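To give a feel for what running Coqui TTS locally looks like, here is a minimal sketch using its Python API. It assumes the `TTS` package is installed (`pip install TTS`) and uses the pretrained model name `tts_models/en/ljspeech/tacotron2-DDC` as an example; the helper names `load_coqui_model` and `synthesize` are my own, not part of the library.

```python
def load_coqui_model(model_name="tts_models/en/ljspeech/tacotron2-DDC"):
    """Load a pretrained Coqui model (downloaded on first use).

    The import is done lazily so the rest of this sketch can be read
    and tested without the TTS package installed.
    """
    from TTS.api import TTS
    return TTS(model_name=model_name)


def synthesize(tts, text, out_path="output.wav"):
    """Write `text` as speech to a WAV file and return the file path.

    `tts` is expected to behave like a TTS.api.TTS instance, i.e. to
    provide tts_to_file(text=..., file_path=...).
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    tts.tts_to_file(text=text, file_path=out_path)
    return out_path
```

Typical usage would be `synthesize(load_coqui_model(), "Hello from Coqui TTS.")`, which should leave an `output.wav` file in the current directory. Model names and download sizes vary, so treat the default here as a placeholder.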
eSpeak NG Text-to-speech
eSpeak NG is a compact open-source text-to-speech synthesizer for Linux, Windows, Android, and other operating systems. It supports more than 100 languages and accents. It is based on the eSpeak engine created by Jonathan Duddington.
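eSpeak NG is usually driven from the command line, so one simple way to use it from a script is to build the command and hand it to `subprocess`. The sketch below assumes the `espeak-ng` binary is on your `PATH`; it uses the `-v` (voice), `-s` (speed in words per minute), and `-w` (write WAV file) options. The function names are hypothetical helpers for illustration.

```python
import shutil
import subprocess


def build_espeak_command(text, voice="en-us", speed_wpm=150, wav_path=None):
    """Build the argument list for an espeak-ng invocation."""
    cmd = ["espeak-ng", "-v", voice, "-s", str(speed_wpm)]
    if wav_path is not None:
        # Write the audio to a WAV file instead of playing it.
        cmd += ["-w", wav_path]
    cmd.append(text)
    return cmd


def speak(text, **options):
    """Run espeak-ng if it is installed; return True when the command ran."""
    if shutil.which("espeak-ng") is None:
        return False
    subprocess.run(build_espeak_command(text, **options), check=True)
    return True
```

For example, `speak("Hola, mundo", voice="es", wav_path="hola.wav")` should write a Spanish rendering to `hola.wav`, provided that voice is available in your installation.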
Larynx is an end-to-end text-to-speech system with 50 voices across 9 languages. It is designed to run entirely offline and provides a complete pipeline for converting text to speech.
Festival is a speech synthesis tool that converts text to speech through various APIs including the command line, a Scheme interpreter, a C++ library, and Java and Emacs interfaces. It supports multiple languages, including English and Spanish, and includes tools and documentation for creating new voices. Festival is written in C++ and uses the Edinburgh Speech Tools Library, and it is provided under an X11 license which allows for both commercial and non-commercial use.
Festival was created at the University of Edinburgh.
pyttsx3 is a Python module that lets you use multiple TTS engines for offline text-to-speech synthesis.
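Since pyttsx3 is a Python module, a short example shows how little code it takes. This is a minimal sketch, assuming `pyttsx3` is installed (`pip install pyttsx3`) and that a speech backend is available on your system; `make_engine` and `speak_lines` are hypothetical helper names, while `init`, `setProperty`, `say`, and `runAndWait` are the library's own calls.

```python
def make_engine():
    """Create a pyttsx3 engine using the platform's default driver.

    Imported lazily so the helper below can be exercised without
    pyttsx3 or an audio device.
    """
    import pyttsx3
    return pyttsx3.init()


def speak_lines(engine, lines, rate=150):
    """Queue each non-empty line, then block until speech has finished.

    Returns the number of lines that were queued.
    """
    engine.setProperty("rate", rate)  # speaking rate in words per minute
    queued = 0
    for line in lines:
        line = line.strip()
        if line:
            engine.say(line)
            queued += 1
    if queued:
        engine.runAndWait()
    return queued
```

Calling `speak_lines(make_engine(), ["Hello.", "This is pyttsx3 speaking."])` should read both sentences aloud through your default audio output.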