Mycroft AI

Text-To-Speech

Text-To-Speech (TTS) is the process of synthesizing audio from text. Mycroft uses our own TTS engines by default; however, we also support a range of third-party services.


Last updated 3 years ago


Mycroft has two open source TTS engines.

Mimic 1 is a fast, light-weight engine based on Carnegie Mellon University's FLITE software. Whilst the original Mimic may sound more robotic, it can be synthesized on your device.

Mimic 2 is an implementation of Tacotron speech synthesis. It is a fork of Keith Ito's project with additional tooling and code enhancements. Mimic 2 provides a much more natural sounding voice, but requires significant processing power to do so and is therefore cloud-based.

Default Engine

The engine that will be used depends on the voice selected in your Device Settings at Home.mycroft.ai.

Currently:

  • British Male is Mimic 1

  • American Female is Mimic 1

  • American Male is Mimic 2

  • Google Voice uses the Google Translate TTS API.

As Mimic 1 voices can be synthesized on device, the British Male voice will be used anytime the device cannot reach your preferred TTS service. This allows Mycroft to continue to speak even if it is not connected to a network.
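
The fallback behaviour described above can be pictured with a small sketch. This is not Mycroft's actual implementation; the function names and the two stand-in engines are hypothetical and exist only to show the idea of trying the preferred service first and dropping back to on-device synthesis when the network is unavailable.

```python
from typing import Callable


def speak_with_fallback(text: str,
                        preferred: Callable[[str], bytes],
                        local: Callable[[str], bytes]) -> bytes:
    """Return audio from the preferred engine, or from the local one on failure."""
    try:
        return preferred(text)
    except (ConnectionError, TimeoutError):
        # The preferred service cannot be reached: use on-device synthesis.
        return local(text)


def remote_engine_offline(text: str) -> bytes:
    # Hypothetical stand-in for a cloud TTS call made without a network.
    raise ConnectionError("TTS service unreachable")


def local_engine(text: str) -> bytes:
    # Hypothetical stand-in for an on-device engine such as Mimic 1.
    return b"local:" + text.encode("utf-8")


audio = speak_with_fallback("hello", remote_engine_offline, local_engine)
```

Here `audio` comes from `local_engine`, because the stand-in remote engine raises `ConnectionError`.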

eSpeak

A multi-lingual software speech synthesizer for Linux and Windows.

eSpeak uses a "formant synthesis" method. This allows many languages to be provided in a small size. The speech is clear, and can be used at high speeds, but is not as natural or smooth as larger synthesizers which are based on human speech recordings.

Mycroft Configuration

First, ensure that the espeak package is installed on your system.

sudo apt-get install espeak

Then, using the Configuration Manager we can edit the mycroft.conf file by running:

mycroft-config edit user

To our existing configuration values we will add the following:

"tts": {
  "module": "espeak",
  "espeak": {
    "lang": "english-us",
    "voice": "m1"
  }
}
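
For readers unsure how a fragment like this combines with values already present in mycroft.conf, the following sketch shows a recursive merge of two dictionaries. It is an illustration only: `merge_config` is a hypothetical helper, and the `existing` values (including the `"mimic"` module name) are assumed for the example rather than taken from a real configuration.

```python
import copy


def merge_config(base: dict, addition: dict) -> dict:
    """Return a new dict where nested keys from `addition` override `base`."""
    merged = copy.deepcopy(base)
    for key, value in addition.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides hold a nested section: merge them key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Assumed existing values, for illustration only.
existing = {"lang": "en-us", "tts": {"module": "mimic"}}
addition = {
    "tts": {
        "module": "espeak",
        "espeak": {"lang": "english-us", "voice": "m1"},
    }
}
merged = merge_config(existing, addition)
```

The `lang` value is left untouched, while the `tts` section now selects espeak and carries its settings.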

Added in mycroft-core v21.2.2

You can further customize the amplitude, gap, capital, pitch and speed of espeak voices by adding them to your TTS configuration parameters.

Example Config:

{
  "max_allowed_core_version": 21.2,
  "lang": "de-de",
  "tts": {
    "module": "espeak",
    "espeak": {
      "lang": "german-mbrola-5",
      "voice": "german-mbrola-5",
      "speed": "135",
      "amplitude": "80",
      "pitch": "20"
    }
  }
}  

For more information on the values for these parameters see espeak --help
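
As a rough guide to what these parameters mean, the sketch below maps each configuration key to the corresponding espeak command-line flag and builds an argument list. It is not the code Mycroft uses; `build_espeak_command` and `ESPEAK_FLAGS` are hypothetical names, and the mapping is based on the flags listed by espeak --help.

```python
from typing import Dict, List

# Configuration key -> espeak command-line flag (see `espeak --help`).
ESPEAK_FLAGS = {
    "voice": "-v",
    "amplitude": "-a",
    "gap": "-g",
    "capital": "-k",
    "pitch": "-p",
    "speed": "-s",
}


def build_espeak_command(config: Dict[str, str], sentence: str) -> List[str]:
    """Build an espeak argument list from the keys present in `config`."""
    command = ["espeak"]
    for key, flag in ESPEAK_FLAGS.items():
        if key in config:
            command += [flag, str(config[key])]
    # Keys without a flag in the mapping (such as "lang") are ignored here.
    command.append(sentence)
    return command


example_config = {
    "lang": "german-mbrola-5",
    "voice": "german-mbrola-5",
    "speed": "135",
    "amplitude": "80",
    "pitch": "20",
}
command = build_espeak_command(example_config, "Hallo Welt")
```

The resulting list could be passed to `subprocess.run` on a system where espeak is installed.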

Mary TTS

The multilingual open-source MARY text-to-speech platform. MaryTTS is a client-server system written in pure Java, so it runs on many platforms.

Server Setup

The latest installation instructions can be found on the MaryTTS Github repository.

Mycroft Configuration

Using the Configuration Manager we can edit the mycroft.conf file by running:

mycroft-config edit user

To our existing configuration values we will add the following:

  "tts": {
    "marytts": {
      "url": "http://YOUR_SERVER:PORT_NUMBER"
    },
    "module": "marytts"
  }

FA TTS

Produced by Mivoq, it is based on Mary TTS.

Server Setup

The latest installation instructions can be found on the Mivoq FA TTS Github repository.

Mycroft Configuration

Using the Configuration Manager we can edit the mycroft.conf file by running:

mycroft-config edit user

To our existing configuration values we will add the following:

  "tts": {
    "fatts": {
      "url": "http://YOUR_SERVER:PORT_NUMBER"
    },
    "module": "fatts"
  }

Amazon Polly

Amazon Polly's text-to-speech service.

Account Setup

Create an AWS account and add the Polly service.

You will need to take note of your private "Access Key ID" and "Secret Access Key".

Mycroft Configuration

First, check the list of available voices and languages. Note that Polly does not provide a separate language attribute like other TTS options. The language is determined by which voice is chosen.

Then, install the boto3 python module in the Mycroft virtual environment:

mycroft-pip install boto3

or

cd ~/mycroft-core
source .venv/bin/activate
pip3 install boto3
deactivate

Finally, using the Configuration Manager we can edit the mycroft.conf file by running:

mycroft-config edit user

To our existing configuration values we will add the following:

"tts": {
  "module": "polly",
  "polly": {
    "voice": "Matthew",
    "region": "us-east-1",
    "engine": "standard",
    "access_key_id": "YOUR_ACCESS_KEY_ID",
    "secret_access_key": "YOUR_SECRET_ACCESS_KEY"
  }
}

If the voice, region, and engine attributes are omitted, the defaults of Matthew, us-east-1 and standard will be used. This is an English (US) voice.
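
The effect of those defaults can be sketched as follows. This is a minimal illustration, not the Polly module's own code; `resolve_polly_settings` and `POLLY_DEFAULTS` are hypothetical names, and "Amy" is used only as an example of a different voice.

```python
# Defaults as stated in the text above.
POLLY_DEFAULTS = {
    "voice": "Matthew",
    "region": "us-east-1",
    "engine": "standard",
}


def resolve_polly_settings(user_config: dict) -> dict:
    """Fill in any of voice, region and engine missing from `user_config`."""
    settings = dict(POLLY_DEFAULTS)
    settings.update(user_config)
    return settings


# Only the voice is overridden; region and engine fall back to the defaults.
settings = resolve_polly_settings({"voice": "Amy"})
```

Because the language follows from the chosen voice, changing `voice` is also how a different language would be selected.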

Google TTS

Google Translate's text-to-speech API.

The Google TTS module uses the gTTS Python package which interfaces with the Google Translate text-to-speech API. This is not intended for commercial or production usage. The service may break at any time, and you are subject to their Terms of Service that can be found at: https://policies.google.com/terms

Mycroft Configuration

Using the Configuration Manager we can edit the mycroft.conf file by running:

mycroft-config set tts.module "google"
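
The key tts.module names the same value as the "module" entry inside the "tts" block used elsewhere on this page. The sketch below shows that correspondence, assuming the dotted key is read as a path into the nested configuration; `set_by_path` is a hypothetical helper, not part of mycroft-config.

```python
def set_by_path(config: dict, dotted_key: str, value) -> dict:
    """Set a nested value using a dotted key such as "tts.module"."""
    keys = dotted_key.split(".")
    node = config
    for key in keys[:-1]:
        # Create intermediate sections when they do not exist yet.
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return config


config = {"lang": "en-us"}
set_by_path(config, "tts.module", "google")
```

After the call, `config` holds a `tts` section whose `module` is `"google"`, alongside the untouched `lang` value.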

IBM Watson

Account Setup

Create an account at IBM.com/cloud. Once you add the TTS service to your account, you will receive an API key and unique API URL.

You can find a list of available voices at Languages and Voices. For example, "en-US_MichaelV3Voice".

IBM keeps a log of all requests in the lite plan unless you turn it off explicitly by setting "X-Watson-Learning-Opt-Out" to true. We have set Mycroft to Opt-Out by default, so if you want to share data with IBM then you must set this to false.

Mycroft Configuration

Using the Configuration Manager we can edit the mycroft.conf file by running:

mycroft-config edit user

To our existing configuration values we will add the following:

"tts": {
  "module": "watson",
  "watson": {
    "voice":"PREFERRED_VOICE",
    "apikey": "YOUR_API_KEY",
    "url": "YOUR_API_URL",
    "X-Watson-Learning-Opt-Out": "true"
  }
}

Microsoft Azure Cognitive Service

Note: This is a Community provided TTS plugin and is not controlled by Mycroft AI. Updates for this plugin may not have been reviewed by the Mycroft team. We strongly recommend reviewing any code you intend to install from outside Mycroft's official channels.

Installation

mycroft-pip install mycroft-tts-plugin-azure

Account Setup

This TTS service requires a subscription to Microsoft Azure and the creation of a Speech resource (https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/overview#create-the-azure-resource). The free plan is more than able to handle domestic usage (5 million characters per month, or 0.5 million with neural TTS voices).

You can choose your voice from the "voice name" column here: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/language-support#text-to-speech. Neural voices are much better, but cost more.

Mycroft Configuration

"tts": {
  "module": "azure_tts",
  "azure_tts": {
    "api_key": "insert_your_key_here",
    "voice": "en-US-JennyNeural",
    "region": "westus"
  }
}

The voice and region values are optional. If voice is omitted, the default "en-US-Guy24kRUS" is used. The region can be left out if your region is westus.

Microsoft Bing

Account Setup

Create a Microsoft Azure account and get a server access token.

Mycroft Configuration

Using the Configuration Manager we can edit the mycroft.conf file by running:

mycroft-config edit user

To our existing configuration values we will add the following:

"tts": {
  "module": "bing",
  "bing": {
    "api_key": "YOUR_API_KEY",
    "format": "riff-16khz-16bit-mono-pcm",
    "gender": "Male"
  }
}

Mozilla TTS

Server Setup

Instructions for setting up a Mozilla TTS server are available on the project's wiki.

Mycroft Configuration

Using the Configuration Manager we can edit the mycroft.conf file by running:

mycroft-config edit user

To our existing configuration values we will add the following:

"tts": {
  "module": "mozilla",
  "mozilla": {
    "url": "http://my-mozilla-tts-server"
  }
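}

By default the url is set to localhost: http://0.0.0.0:5002. So if you are running the server on the same machine as your Mycroft instance, only the module attribute needs to be set. This can also be done with a single command:

mycroft-config set tts.module mozilla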

Coqui TTS

Coqui TTS is an actively maintained fork of the Mozilla TTS project. A Coqui TTS server can be run locally without internet connection.

Pretrained TTS models are available based on open voice datasets (e.g. LJSpeech, LibriTTS, Thorsten-DE, Mai, ...). The Coqui release page shows a complete list of available TTS models.

Server Setup

Coqui TTS is based on Python 3, so it's recommended to set up a new virtual environment (venv) for the TTS server.

mkdir <TTS directory>
cd <TTS directory>
python3 -m venv .
source ./bin/activate

Then within that environment install the TTS server.

pip install pip --upgrade
pip install tts --upgrade

Installation of the tts Python package does not yet work with Python 3.10. See the Coqui TTS issues page for more information.

Running TTS server

To run the server we need to know two things:

  1. Which TTS model to use.

Running tts --list_models within the venv shows the TTS models available in the current release.

Example output:

tts_models/en/ek1/tacotron2
tts_models/es/mai/tacotron2-DDC
tts_models/fr/mai/tacotron2-DDC
tts_models/de/thorsten/tacotron2-DCA
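...

  2. Whether we have a CUDA enabled GPU. Synthesizing voice is significantly faster when run on a CUDA enabled GPU compared to a CPU.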

Within the venv we can now start the TTS server by running:

tts-server --use_cuda=<true or false> --model_name <model name from list>

Example commands:

  • English: tts-server --use_cuda=true --model_name tts_models/en/ek1/tacotron2

  • German: tts-server --use_cuda=true --model_name tts_models/de/thorsten/tacotron2-DCA
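
Judging from the example output of tts --list_models, model names appear to follow a type/language/dataset/model pattern. Under that assumption, the sketch below picks out the models for one language. `parse_model_name` and `models_for_language` are hypothetical helpers, and the input list is the example output shown earlier, not a live query.

```python
from typing import List, NamedTuple


class ModelName(NamedTuple):
    kind: str
    language: str
    dataset: str
    model: str


def parse_model_name(name: str) -> ModelName:
    """Split a name such as tts_models/en/ek1/tacotron2 into its four parts."""
    parts = name.strip().split("/")
    if len(parts) != 4:
        raise ValueError(f"unexpected model name: {name!r}")
    return ModelName(*parts)


def models_for_language(names: List[str], language: str) -> List[str]:
    """Return only the model names whose language part matches."""
    return [n for n in names if parse_model_name(n).language == language]


listed = [
    "tts_models/en/ek1/tacotron2",
    "tts_models/es/mai/tacotron2-DDC",
    "tts_models/fr/mai/tacotron2-DDC",
    "tts_models/de/thorsten/tacotron2-DCA",
]
german = models_for_language(listed, "de")
```

The matching name can then be passed to tts-server with --model_name.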

By default a Coqui TTS server uses the best vocoder for the selected TTS model. However, you can override the default using the --vocoder_name parameter when starting your server.

Once the TTS server is running you can test it by opening http://localhost:5002 in your browser and synthesizing a test sentence.
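
The same check can be scripted instead of using a browser. The sketch below assumes the server exposes an /api/tts endpoint that takes a text query parameter and returns WAV audio; that endpoint, and the helper names, are assumptions to verify against the server version you are running.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

DEFAULT_BASE_URL = "http://localhost:5002"


def tts_request_url(text: str, base_url: str = DEFAULT_BASE_URL) -> str:
    """Build the request URL (the /api/tts path is an assumption)."""
    return base_url.rstrip("/") + "/api/tts?" + urlencode({"text": text})


def synthesize_to_file(text: str, path: str,
                       base_url: str = DEFAULT_BASE_URL,
                       timeout: float = 30.0) -> int:
    """Request audio for `text`, write it to `path`, return the byte count."""
    with urlopen(tts_request_url(text, base_url), timeout=timeout) as response:
        audio = response.read()
    with open(path, "wb") as audio_file:
        audio_file.write(audio)
    return len(audio)


url = tts_request_url("Hello world")
```

With a server running locally, `synthesize_to_file("Hello world", "test.wav")` would save the result for playback.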

Mycroft Configuration

After your TTS server setup is finished you can configure Mycroft to use it with the same configuration as Mozilla TTS.

Responsive Voice

Lifelike human digital voices from ResponsiveVoice.org.

Note: This is a Community provided TTS plugin and is not controlled by Mycroft AI. The code in this plugin has not been reviewed by the Mycroft team. We strongly recommend reviewing any code you intend to install from outside Mycroft's official channels.

Installation

mycroft-pip install ovos-tts-plugin-responsivevoice

Mycroft Configuration

Using the Configuration Manager we can edit the mycroft.conf file by running:

mycroft-config edit user

See the plugin's GitHub repository for suggested configurations: https://github.com/OpenVoiceOS/ovos-tts-plugin-responsivevoice#configuration

SpdSay

A common high-level interface to speech synthesis from Free(B)Soft.

Software Setup

Install the speech-dispatcher package using your system's package manager. For example: sudo apt-get install speech-dispatcher

Mycroft Configuration

Using the Configuration Manager we can edit the mycroft.conf file by running:

mycroft-config set tts.module "spdsay"

Yandex SpeechKit

Speech services from Yandex, one of the largest cloud platforms in Russia.

Account Setup

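  1. Register an account at Yandex.

  2. Create a billing account: https://cloud.yandex.com/docs/billing/quickstart/#create_billing_account

  3. You can activate a free trial period in the console.

  4. Create your first "folder" in the cloud.

  5. Create a service account for your Mycroft instance with the role editor: https://cloud.yandex.com/docs/iam/operations/sa/create

  6. Create an API key for the service account: https://cloud.yandex.com/docs/iam/operations/api-key/create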

Mycroft Configuration

Using the Configuration Manager we can edit the mycroft.conf file by running:

mycroft-config edit user

To our existing configuration values we will add the following:

"tts": {
  "module": "yandex",
  "yandex": {
    "lang": "en-US",
    "api_key": "YOUR_API_KEY",
    "voice": "oksana",
    "emotion": "good"
  }
}

The voice and emotion values are optional.
