
FAQ

Questions that we frequently receive about Mycroft and our technologies.


Last updated 3 years ago


Can Mycroft run completely offline? Can I self-host everything?

Mycroft is a privacy-focused project and community, so many people are interested in fully offline or self-hosted options. Mycroft has intentionally been built in a modular fashion, so this is possible; however, it is not easy and is unlikely to provide an equivalent user experience.

To achieve this we need to look at three key technologies: the backend services provided by Home.mycroft.ai; speech recognition or speech-to-text (STT); and speech synthesis or text-to-speech (TTS). For backend services, the official backend, known as Selene, is available on GitHub under the AGPL v3.0 license; alternatively, you can use the simpler Community-developed Personal Backend. You can choose to run your own STT service such as Mozilla DeepSpeech or Kaldi, however in our opinion these do not yet provide sufficient accuracy for mainstream usage. Finally, to generate speech on device, simply select the British Male voice. The more realistic sounding voices are generated on Mycroft's servers and require significant hardware to synthesize speech within a reasonable time frame.

If you are running your own services, your Mycroft installation can be directed to use them via the mycroft.conf file.
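As a rough sketch, a self-hosted setup might override the server, STT, and TTS sections of mycroft.conf along these lines. The hostname here is a placeholder, and the exact module names and options depend on the versions and plugins available in your installation:

```json
{
  "server": {
    "url": "https://backend.example.com",
    "version": "v1"
  },
  "stt": {
    "module": "deepspeech_server",
    "deepspeech_server": {
      "uri": "http://localhost:8080/stt"
    }
  },
  "tts": {
    "module": "mimic",
    "mimic": {
      "voice": "ap"
    }
  }
}
```

Here "ap" selects the on-device British Male voice mentioned above.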

Why does Mycroft not recognize my voice?

When you trigger a device using the wake word (eg "Hey Mycroft"), this uses one of two systems. Precise, the default, is trained on samples of other people saying the same phrase. Any time it hears something, it reports its confidence that it was the wake word.

The training data we have has been collected from our existing community of users and a large proportion of these are adult males from the mid-west of the USA. Because of this bias in our data, there is also a bias in our wake word models. We are working to fix this, however currently it means that Mycroft has more difficulty hearing the wake word from women, children, and those with other accents.

You can increase or decrease the likelihood that Precise will report a match by adjusting its sensitivity (see Using a Custom Wake Word), however this requires some experimentation. Set it too high and you may end up with a lot of false activations; too low and it may stop responding at all.
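For example, this threshold can be tuned in mycroft.conf. A sketch, assuming the default "hey mycroft" wake word; the exact option names depend on your mycroft-core version, and higher sensitivity values make activation more likely:

```json
{
  "hotwords": {
    "hey mycroft": {
      "module": "precise",
      "sensitivity": 0.6,
      "trigger_level": 3
    }
  }
}
```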

If you are running Mycroft on older hardware, it's also possible that Precise is not supported and the system has fallen back to using PocketSphinx. This fallback system is not as accurate and results vary wildly. You can find out which system Mycroft is using by asking:

Hey Mycroft, what is the active listener?

If Mycroft never activates at all, there might be an issue with your microphone. For this, check out our Audio Troubleshooting guide.

How fast can Mycroft respond?

By default, to answer a request Mycroft:

  1. Detects the wake word

  2. Records 3 - 10 seconds of audio

  3. Sends this audio to a cloud-based speech-to-text (STT) service

  4. Transcribes the audio and returns the text transcription

  5. Parses the text to understand the intent

  6. Sends the text to the intent handler with the highest confidence

  7. Allows the Skill to perform some action and provide the text to be spoken

  8. Synthesizes audio from the given text, either locally or remotely, depending on the text-to-speech (TTS) engine in use

  9. Plays the synthesized spoken audio.
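The steps above can be sketched in Python. This is purely illustrative — every function here is a placeholder standing in for the real wake word, STT, intent, Skill, and TTS components, and none of these names come from mycroft-core:

```python
# Illustrative sketch of Mycroft's request pipeline (hypothetical
# stand-in functions, not actual mycroft-core code).

def transcribe(audio: bytes) -> str:
    """Stand-in for the cloud STT round trip (steps 3-4)."""
    return "what is the weather"

def parse_intent(utterance: str) -> dict:
    """Stand-in for intent parsing and handler selection (steps 5-6)."""
    return {"intent": "weather.forecast", "confidence": 0.9}

def handle_intent(intent: dict) -> str:
    """Stand-in for the Skill performing its action (step 7)."""
    return "It is sunny today."

def synthesize(text: str) -> bytes:
    """Stand-in for local or remote TTS (step 8)."""
    return text.encode()

def respond(audio: bytes) -> bytes:
    """Run steps 3-9 for one recorded utterance."""
    utterance = transcribe(audio)
    intent = parse_intent(utterance)
    dialog = handle_intent(intent)
    return synthesize(dialog)  # step 9 would play this audio

print(respond(b"\x00" * 16000).decode())  # → It is sunny today.
```

Each stage adds latency, which is why the factors below matter.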

Through this process there are a number of factors that can affect the perceived speed of Mycroft's responses:

  • System resources - more processing power and memory never hurts!

  • Network latency - as it is not yet possible to perform everything on device, network latency and connection speed can play a significant role in slowing down response times.

  • Dialog structure - a long sentence will always take more time to synthesize than a short one. For this reason Mycroft breaks up longer dialog into chunks and returns one to speak whilst the next is being generated. Skill developers can help provide quicker response times by considering the structure of their dialog and breaking that dialog up using punctuation in appropriate places.

  • TTS Caching - synthesized audio is cached, meaning recently generated phrases don't need to be synthesized again; they can be returned immediately.
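The dialog chunking described above can be illustrated with a simple sketch — a hypothetical splitter, not the one mycroft-core actually uses — where dialog is split at sentence punctuation so the first chunk can be spoken while the rest is still being synthesized:

```python
import re

def chunk_dialog(dialog: str) -> list:
    """Split dialog at sentence-ending punctuation so each chunk can
    be synthesized independently (a sketch, not mycroft-core's own
    splitter)."""
    # Keep the punctuation attached to the end of each chunk.
    chunks = re.findall(r"[^.!?]+[.!?]?", dialog)
    return [c.strip() for c in chunks if c.strip()]

print(chunk_dialog("Here is the forecast. It will be sunny! Highs near 20 degrees."))
# → ['Here is the forecast.', 'It will be sunny!', 'Highs near 20 degrees.']
```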

We have also been experimenting with the use of streaming STT services. These transcribe audio as it is received, rather than waiting for the entire utterance to finish and then sending the complete audio file to a server to be processed. It is possible to switch to a streaming STT service, however at present this is not available by default and requires a paid third-party service. See Switching STT Engines for a list of the options available.

How can I create my own TTS voice?

The best answer is provided by @Thorsten, who documented his journey creating a custom TTS model in German.

It is worth noting that training your own TTS model is a significant investment of time. We strongly recommend watching Thorsten's entire video before you get started. If a one hour video sounds too long, be warned that the process itself will take a minimum of weeks, and more likely months.

There are exciting new projects that may soon enable us to generate new voices from just minutes of recorded audio. Currently, however, it requires 16+ hours of very consistent, high-quality audio, along with the associated text metadata.

To capture this training data we have the Mimic Recording Studio. Note that this generates audio files which can be used to train TTS models using a range of technologies, not just Mycroft's Mimic.

The Mycroft AI Store

Do you cover VAT / GST / other import taxes?

No. Purchases from Mycroft do not currently include any taxes or other importation fees. Unless otherwise stated, all products are shipped from the USA, which means that a product shipped to another country may incur additional taxes and import fees. These are the sole responsibility of the customer, and Mycroft will not reimburse any costs associated with these local fees and taxes.
