Technology Overview

A broad overview of the technology that makes up Mycroft AI.

Last updated 4 years ago

Mycroft is the name of a suite of software and hardware tools that use natural language processing and machine learning to provide an open source voice assistant.

Mycroft components

Mycroft is modular. Some components can be easily 'swapped out' for others:

  • Wake Word detection

  • Speech to Text (STT)

  • Intent parser

Wake Word detection

A Wake Word is a phrase you use to tell Mycroft you're about to issue a command. By default, this is Hey Mycroft, but you can configure your own Wake Word in your Mycroft Home account.

Mycroft currently uses two technologies for Wake Word detection:

  • PocketSphinx: PocketSphinx is part of the broader CMUSphinx package, developed by Carnegie Mellon University. PocketSphinx is a lightweight speech recognition engine, specifically tuned for handheld and mobile devices.

  • Precise: Unlike PocketSphinx, which is based on Speech to Text technology, Precise is a neural network that is trained on audio data. It doesn't matter what words you want to use for your Wake Word, because Precise is trained on sounds rather than words. The downside is that Precise needs to be trained on your chosen Wake Word. Precise is the default Wake Word listener for the "Hey Mycroft" Wake Word; PocketSphinx provides a fallback if Precise is unavailable.

Because PocketSphinx is trained on English speech, a PocketSphinx Wake Word currently needs to be an English phrase, like Hello Mike, Hi there Mickey or Hey Mike. Wake Words in other languages, like Spanish, French or German, won't work as well.

Speech to Text (STT)

Speech to Text (STT) software takes spoken words and turns them into text phrases that can then be acted on.

We are working with Mozilla to build DeepSpeech, a fully open source STT engine based on Baidu's Deep Speech architecture and implemented with Google's TensorFlow framework.

DeepSpeech is not yet ready for production use, and Mycroft currently uses Google STT as the default STT engine.

Mycroft also supports other STT engines that can be configured using the Configuration Manager:

  • IBM Watson Speech to Text (IBM API key required)

  • wit.ai Speech to Text (wit.ai API key required)

Intent parser

An intent parser is software that identifies the user's intent based on their speech. An intent parser usually takes the output of a Speech to Text (STT) engine as its input.

For example, Julie speaks the following to Mycroft: Hey Mycroft, tell me about the weather

Julie's intent is to find out about the weather (probably in her current location).

An intent parser can then match the intent with a suitable Skill to handle it.

Mycroft uses two intent parsers:

  • Adapt intent parser: Adapt is the default intent parser for all Mycroft platforms. Adapt was developed by Mycroft and is available under an open source license.

  • Padatious: Padatious is a neural network based intent parser. Padatious is currently under active development by Mycroft and is available under an open source license. It is likely that some Mycroft platforms will switch to using Padatious in the future instead of Adapt.

Text to Speech

Text to Speech (TTS) software takes written text, such as text files on a computer, and uses a voice to speak the text. Text to Speech can have different voices, depending on the TTS engine used:

  • Mimic: Mycroft's default local Text to Speech (TTS) engine, based on CMU's Flite (Festival Lite).

  • Mimic2: Mycroft's own cloud-based Text to Speech (TTS) engine, based on Tacotron, which provides much better voice quality.

In your home.mycroft.ai account, you can select voices from these engines, as well as from:

  • Google TTS: you need to choose which voice to use.

Even more TTS engines are available, but they require manual configuration.

Middleware

The Mycroft middleware has two components:

  • Mycroft Core: this code, written in Python, is the core software that provides the 'glue' between other modules. Mycroft Core is available under an Apache 2.0 open source license.

  • Mycroft Home and Mycroft API: this is the platform where data on Users and Devices is held. This platform provides abstraction services, such as storing API keys that are used to access third-party services to provide Skill functionality. The code for this platform is available under an AGPL 3.0 open source license.

Mycroft Skills

Mycroft Skills are like 'add-ons' or 'plugins' that provide additional functionality. Skills can be developed by Mycroft Developers, or by Community Developers, and vary in their functionality and maturity.

Mycroft Skills Kit (MSK) is a Python-based utility that makes it easier for Skill Authors to create, test and submit Skills to the Skills Marketplace.

Mycroft Skills Manager (MSM) is a command line tool used to add, manage and remove Skills on any Mycroft installation.

Devices and Enclosures

Mycroft is designed to run on many different platforms. Each dedicated platform is called a device. These include:

  • Mark 1 - our first reference hardware device using a dedicated software image.

  • Mark 2 - our latest reference hardware device using a dedicated software image.

  • Picroft - any Raspberry Pi 3 or 4 that is running the Picroft software image.

The enclosure refers to the code that is specific to a device. It might define unique functionality, such as the eyes on the Mark 1, or a specific way of interacting with the hardware, such as controlling the volume levels at a hardware level via I2C.
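
The modular design described on this page (Wake Word detection, Speech to Text, an intent parser, Skills and Text to Speech as separate, swappable stages) can be illustrated with a small sketch. This is not Mycroft Core code and does not use the Adapt, Padatious, Precise or Mimic APIs. Every name below (Assistant, Intent, keyword_intent_parser, fake_stt, VOCAB) is hypothetical, "audio" is simulated with plain strings, and the intent parser is a toy keyword matcher that only stands in for a real one.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, Optional, Set


@dataclass
class Intent:
    """Hypothetical parsed intent: its name plus the text that triggered it."""
    name: str
    utterance: str


def keyword_intent_parser(text: str, vocab: Dict[str, Set[str]]) -> Optional[Intent]:
    """Toy parser: choose the intent whose keywords overlap the utterance most."""
    words = set(text.lower().split())
    best_name, best_score = None, 0
    for name, keywords in vocab.items():
        score = len(words & keywords)
        if score > best_score:
            best_name, best_score = name, score
    return Intent(best_name, text) if best_name else None


@dataclass
class Assistant:
    """Each stage is a plain callable, so any one of them can be swapped out."""
    detect_wake_word: Callable[[str], bool]           # simulated audio -> heard?
    stt: Callable[[str], str]                         # simulated audio -> text
    parse: Callable[[str], Optional[Intent]]          # text -> intent (or None)
    skills: Dict[str, Callable[[Intent], str]]        # intent name -> handler
    tts: Callable[[str], str]                         # reply text -> "speech"

    def handle(self, audio: str) -> Optional[str]:
        if not self.detect_wake_word(audio):
            return None  # stay idle until the Wake Word is heard
        intent = self.parse(self.stt(audio))
        if intent is None or intent.name not in self.skills:
            return self.tts("Sorry, I don't understand.")
        return self.tts(self.skills[intent.name](intent))


# Assumed vocabulary and stand-in components, for illustration only.
VOCAB = {"weather": {"weather", "forecast"}, "time": {"time", "clock"}}


def fake_stt(audio: str) -> str:
    """Pretend transcription: drop the Wake Word and keep the command."""
    return audio.lower().replace("hey mycroft", "", 1).strip(" ,")


assistant = Assistant(
    detect_wake_word=lambda audio: audio.lower().startswith("hey mycroft"),
    stt=fake_stt,
    parse=lambda text: keyword_intent_parser(text, VOCAB),
    skills={
        "weather": lambda intent: "It is sunny.",
        "time": lambda intent: "It is noon.",
    },
    tts=lambda text: f"[spoken] {text}",
)

print(assistant.handle("Hey Mycroft, tell me about the weather"))

# Swapping one component leaves the rest of the pipeline untouched.
shouting = replace(assistant, tts=lambda text: text.upper())
print(shouting.handle("Hey Mycroft, tell me about the weather"))
```

Because every stage is passed in as a value, replacing the TTS stage in the last lines changes only how the reply is rendered; the Wake Word check, transcription, intent matching and Skill lookup are reused as they are. That is the sense in which components can be 'swapped out'.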