Speech-To-Text

Tip: DeepGram is available only with VoiceWizardPro. Subscribe on Patreon or Ko-fi to unlock it.

  • Convert speech to text and send it through OSC (to VRChat or anywhere else)

  • Change the speech-to-text method under Settings > Audio > Speech to Text

  • Each of these methods requires some setup (except System Speech). Click the name of a Speech-to-Text method to go to its wiki page for more information.
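To illustrate what "send it through OSC" in the first bullet means in practice, here is a minimal sketch that packs a piece of recognized text into an OSC message and sends it over UDP. It is not how the application itself is implemented. It assumes VRChat's chatbox address `/chatbox/input` and its default OSC input port 9000 on the local machine; the function names `osc_string`, `build_chatbox_message`, and `send_text` are hypothetical and exist only for this example.

```python
import socket


def osc_string(value: str) -> bytes:
    """Encode an OSC string: null-terminated, then padded to a multiple of 4 bytes."""
    data = value.encode("utf-8") + b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def build_chatbox_message(text: str, send_now: bool = True, notify: bool = False) -> bytes:
    """Build an OSC message for the (assumed) VRChat address /chatbox/input.

    Arguments are: the text, a boolean to send immediately, and a boolean for
    the notification sound. OSC booleans are carried entirely by the type tags
    'T' and 'F', so they add no bytes to the argument section.
    """
    type_tags = ",s" + ("T" if send_now else "F") + ("T" if notify else "F")
    return osc_string("/chatbox/input") + osc_string(type_tags) + osc_string(text)


def send_text(text: str, host: str = "127.0.0.1", port: int = 9000) -> None:
    """Send the text as a single UDP datagram (host and port are assumptions)."""
    packet = build_chatbox_message(text)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, (host, port))
```

With OSC enabled in VRChat, calling `send_text("Hello from speech to text")` should show the text in the chatbox. Any other OSC receiver can be targeted by changing the host, port, and address.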

STT Methods List

| Speech-to-Text Method | Description | Free Pricing | Continuous |
| --- | --- | --- | --- |
| System Speech | The default method, with the worst recognition quality, although it can be improved with training and by editing the speech dictionary. | Unlimited | yes |
| Azure | Great recognition quality without needing to sacrifice computational resources. Built-in translations. | 5 speech recognition hours + 5 speech translation hours. This goes much further than it seems when not using continuous recognition. (For example, once your recognition hours run out you can translate from English to English, for 10 total hours.) | both |
| Vosk | OK recognition quality at the cost of computational resources (CPU and RAM). Can have higher recognition quality than Web Captioner depending on the model used. (Does not work on the x86 version.) | Unlimited | yes |
| Web Captioner | OK recognition quality using the "Web Speech API" through Web Captioner. Only available on Google Chrome. Multi-language support. | Unlimited | yes |
| Whisper | Amazing recognition quality at the cost of computational resources (GPU and RAM). Can have higher recognition accuracy than Azure depending on the model used. Experimental implementation. (Does not work on the x86 version.) | Unlimited | yes |
| DeepGram | Similar quality to Azure recognition. | Only available with Voice Wizard Pro; limits vary with the selected tier. | both |