
Whisper

An implementation using WhisperNet, a C# wrapper for whisper.cpp.

Note

Keep in mind that this implementation of Whisper uses your GPU.

The larger models may cause stuttering in a GPU-intensive game like VRChat, especially while in VR.

Requirements

  • The only supported platform is 64-bit Windows.
  • It should work on Windows 8.1 or newer, but I have only tested it on Windows 10 and Windows 11.
  • The library requires a Direct3D 11.0 capable GPU, which in 2023 simply means “any hardware GPU”. The most recent GPU without D3D 11.0 support was Intel Sandy Bridge from 2011.
  • On the CPU side, the library requires AVX1 and F16C support.
  • Essentially, if your CPU and GPU are from 2011 or earlier, support is not guaranteed.
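
The platform and CPU requirements above can be expressed as a small checklist. The following is a minimal Python sketch, not part of TTS Voice Wizard: the function name `missing_requirements` and the lowercase flag names `avx` and `f16c` are assumptions made for illustration, the CPU feature list has to be supplied by hand, and the Direct3D 11 GPU requirement is not checked at all.

```python
import platform

# Lowercase CPU feature names are an assumption of this sketch.
REQUIRED_CPU_FLAGS = {"avx", "f16c"}


def missing_requirements(os_name, is_64bit, cpu_flags):
    """Return a list of problems; an empty list means these basic checks pass."""
    problems = []
    if os_name != "Windows" or not is_64bit:
        problems.append("64-bit Windows is required")
    # Compare case-insensitively so "AVX" and "avx" are treated the same.
    missing = REQUIRED_CPU_FLAGS - {flag.lower() for flag in cpu_flags}
    if missing:
        problems.append("CPU lacks: " + ", ".join(sorted(missing)))
    return problems


# OS name and architecture come from the standard library; the CPU features
# below are placeholders that you would replace with your own CPU's features.
looks_64bit = platform.machine().endswith("64")
print(missing_requirements(platform.system(), looks_64bit, ["avx", "f16c"]))
```

An empty list only means that none of the listed problems were found; it does not guarantee that the larger models will run smoothly.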

Switching Models

  • Choose a model from the Auto Download Model dropdown, then click the Download or Select button to download and/or select it.


Manually Adding Models

  1. To get started with Whisper, download one of the models below, or one from the official whisper.cpp model list.

     Recommended Model Download    Size      Memory
     ggml-medium.bin               1.5 GB    ~2.6 GB
     ggml-small.bin                466 MB    ~1.0 GB
     ggml-base.bin                 142 MB    ~500 MB
     ggml-tiny.bin                 75 MB     ~390 MB

  2. Add the model under Speech Provider > Local > Whisper.cpp Model (BIN file).
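
If you are unsure which model to download, the memory column can guide the choice. The sketch below is only an illustration: the figures are copied from the list above (converted to MB), while the function name `largest_model_within` and the idea of a fixed memory budget are assumptions, not features of TTS Voice Wizard.

```python
# (file name, download size in MB, approximate memory use in MB),
# taken from the recommended model list.
MODELS = [
    ("ggml-tiny.bin", 75, 390),
    ("ggml-base.bin", 142, 500),
    ("ggml-small.bin", 466, 1000),
    ("ggml-medium.bin", 1500, 2600),
]


def largest_model_within(memory_budget_mb):
    """Return the largest listed model whose approximate memory use fits the budget, or None."""
    fitting = [model for model in MODELS if model[2] <= memory_budget_mb]
    if not fitting:
        return None
    # The model with the highest memory use among those that fit is the largest one.
    return max(fitting, key=lambda model: model[2])[0]


# Example: a hypothetical budget of about 1.2 GB.
print(largest_model_within(1200))
```

Since larger models may cause stuttering in GPU-intensive games, it can be worth leaving generous headroom rather than picking the largest model that barely fits.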

Options

  • Min Duration: the minimum audio chunk length
  • Max Duration: the maximum audio chunk length

Tips

  • Noises that Whisper recognizes, such as music, keyboard typing, and mouse clicks, are filtered out by default.

  • If you notice that Whisper produces overlapping messages that play at the same time, try using the Message Queue System found in the settings tab.

  • Try using noise-filtering software such as Nahimic to filter out the background noise coming through your microphone.

  • If your computer has two GPUs (like most gaming laptops), make sure that TTS Voice Wizard is using your high-performance GPU. For Nvidia GPUs, this can be set from the Nvidia Control Panel.
