
[FEATURE]: Speech-to-Text Voice Input for Lazy People in OpenCode

Open Fuzu opened this issue 2 months ago • 6 comments

Feature hasn't been suggested before.

  • [x] I have verified this feature I'm about to request hasn't been suggested before.

Describe the enhancement you want to request

Hi! First of all, congratulations on the amazing project.

I've been working on a Speech-to-Text voice input feature that integrates directly into the TUI. It allows users to start audio recording with a keybind, automatically transcribe speech using different providers, and insert the resulting text directly into the prompt.

I've built an initial working version, currently tested only on macOS. It includes:

  • Real-time audio recording via FFmpeg;
  • Support for Groq Whisper, OpenAI Whisper, and local whisper.cpp;
  • Automatic microphone/device detection;
  • Interactive menus for choosing provider, model, and audio device;
  • Persistent configuration stored in ~/.opencode/state/speech.json;
  • Customizable keybinds (Ctrl+X v, Ctrl+X P, Ctrl+X D);
  • Smooth flow: record → transcribe → insert into prompt input.
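
For readers wondering what the recording and configuration pieces might look like, here is a minimal sketch, not the code from this feature. Only the `~/.opencode/state/speech.json` path and the use of FFmpeg come from the description above. The config keys, the default values, and the helper names `load_config`, `save_config`, and `ffmpeg_record_args` are assumptions made for illustration. The macOS branch uses FFmpeg's `avfoundation` input, where `":N"` selects audio device N without video.

```python
import json
import sys
from pathlib import Path

# Path taken from the issue text; the keys and defaults below are assumed,
# not the feature's real schema.
CONFIG_PATH = Path.home() / ".opencode" / "state" / "speech.json"
DEFAULTS = {"provider": "groq", "model": "whisper-large-v3", "device": "0"}


def load_config(path=CONFIG_PATH):
    """Return saved settings merged over DEFAULTS; tolerate a missing or corrupt file."""
    try:
        saved = json.loads(Path(path).read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        saved = {}
    if not isinstance(saved, dict):
        saved = {}
    return {**DEFAULTS, **saved}


def save_config(config, path=CONFIG_PATH):
    """Persist settings as JSON, creating the state directory if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2))


def ffmpeg_record_args(device, out_path, platform=sys.platform):
    """Build an FFmpeg argument list that records mono 16 kHz audio to out_path."""
    if platform != "darwin":
        # The issue says the feature was only tested on macOS, so this sketch
        # does not guess at input formats for other platforms.
        raise NotImplementedError(f"no recording recipe for {platform!r}")
    return [
        "ffmpeg", "-y",               # overwrite an existing output file
        "-f", "avfoundation",         # macOS capture framework
        "-i", f":{device}",           # audio-only input: no video, audio device N
        "-ac", "1", "-ar", "16000",   # mono, 16 kHz
        str(out_path),
    ]
```

The argument list could be handed to `subprocess.Popen` when the keybind starts recording, with the process stopped when recording ends. The resulting file would then go to whichever provider the config names.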

Would this be something you'd be interested in integrating into the project?

Fuzu avatar Nov 24 '25 18:11 Fuzu

That sounds cool. Did you integrate it using the plugin system, and if not, why? Maybe we need to expand it to allow for stuff like this.

rekram1-node avatar Nov 25 '25 04:11 rekram1-node

This ticket inspired me to create this: https://github.com/chuckstack/groq-whisper

This is not nearly as good or integrated as what is described above; however I could not wait for the above to be accepted. I thought you would appreciate seeing a generic tools approach to implementing it.

You can use it from:

  • opencode: !groq-whisper # this injects the response directly into the context without the ability to edit prior to injection
  • vim: :r groq-whisper # this allows you to ctrl+p => open editor (vim) and capture the text before submitting
  • terminal: groq-whisper # can be used outside of opencode
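
One reason a single command can serve all three contexts is that it writes only the transcript to stdout, so whatever captures the output gets clean text. The sketch below illustrates that convention. It is not the groq-whisper source: `run` and `placeholder_transcribe` are hypothetical names, and the placeholder stands in for a real provider call.

```python
import sys


def run(transcribe, audio_path, out=sys.stdout, err=sys.stderr):
    """Write only the transcript to `out`; status and errors go to `err`.

    Returns a process-style exit code (0 on success, 1 on failure).
    """
    print(f"transcribing {audio_path}...", file=err)
    try:
        text = transcribe(audio_path)
    except Exception as exc:  # report any provider failure without polluting stdout
        print(f"error: {exc}", file=err)
        return 1
    out.write(text.strip() + "\n")
    return 0


def placeholder_transcribe(audio_path):
    # Hypothetical stand-in: a real tool would send the audio to a
    # speech-to-text provider here and return the recognized text.
    return f"(transcript of {audio_path})"
```

A script entry point would call `sys.exit(run(placeholder_transcribe, path))`. Because progress messages go to stderr, only the transcript ends up in the captured output.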

I hope this helps!

edit to above details:

  • only tested on debian (see notes for mac)
  • only uses groq whisper

cboecking avatar Nov 25 '25 18:11 cboecking

Would be really awesome!

sfpmld avatar Dec 29 '25 13:12 sfpmld

I was just looking for something like this today, it'd be awesome to see it implemented!

nanoandrew4 avatar Jan 12 '26 00:01 nanoandrew4

bump for interest!

0x7C2f avatar Jan 15 '26 20:01 0x7C2f

Hey! Same here! I'd love it!

KristjanMinn avatar Jan 17 '26 07:01 KristjanMinn

Hi, check out another implementation, #9264, which supports both Whisper models and audio large language models (gpt-4o, qwen3-omni, etc.).

heimoshuiyu avatar Jan 18 '26 18:01 heimoshuiyu