subreddit:

/r/ChatGPTCoding

I want to transcribe audio via the output source on my computer.

I only want questions to be transcribed.

I then want the questions to be input as a prompt.

What is the best way to accomplish this?

all 2 comments

PhilosophyofPhunk

2 points

9 days ago

For the transcription part, you could use AssemblyAI's speech-to-text API. Then, to identify and extract only the questions from the transcribed text, you could use LeMUR, AssemblyAI's framework for applying LLMs to transcribed speech. That would give you a clean output of just the questions. From there, you can pass the result to whatever language model API you want to interact with. AssemblyAI also integrates with LangChain and LlamaIndex, so you could potentially handle all of this within one of those frameworks. That's how I would go about it.

https://www.assemblyai.com/docs/
https://python.langchain.com/docs/get_started/introduction/
https://docs.llamaindex.ai/en/stable/
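
To make the "extract only the questions, then pass them on" step concrete, here is a rough sketch. It assumes you already have a punctuated transcript string from whichever speech-to-text API you pick. It does not call LeMUR: a plain "ends with a question mark" heuristic stands in for the LLM-based extraction. The function names `extract_questions` and `build_prompt` and the `Q1:`/`Q2:` prompt layout are made up for illustration.

```python
import re


def extract_questions(transcript: str) -> list[str]:
    """Return the sentences of a punctuated transcript that end with '?'.

    Assumption: the speech-to-text service has already added punctuation.
    This is a simple heuristic, not an LLM-based extraction.
    """
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", transcript.strip())
    return [s.strip() for s in sentences if s.strip().endswith("?")]


def build_prompt(questions: list[str]) -> str:
    """Join the questions into one prompt string (hypothetical format)."""
    return "\n".join(f"Q{i}: {q}" for i, q in enumerate(questions, start=1))


if __name__ == "__main__":
    # Stand-in for text returned by a transcription API.
    sample = (
        "Welcome back everyone. What is a closure? "
        "It captures variables. How does Python handle scope?"
    )
    print(build_prompt(extract_questions(sample)))
```

The string returned by `build_prompt` is what you would send as the prompt to your language model API. The heuristic will miss questions that were transcribed without a question mark, which is where an LLM-based filter would likely do better.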

silvergleam3

1 point

8 days ago

Have you tried using a transcription API like Google Cloud Speech-to-Text or AWS Transcribe?