Speech to Text step
Introduction
The "Speech to Text" step leverages OpenAI's capabilities to convert audio files into written text, utilizing the Whisper model for accurate transcription.
Configuration
API Token: Your OpenAI API token, which is necessary for accessing the speech-to-text service. This token must be valid and have the appropriate permissions.
Model: The specific model used for transcription, with "whisper-1" set as the default. OpenAI's Whisper models are designed for high accuracy in various languages and audio conditions.
File: The audio file to be transcribed. This file should contain clear audio of the spoken content you wish to convert into text.
Language: (Optional) The ISO-639-1 code of the language spoken in the audio. Specifying the language can improve the accuracy and speed of transcription.
Prompt: (Optional) A text prompt that guides the model's style or continues the context of a previous audio segment. This is particularly useful when transcribing multi-part audio.
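For orientation, the configuration fields above correspond closely to the parameters of OpenAI's audio transcription endpoint. The snippet below is a minimal sketch using the official `openai` Python SDK; the file path, language, and prompt values are placeholders, and error handling is omitted.

```python
from openai import OpenAI

# API Token: the OpenAI token configured in the step (placeholder value here).
client = OpenAI(api_key="YOUR_OPENAI_API_TOKEN")

# File: the audio file to transcribe (path is a placeholder).
with open("meeting_part_2.mp3", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        model="whisper-1",      # Model: "whisper-1" is the default
        file=audio_file,        # File: clear audio of the spoken content
        language="en",          # Language (optional): ISO-639-1 code
        prompt="Continuation of part 1 of the meeting.",  # Prompt (optional)
    )

# The transcribed speech is returned as plain text.
print(transcription.text)
```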
Outputs
Text: The transcribed text obtained from the audio file. This output provides the spoken content in written form, ready for use in subsequent steps.
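As a hypothetical illustration of consuming this output, a subsequent step might store the transcript or inspect its length; the variable names below are illustrative only and continue from the sketch above.

```python
# transcription.text from the previous sketch, as a plain string.
transcript_text = transcription.text

# Example downstream use: persist the transcript and report its size.
with open("transcript.txt", "w", encoding="utf-8") as f:
    f.write(transcript_text)

print(f"Transcript contains {len(transcript_text.split())} words.")
```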