file (text): The audio file object (not a filename) to translate, in one of the following formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
model (text): The ID of the model to use. Currently, only whisper-1 (which is powered by our open-source Whisper V2 model) is available.
prompt (text): Optional text used to guide the model's style or to continue the previous audio clip. Prompts should be in English.
response_format (text): The format of the output. Must be one of json, text, srt, verbose_json, or vtt.
temperature (text): The sampling temperature, between 0 and 1. Higher values (e.g., 0.8) make the output more random, while lower values (e.g., 0.2) make it more focused and deterministic. If set to 0, the model uses log probability to automatically increase the temperature until certain thresholds are reached.
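
The constraints listed above can be checked on the client before a request is sent. The following is a minimal sketch in Python, not an official client: the helper names `is_supported_audio` and `build_translation_fields` are hypothetical, and the defaults chosen for `response_format` and `temperature` are assumptions of this sketch rather than values stated in the list. It only validates the non-file parameters and returns them as text form fields; sending the request is left out.

```python
from pathlib import Path
from typing import Optional

# Values copied from the parameter list above.
SUPPORTED_AUDIO_FORMATS = {
    "flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm",
}
RESPONSE_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}


def is_supported_audio(filename: str) -> bool:
    """Return True if the file extension is one of the listed audio formats."""
    extension = Path(filename).suffix.lower().lstrip(".")
    return extension in SUPPORTED_AUDIO_FORMATS


def build_translation_fields(
    model: str = "whisper-1",
    prompt: Optional[str] = None,
    response_format: str = "json",  # default is an assumption of this sketch
    temperature: float = 0.0,       # default is an assumption of this sketch
) -> dict:
    """Validate the non-file parameters and return them as text form fields.

    Hypothetical helper: it mirrors the documented constraints only.
    """
    if model != "whisper-1":
        raise ValueError("only whisper-1 is currently available")
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"response_format must be one of {sorted(RESPONSE_FORMATS)}")
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")

    fields = {
        "model": model,
        "response_format": response_format,
        "temperature": str(temperature),
    }
    # prompt is optional, so it is only included when provided.
    if prompt is not None:
        fields["prompt"] = prompt
    return fields


# Example usage with a hypothetical file name.
print(is_supported_audio("interview.m4a"))
print(build_translation_fields(response_format="srt", temperature=0.2))
```

The file itself would be sent alongside these fields as an opened file object, for example one obtained with `open(path, "rb")`, since the `file` parameter expects the object rather than its name.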