LLM API
Audio transcription
Convert spoken audio into text with whisper-large-v3-turbo.
Transcribe a file
Unlike the JSON endpoints, transcription takes a multipart form upload. Send the audio in the file field and the model name, whisper-large-v3-turbo, in the model field.
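To make the multipart shape concrete, here is a minimal sketch that builds the request body by hand with the Python standard library. The helper names build_multipart and transcribe are hypothetical and only for illustration; the file and model field names, the endpoint URL, and the text field of the default response come from this page. Treat it as one possible way to do the upload, not the reference client.

```python
import json
import os
import urllib.request
import uuid


def build_multipart(fields, file_field, filename, file_bytes,
                    content_type="application/octet-stream"):
    """Encode text fields plus one file as multipart/form-data.

    Hypothetical helper. Returns (body_bytes, content_type_header).
    """
    boundary = uuid.uuid4().hex
    parts = []
    # One part per plain text field, e.g. model=whisper-large-v3-turbo.
    for name, value in fields.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
        )
        parts.append(header.encode("utf-8") + str(value).encode("utf-8") + b"\r\n")
    # The file part carries a filename and the raw audio bytes.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    )
    parts.append(file_header.encode("utf-8") + file_bytes + b"\r\n")
    # Closing boundary marks the end of the form.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, api_key,
               url="https://llm.upgreat.ai/v1/audio/transcriptions"):
    """Upload one audio file and return the transcript text (default JSON format)."""
    with open(path, "rb") as f:
        body, ctype = build_multipart(
            {"model": "whisper-large-v3-turbo"},
            "file",
            os.path.basename(path),
            f.read(),
        )
    req = urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": ctype},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["text"]
```

Most HTTP libraries and SDK clients do this encoding for you, so in practice the upload is usually a single call: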
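# Assumes `client` is an already-configured API client (setup not shown here).
with open("meeting.mp3", "rb") as f:
    # Open in binary mode; the file is sent as a multipart form upload.
    resp = client.audio.transcriptions.create(
        model="whisper-large-v3-turbo",
        file=f,
    )
print(resp.text)
Response formats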
By default you get JSON with a text field. Use response_format to choose another shape:
- json: { "text": "…" } (default).
- text: plain text, no envelope.
- verbose_json: text plus segment metadata.
- srt / vtt: subtitle files.
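Because the response shape depends on response_format, client code has to decide whether to parse JSON or read the body as text. The sketch below shows one way to do that. The function name parse_transcription is hypothetical, and it assumes that text, srt, and vtt responses arrive as UTF-8 text bodies, which this page implies but does not state outright.

```python
import json

# Formats listed on this page, grouped by how the body is read.
JSON_FORMATS = {"json", "verbose_json"}
TEXT_FORMATS = {"text", "srt", "vtt"}


def parse_transcription(body: bytes, response_format: str = "json"):
    """Return a dict for JSON formats, or a str for text and subtitle formats.

    Hypothetical helper; assumes non-JSON formats are UTF-8 text.
    """
    if response_format in JSON_FORMATS:
        return json.loads(body.decode("utf-8"))
    if response_format in TEXT_FORMATS:
        return body.decode("utf-8")
    raise ValueError(f"unknown response_format: {response_format!r}")


# Default format: a JSON envelope with a text field.
print(parse_transcription(b'{"text": "hello"}')["text"])
# Plain text format: the body is the transcript itself.
print(parse_transcription(b"hello", "text"))
```

The request below asks for srt, so under this assumption its response body is the subtitle file itself and can be written straight to disk.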
curl https://llm.upgreat.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer $UPGREAT_API_KEY" \
  -F "model=whisper-large-v3-turbo" \
  -F "file=@talk.wav" \
  -F "response_format=srt"
Word and segment timings
Pass response_format=verbose_json together with timestamp_granularities[]=word (or segment) to get timestamps you can use to align captions or build a transcript player.
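As a rough illustration of caption alignment, the sketch below groups word timings into short cues and renders them as SRT. It assumes the verbose_json response contains a words list whose entries have word, start, and end keys, with times in seconds; that shape is an assumption and is not confirmed on this page, so check a real response first. The helper names are hypothetical.

```python
# Form fields named on this page for requesting word-level timings.
TIMING_FIELDS = [
    ("model", "whisper-large-v3-turbo"),
    ("response_format", "verbose_json"),
    ("timestamp_granularities[]", "word"),
]


def format_srt_time(seconds: float) -> str:
    """Render seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def _make_cue(group):
    # A cue spans from the first word's start to the last word's end.
    return {
        "start": group[0]["start"],
        "end": group[-1]["end"],
        "text": " ".join(w["word"].strip() for w in group),
    }


def words_to_cues(words, max_chars=40):
    """Group word timings into cues of at most max_chars characters.

    Assumes each item looks like {"word": str, "start": float, "end": float}.
    A single word longer than max_chars still gets its own cue.
    """
    cues, current = [], []
    for w in words:
        candidate = " ".join(x["word"].strip() for x in current + [w])
        if current and len(candidate) > max_chars:
            cues.append(_make_cue(current))
            current = []
        current.append(w)
    if current:
        cues.append(_make_cue(current))
    return cues


def cues_to_srt(cues) -> str:
    """Render cues as SRT text: index, time range, caption, blank line."""
    blocks = []
    for i, c in enumerate(cues, start=1):
        time_range = f"{format_srt_time(c['start'])} --> {format_srt_time(c['end'])}"
        blocks.append(f"{i}\n{time_range}\n{c['text']}\n")
    return "\n".join(blocks)
```

A transcript player can use the same cues directly: look up the cue whose start and end bracket the current playback position and highlight its text.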