Built by Metorial, the integration platform for agentic AI.
List transcripts with pagination and optional filters. Returns transcript summaries sorted from newest to oldest. Supports filtering by status and creation date, and cursor-based pagination using before/after IDs.
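As a rough illustration of the cursor-based pagination described above, the sketch below walks every page by passing the ID of the last transcript seen as the next "before" cursor. `iter_transcripts` and `fetch_page` are hypothetical names, not part of the tool. `fetch_page` stands in for whatever call actually lists transcripts, and the in-memory `fake_fetch` exists only so the example runs offline. It also assumes that a "before" cursor returns the next older page, which matches the newest-to-oldest ordering but is not confirmed by the description.

```python
from typing import Callable, Dict, Iterator, List, Optional

Page = List[Dict[str, str]]


def iter_transcripts(
    fetch_page: Callable[[Optional[str]], Page],
    max_pages: int = 100,
) -> Iterator[Dict[str, str]]:
    """Yield transcript summaries page by page, newest to oldest.

    fetch_page(before_id) is a hypothetical callable that returns one page
    of summaries older than before_id (or the newest page when it is None).
    """
    before_id: Optional[str] = None
    for _ in range(max_pages):  # hard cap so a misbehaving cursor cannot loop forever
        page = fetch_page(before_id)
        if not page:
            return  # an empty page means there is nothing older left
        yield from page
        before_id = page[-1]["id"]  # oldest item on this page becomes the cursor


# Offline stand-in for the real listing call: five summaries, newest first.
_FAKE = [{"id": f"t{n}", "status": "completed"} for n in range(5, 0, -1)]


def fake_fetch(before_id: Optional[str], limit: int = 2) -> Page:
    start = 0
    if before_id is not None:
        start = next(i for i, t in enumerate(_FAKE) if t["id"] == before_id) + 1
    return _FAKE[start:start + limit]


ids = [t["id"] for t in iter_transcripts(fake_fetch)]
```

Injecting the page-fetching function keeps the loop independent of any particular client, so the same generator works whether the pages come from this tool or from a direct HTTP call.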
Retrieve a completed transcript's text segmented into sentences or paragraphs. The API segments the text semantically to make it easier to read. Choose "sentences" or "paragraphs" depending on how granular the output needs to be.
Export a completed transcript in SRT or VTT format for use as subtitles or closed captions in video players. Optionally limit the number of characters per caption line.
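To give a sense of what the exported text looks like, the sketch below formats a single caption cue by hand. It is not how the tool produces its output. `format_timestamp` and `format_srt_cue` are hypothetical helpers that only illustrate the standard timestamp layout: SRT separates milliseconds with a comma, VTT with a dot.

```python
def format_timestamp(ms: int, vtt: bool = False) -> str:
    """Render milliseconds as HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (VTT)."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    sep = "." if vtt else ","
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}{sep}{millis:03d}"


def format_srt_cue(index: int, start_ms: int, end_ms: int, text: str) -> str:
    """Build one SRT cue: sequence number, time range, then the caption text."""
    time_range = f"{format_timestamp(start_ms)} --> {format_timestamp(end_ms)}"
    return f"{index}\n{time_range}\n{text}\n"


cue = format_srt_cue(1, 0, 2500, "Hello and welcome.")
```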
Search through a completed transcript for specific keywords. You can search for individual words, numbers, or phrases of up to five words. Returns match counts and timestamps for each keyword found.
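Because phrases are limited to five words, it can help to check keywords before sending them. The following is a minimal client-side sketch. `validate_keywords` is a hypothetical helper, and whether the tool itself rejects longer phrases or handles them some other way is not stated here.

```python
from typing import Iterable, List

MAX_WORDS_PER_PHRASE = 5  # limit taken from the description above


def validate_keywords(keywords: Iterable[str]) -> List[str]:
    """Normalize whitespace, drop empty entries, and reject over-long phrases."""
    cleaned: List[str] = []
    for keyword in keywords:
        phrase = " ".join(keyword.split())  # trim and collapse runs of whitespace
        if not phrase:
            continue  # skip blank entries
        word_count = len(phrase.split(" "))
        if word_count > MAX_WORDS_PER_PHRASE:
            raise ValueError(
                f"{phrase!r} has {word_count} words; "
                f"the limit is {MAX_WORDS_PER_PHRASE}"
            )
        cleaned.append(phrase)
    return cleaned


checked = validate_keywords(["invoice", "  42 ", "terms of   service", ""])
```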
Delete a transcript. The transcript resource itself remains and is marked as deleted, but its data is permanently removed. Any files uploaded via the upload endpoint are deleted immediately along with the transcript.
Submit an audio or video file for asynchronous transcription. Provide a publicly accessible URL to the media file. Optionally enable audio intelligence features like summarization, sentiment analysis, entity detection, topic detection, content moderation, key phrases, auto chapters, and PII redaction. Returns the transcript object with a status of "queued"; poll using the **Get Transcript** tool to check for completion.
Retrieve a transcript by its ID. Returns the full transcript object including text, words with timestamps, speaker labels, and any enabled audio intelligence results (summary, sentiment, entities, topics, chapters, content safety, key phrases). Use this to poll for completion after submitting a transcription, or to retrieve results of a completed transcript.
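The polling pattern mentioned above can be sketched as a small loop. `wait_for_transcript` and `get_transcript` are hypothetical names, with `get_transcript` standing in for a call to this tool. The terminal statuses `"completed"` and `"error"` and the `"error"` message field are assumptions based on AssemblyAI's documented transcript statuses, not something this description spells out. The fake getter at the bottom only simulates a job moving through its states so the example runs offline.

```python
import time
from typing import Any, Callable, Dict, List

Transcript = Dict[str, Any]


def wait_for_transcript(
    get_transcript: Callable[[str], Transcript],
    transcript_id: str,
    interval_s: float = 3.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Transcript:
    """Poll until the transcript completes, fails, or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        transcript = get_transcript(transcript_id)
        status = transcript.get("status")
        if status == "completed":
            return transcript
        if status == "error":  # assumed failure status
            message = transcript.get("error", "unknown error")
            raise RuntimeError(f"transcription failed: {message}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"transcript {transcript_id} still {status!r}")
        sleep(interval_s)  # still queued or processing; wait before retrying


# Offline stand-in: the job is queued, then processing, then completed.
_statuses = iter(["queued", "processing", "completed"])
calls: List[str] = []


def fake_get(transcript_id: str) -> Transcript:
    calls.append(transcript_id)
    return {"id": transcript_id, "status": next(_statuses)}


result = wait_for_transcript(fake_get, "abc123", sleep=lambda _s: None)
```

Passing `sleep` as a parameter keeps the loop testable without real delays. In practice the default interval and timeout would be tuned to the length of the audio being transcribed.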
Generate a temporary authentication token for use with AssemblyAI's real-time streaming speech-to-text WebSocket API. Use this to securely authenticate client-side streaming without exposing your main API key. Each token is single-use and valid for one streaming session.
Apply a large language model to one or more transcripts using AssemblyAI's LeMUR framework. Submit a custom prompt along with transcript IDs or raw text input, and receive an LLM-generated response. Use this for summarizing transcripts, extracting insights, answering questions about audio content, generating action items, or any custom analysis task. Supports multiple LLM providers including Claude, GPT, and Gemini models.
Retrieve the URL for a PII-redacted audio file. The original transcription must have been submitted with PII audio redaction enabled (`redactPiiAudio: true`). The redacted audio has personally identifiable information "beeped" out.