Tools

List Models

Retrieve the full list of available AI/ML models across all categories (text, image, video, speech, embeddings, moderation, etc.). Use this to discover available model IDs before making generation requests.

Generate Embeddings

Generate vector embeddings from text for semantic search, similarity analysis, clustering, and classification. Supports models like text-embedding-3-small (1536 dimensions), text-embedding-3-large (3072 dimensions), and multilingual models. Can embed single strings or batches of text in a single request.

Speech to Text

Transcribe audio from a URL into text using speech-to-text models from OpenAI (Whisper), Deepgram (Nova-2), and Assembly AI. Submits the audio for asynchronous processing and polls until the transcription is ready. Generated transcriptions are stored on the server for 1 hour.

Generate Video

Generate videos from text prompts or reference images using video generation models like MiniMax and Kling AI. Video generation is asynchronous — the tool submits the request and polls for results. Supports text-to-video and image-to-video workflows.

Moderate Content

Classify text or image content as safe or unsafe using Meta's Llama Guard content moderation models. Analyzes input for harmful content and returns a safety classification with hazard categories when unsafe. Supports text, image URLs, and base64-encoded images.

Chat Completion

Generate text responses using 400+ LLM models including GPT, Claude, Gemini, DeepSeek, Llama, and Qwen. Supports system prompts, multi-turn conversations, temperature control, JSON mode, and web search. Use this for text generation, code generation, reasoning, question answering, and conversational AI.

Text to Speech

Convert text into natural-sounding speech audio using models from OpenAI, ElevenLabs, Deepgram, and Microsoft. Supports 120+ languages, multiple voices, adjustable speed, and various audio formats (mp3, opus, aac, flac, wav, pcm). Returns a URL to the generated audio file.

Generate Image

Generate images from text prompts using 70+ image models including Flux, DALL-E, Stable Diffusion, Imagen, and more. Supports configurable resolution, aspect ratio, negative prompts, guidance scale, and seed for reproducibility. Returns URLs to the generated images.