Model marketplace
One market for leading AI models
gpt-5.4
Modality: Text
GPT-5.4, released by OpenAI on March 5, 2026, is the first general-purpose large model with "native computer control capabilities." It integrates reasoning, coding, and agent workflows, supports 1 million tokens of context, and can plan and execute long-running tasks.
Input: $5 / 1M
Output: $25 / 1M
Cache write: $0.25 / 1M
Cache read: $0.25 / 1M
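As a rough illustration of how per-million pricing translates into a per-request cost, the sketch below uses the gpt-5.4 prices listed above. It assumes the four categories are billed additively and that "1M" means one million tokens; the `estimate_cost` helper and the example token counts are hypothetical and not part of any marketplace API.

```python
from decimal import Decimal

# Prices in USD per 1M tokens, copied from the gpt-5.4 listing above.
PRICES_PER_MILLION = {
    "input": Decimal("5"),
    "output": Decimal("25"),
    "cache_write": Decimal("0.25"),
    "cache_read": Decimal("0.25"),
}


def estimate_cost(tokens, prices=PRICES_PER_MILLION):
    """Estimate the USD cost of a request from token counts per category.

    Assumption: each category is billed independently and the charges add up.
    """
    million = Decimal(1_000_000)
    return sum(
        (prices[kind] * Decimal(count) / million for kind, count in tokens.items()),
        Decimal(0),
    )


# Hypothetical request: 200k input tokens, 10k output tokens, 50k cache reads.
usage = {"input": 200_000, "output": 10_000, "cache_read": 50_000}
cost = estimate_cost(usage)
print(f"Estimated cost: ${cost}")
```

Using `Decimal` rather than floats avoids rounding drift when many small charges are summed. Passing a different price dictionary reuses the same helper for the other token-priced models in this list.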
gpt-image-2
Modality: Image
GPT-Image-2 is a next-generation image generation model launched by OpenAI in April 2026 under the official name ChatGPT Images 2.0. It is positioned as OpenAI's first image model with native "thinking" (reasoning) capabilities, marking the evolution of AI image generation from a simple "rendering tool" to a "visual partner" capable of logical planning.
Input: $7.5 / 1M
Output: $28 / 1M
Cache write: $1.8 / 1M
Cache read: $1.8 / 1M
qwen3.7-plus
Modality: Text
Qwen 3.7-Plus is a new-generation multimodal agent model released by Alibaba Cloud's Tongyi Qianwen on June 1, 2026. It natively supports a 1-million-token context, handles text, image, video, and screen input in a unified way, and excels at GUI operation, visual coding, and multi-tool collaboration. With a ScreenSpot Pro score of 79.0, it balances performance and cost, making it well suited to enterprise-level automated workflows.
Input: $0.8 / 1M
Output: $3 / 1M
Cache write: $0.06 / 1M
Cache read: $0.06 / 1M
gpt-5.5-pro
Modality: Text
GPT-5.5 pro uses more compute to think more deeply and so consistently delivers higher-quality answers. It is designed for complex problems, some of which may take several minutes to complete.
Input: $55 / 1M
Output: $260 / 1M
Cache write: $5 / 1M
Cache read: $5 / 1M
doubao-seedance-2-0-260128
Modality: Video
Doubao-seedance-2-0-260128 is the standard (non-Fast) version of the Seedance 2.0 video generation model family, developed by ByteDance's Seed team and served through the Doubao and Volcano Engine platforms. The model uses a unified multimodal audio and video generation architecture and accepts text, images, audio, and video as input references. Through diffusion-based generation and multimodal alignment, it produces highly consistent video content. Compared to the Fast version, it is optimized for image quality, motion stability, detail consistency, and prompt adherence, making it better suited to scenarios that demand high-quality output.
Input: $6.57 / 1M
Output: $6.57 / 1M
Cache write: Pricing pending
Cache read: Pricing pending
claude-opus-4-8
Modality: Text
Claude Opus 4.8 is Anthropic's flagship multimodal understanding model, officially released on May 28, 2026. It leads the field with a SWE-Bench Pro score of 69.2% and introduces a new dynamic workflow that lets hundreds of agents work in parallel. It also offers a 75% reduction in code defect rate, adjustable thinking effort, and a 2.5x speed-up mode, all at an unchanged price.
Input: $5 / 1M
Output: $25 / 1M
Cache write: $0.5 / 1M
Cache read: $0.5 / 1M
DeepSeek-V4-pro
Modality: Text
DeepSeek-V4-Pro is DeepSeek's flagship Mixture-of-Experts (MoE) model. It uses an efficient architecture with 1.6T total parameters and 49B activated parameters, and natively supports an ultra-long context window of 1 million tokens. The model features a dedicated reasoning mode and achieves industry-leading performance in three core areas: complex logical reasoning, professional code generation, and agent execution. Its overall performance rivals that of top-tier closed-source models.
Input: $1.55 / 1M
Output: $3.1 / 1M
Cache write: $0.13 / 1M
Cache read: $0.13 / 1M
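To compare the token-priced text models listed so far, one option is to rank them by a blended price per 1M tokens. The sketch below copies the input and output prices from the listings above; the 3:1 input-to-output weighting and the `blended_price` helper are illustrative assumptions, not a marketplace feature, and a real workload may have a very different ratio.

```python
# (input, output) prices in USD per 1M tokens, copied from the listings above.
TEXT_MODEL_PRICES = {
    "gpt-5.4": (5.0, 25.0),
    "qwen3.7-plus": (0.8, 3.0),
    "gpt-5.5-pro": (55.0, 260.0),
    "claude-opus-4-8": (5.0, 25.0),
    "DeepSeek-V4-pro": (1.55, 3.1),
}


def blended_price(input_price, output_price, input_share=0.75):
    """Weighted average price per 1M tokens.

    Assumption: input_share is the fraction of all tokens that are input
    tokens (0.75 corresponds to a 3:1 input-to-output ratio).
    """
    return input_price * input_share + output_price * (1.0 - input_share)


# Sort model names from cheapest to most expensive blended price.
ranked = sorted(
    TEXT_MODEL_PRICES.items(),
    key=lambda item: blended_price(*item[1]),
)

for name, (input_price, output_price) in ranked:
    print(f"{name}: ${blended_price(input_price, output_price):.2f} / 1M blended")
```

The ranking ignores cache pricing and says nothing about quality, so it is only a starting point for narrowing down candidates by cost.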
doubao-seedream-4-5-251128
Modality: Image
Seedream 4.5 is Doubao's latest multimodal image model. It integrates text-to-image, image-to-image, and image fusion capabilities and incorporates common-sense knowledge and reasoning. Compared to its predecessor, Seedream 4.0, it offers significantly improved generation quality, more consistent editing and multi-image fusion, more precise control over image details, more natural rendering of small text and faces, more harmonious layout and color, and better overall aesthetics.
Per request: $0.04