Unified API for AI Tokens

DeepSeek / DeepSeek / DeepSeek-V4-pro

chat, coding, multilingual | Token | Web ready

DeepSeek-V4-Pro is DeepSeek's flagship mixture-of-experts (MoE) model. It has 1.6T total parameters, of which 49B are activated, and natively supports a context window of 1 million tokens. The model includes a dedicated reasoning mode and targets three core areas: complex logical reasoning, professional code generation, and agentic task execution. Its overall performance rivals that of top-tier closed-source models.

Input: $1.55 / 1M tokens
Output: $3.1 / 1M tokens
Cache write: $0.13 / 1M tokens
Cache read: $0.13 / 1M tokens
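The per-million-token prices above turn into a per-request cost with simple arithmetic. The sketch below uses the DeepSeek-V4-pro figures from this listing. The function name `estimate_cost` and the four usage categories are hypothetical, and the billing rule is an assumption: the listing does not say how cached and uncached input tokens are split, so each category is simply charged at its own rate.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the DeepSeek-V4-pro listing.
PRICES = {
    "input": Decimal("1.55"),
    "output": Decimal("3.1"),
    "cache_write": Decimal("0.13"),
    "cache_read": Decimal("0.13"),
}

MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens=0, output_tokens=0,
                  cache_write_tokens=0, cache_read_tokens=0,
                  prices=PRICES):
    """Estimate the USD cost of one request.

    Assumption (not stated in the listing): every token category is
    billed independently at its own per-million rate.
    """
    usage = {
        "input": input_tokens,
        "output": output_tokens,
        "cache_write": cache_write_tokens,
        "cache_read": cache_read_tokens,
    }
    total = Decimal(0)
    for category, tokens in usage.items():
        total += Decimal(tokens) * prices[category] / MILLION
    return total


# 20k uncached input tokens, 100k tokens read from cache, 2k output tokens.
cost = estimate_cost(input_tokens=20_000,
                     output_tokens=2_000,
                     cache_read_tokens=100_000)
print(f"${cost:.4f}")  # $0.0502
```

Passing a different `prices` dictionary reuses the same arithmetic for any other token-billed model in this list. `Decimal` is used instead of `float` so that small per-token amounts add up without rounding drift.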

gpt-5.4

OpenAI / gpt / gpt-5.4

chat, multilingual | Token | Web ready

GPT-5.4, released by OpenAI on March 5, 2026, is the first general-purpose large model with "native computer control capabilities." It integrates reasoning, coding, and agent workflows, supports a context of 1 million tokens, and can plan and execute long-running tasks.

Input: $5 / 1M tokens
Output: $25 / 1M tokens
Cache write: $0.25 / 1M tokens
Cache read: $0.25 / 1M tokens

gpt-image-2

Image

OpenAI / gpt / gpt-image-2

GPT-Image-2 is a next-generation image generation model launched by OpenAI in April 2026, officially named ChatGPT Images 2.0. It is positioned as OpenAI's first image model with native "thinking" (reasoning) capabilities, marking a shift in AI image generation from a simple "rendering tool" to a "visual partner" capable of logical planning.

image | Token

Input: $7.5 / 1M tokens
Output: $28 / 1M tokens
Cache write: $1.8 / 1M tokens
Cache read: $1.8 / 1M tokens

qwen3.7-plus

Qwen / Qwen / qwen3.7-plus

chat, coding, multilingual | Token

Qwen 3.7-Plus is a new-generation multimodal agent model officially released by Alibaba Cloud Tongyi Qianwen on June 1, 2026. It natively supports a 1-million-token context, processes text, images, video, and screen input in a unified way, and excels at GUI operation, visual coding, and multi-tool collaboration. It scores 79.0 on ScreenSpot Pro and balances performance against cost, which makes it well suited to enterprise-level automated workflows.

Input: $0.8 / 1M tokens
Output: $3 / 1M tokens
Cache write: $0.06 / 1M tokens
Cache read: $0.06 / 1M tokens

gpt-5.5-pro

OpenAI / gpt / gpt-5.5-pro

chat, coding, multilingual | Token

GPT-5.5 pro uses more compute to think more deeply, which lets it deliver consistently higher-quality answers. It is designed for complex problems, and some requests may take several minutes to complete.

Input: $55 / 1M tokens
Output: $260 / 1M tokens
Cache write: $5 / 1M tokens
Cache read: $5 / 1M tokens

Doubao / Doubao / doubao-seedance-2-0-260128

doubao-seedance-2-0-260128

Video

Doubao-seedance-2-0-260128 is the standard (non-Fast) version of the Seedance 2.0 video creation model family, developed by ByteDance's Seed team and offered on the Doubao and Volcano Engine platforms. The model uses a unified multimodal audio and video generation architecture and accepts text, images, audio, and video as input references. Through diffusion generation and multimodal alignment mechanisms, it produces highly consistent video content. Compared to the Fast version, it is optimized for image quality, motion stability, detail consistency, and prompt adherence, making it the better fit where high-quality output is required.

video | Token

Input: $6.57 / 1M tokens
Output: $6.57 / 1M tokens
Cache write: not listed
Cache read: not listed

claude-opus-4-8

Anthropic / Claude / claude-opus-4-8

chat, coding, multilingual | Token

Claude Opus 4.8 is Anthropic's flagship multimodal understanding model, officially released on May 28, 2026. It leads on SWE-Bench Pro with a score of 69.2% and introduces a new dynamic workflow that allows hundreds of agents to work in parallel. It also offers a 75% reduction in code defect rate, adjustable reasoning effort, and a 2.5x speed-up mode, at an unchanged price.

Input: $5 / 1M tokens
Output: $25 / 1M tokens
Cache write: $0.5 / 1M tokens
Cache read: $0.5 / 1M tokens

Doubao / Doubao / doubao-seedream-4-5-251128

doubao-seedream-4-5-251128

Image

Seedream 4.5 is Doubao's latest multimodal image model. It integrates text-to-image, image-to-image, and image fusion capabilities and incorporates common-sense knowledge and reasoning. Compared to its predecessor, Seedream 4.0, it offers significantly higher generation quality, more consistent editing and multi-image fusion, more precise control over image details, more natural rendering of small text and faces, and more harmonious layout and color.

image | Per request

Per request: $0.04
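Per-request billing is budgeted differently from token billing, since the cost does not depend on prompt length. The sketch below shows the arithmetic under one loudly stated assumption: each API call counts as exactly one billable request at $0.04, regardless of image size or the number of images returned. The listing does not confirm this, and the helper names `batch_cost` and `requests_within_budget` are hypothetical.

```python
from decimal import Decimal

# USD per request, copied from the doubao-seedream-4-5-251128 listing.
PRICE_PER_REQUEST = Decimal("0.04")


def batch_cost(n_requests):
    """Cost of n requests, assuming a flat fee per call (an assumption)."""
    return PRICE_PER_REQUEST * n_requests


def requests_within_budget(budget_usd):
    """Largest whole number of requests that fits in the given budget."""
    return int(Decimal(str(budget_usd)) // PRICE_PER_REQUEST)


print(batch_cost(250))              # 10.00
print(requests_within_budget(10))   # 250
```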