Unified API for AI Tokens

DeepSeekDeepSeekDeepSeek-V4-pro

chatcodingmultilingualTokenWeb ready

DeepSeek-V4-Pro is DeepSeek's flagship MoE hybrid expert model, employing a highly efficient architecture with 1.6T total parameters and 49B activation parameters, and natively supporting an ultra-long context window of 1 million tokens. The model features a dedicated inference mode, achieving industry-leading performance in three core areas: complex logical reasoning, professional code generation, and intelligent agent execution. Its overall performance rivals that of top-tier closed-source models globally.

Input

$1.55 / 1M

Output

$3.1 / 1M

Cache write

$0.13 / 1M

Cache read

$0.13 / 1M

gpt-5.4

OpenAIgptgpt-5.4

chat，multilingualTokenWeb ready

GPT-5.4, released by OpenAI on March 5, 2026, is the first general-purpose large model with "native computer control capabilities." It integrates inference, encoding, and agent workflows, supports 1 million tokens, and can plan and execute long-term tasks.

Input

$5 / 1M

Output

$25 / 1M

Cache write

$0.25 / 1M

Cache read

$0.25 / 1M

qwen3.7-plus

QwenQwenqwen3.7-plus

chat，coding，multilingualToken

Qwen 3.7-Plus is a new generation of multimodal intelligent agent model officially released by Alibaba Cloud Tongyi Qianwen on June 1, 2026. It natively supports 1 million token contexts, unifies the processing of text, images, video, and screen input, and excels in GUI operation, visual encoding, and multi-tool collaboration. With a ScreenSpot Pro score of 79.0, it achieves a balance between performance and cost, making it an ideal choice for enterprise-level automated workflows.

Input

$0.8 / 1M

Output

$3 / 1M

Cache write

$0.06 / 1M

Cache read

$0.06 / 1M

gpt-5.5-pro

OpenAIgptgpt-5.5-pro

chat，coding，multilingualToken

GPT-5.5 pro leverages greater computing power to perform deeper thinking, thereby consistently delivering higher-quality answers. GPT-5.5 pro is designed to solve complex problems, some of which may take several minutes to complete.

Input

$55 / 1M

Output

$260 / 1M

Cache write

$5 / 1M

Cache read

$5 / 1M

claude-opus-4-8

AnthropicClaudeclaude-opus-4-8

chat，coding，multlingualToken

Claude Opus 4.8 is Anthropic's flagship multimodal understanding model, officially released on May 28, 2026. It leads the world with a SWE-Bench Pro score of 69.2%, features a new dynamic workflow that allows hundreds of agents to work in parallel, a 75% reduction in code defect rate, support for adjustable thought intensity and a 2.5x speed-up mode, and maintains the same price.

Input

$5 / 1M

Output

$25 / 1M

Cache write

$0.5 / 1M

Cache read

$0.5 / 1M

gemini-3.5-flash

GoogleGeminigemini-3.5-flash

gemini-3.5-flash is a high-performance and practical Google template that runs smoothly and is suitable for everyday Q&A, copywriting, and image/text analysis scenarios.

chat，multilingualToken

Input

$1.4 / 1M

Output

$8 / 1M

Cache write

Cache read

Kimi-K2.5

KimiKimiKimi-K2.5

chat，coding，multilingualToken

Kimi-K2.5 is the most versatile model released by Dark Side of the Moon to date, featuring a native multimodal architecture that supports both visual and text input, thinking and non-thinking modes, and dialogue and agent-based tasks.

Input

$0.5 / 1M

Output

$2.5 / 1M

Cache write

$0.05 / 1M

Cache read

$0.05 / 1M