
Models

Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Thinking variant enhances reasoning in STEM, math, and complex ...

Qwen: Qwen3 VL 30B A3B Thinking
Qwen
256K context · $0.3/M input tokens · $1/M output tokens
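The per-million-token rates listed for each model can be turned into a per-request cost estimate. A minimal sketch (the function name and example token counts are illustrative, not part of any provider API):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_rate: float, output_rate: float) -> float:
    """Estimate request cost in dollars, given rates in $ per million tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# Using the Qwen3 VL 30B A3B Thinking rates above: $0.3/M input, $1/M output.
cost = request_cost(input_tokens=12_000, output_tokens=2_000,
                    input_rate=0.3, output_rate=1.0)
print(f"${cost:.4f}")  # prints $0.0056
```

The same formula applies to every card below; image-token rates (where listed) would add a third term with the appropriate per-K or per-M denominator.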

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and bette ...

Google: Gemini 2.5 Flash Lite
Google
1M context · $0.1/M input tokens · $0.4/M output tokens

GPT-5 is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that require step-by-step reasoning, instruction ...

OpenAI: GPT-5
OpenAI
390.63K context · $1.25/M input tokens · $10/M output tokens

Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-world agents and coding workflows. It delivers state-of-the-art performance on coding benchmarks such as SWE-be ...

Anthropic: Claude Sonnet 4.5
Anthropic
976.56K context · $3/M input tokens · $15/M output tokens

DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an intermediate step between V3.1 and future architectures. It introduces DeepSeek Sparse Attention (DSA), a fine-gra ...

DeepSeek: DeepSeek V3.2 Exp
DeepSeek
160K context · $0.27/M input tokens · $0.4/M output tokens

DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prompt templates. It extends the DeepSeek-V3 base with a two-phase ...

DeepSeek: DeepSeek V3.1 (free)
DeepSeek
159.96K context · $0 input tokens · $0 output tokens

Gemini 2.5 Flash Image Preview, a.k.a. "Nano Banana," is a state-of-the-art image generation model with contextual understanding. It is capable of image generation, edits, and multi-turn conversation ...

Google: Gemini 2.5 Flash Image Preview (Nano Banana)
Google
32K context · $0.3/M input tokens · $2.5/M output tokens · $0.001/M image tokens

Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is optimized for agentic coding tasks such as function calling, tool use, and long-co ...

Qwen: Qwen3 Coder 480B A35B
Qwen
256K context · $0.22/M input tokens · $0.95/M output tokens

Qwen3-Max is an updated release built on the Qwen3 series, offering major improvements in reasoning, instruction following, multilingual support, and long-tail knowledge coverage compared to the Janu ...

Qwen: Qwen3 Max
Qwen
250K context · $1.2/M input tokens · $6/M output tokens

GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, and ultra-low latency environments. While limited in reasoning depth compared to ...

OpenAI: GPT-5 Nano
OpenAI
390.63K context · $0.05/M input tokens · $0.4/M output tokens

GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning tasks. It provides the same instruction-following and safety-tuning benefits as GPT-5, but with reduced latency a ...

OpenAI: GPT-5 Mini
OpenAI
390.63K context · $0.25/M input tokens · $2/M output tokens

Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by Meta, activating 17 billion parameters out of a total of 109B. It supports native multimodal input (text and ...

Meta: Llama 4 Scout
Rifx.Online
320K context · $0.08/M input tokens · $0.3/M output tokens · $0.334/K image tokens

Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-experts (MoE) architecture with 128 experts and 17 billion active parameters per for ...

Meta: Llama 4 Maverick
Rifx.Online
1M context · $0.17/M input tokens · $0.85/M output tokens · $0.668/K image tokens

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context. It’s post-trained for agentic workflows (RAG ...

NVIDIA: Llama 3.3 Nemotron Super 49B V1.5
NVIDIA
128K context · $0.1/M input tokens · $0.4/M output tokens

Llama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM inputs ...

Meta: Llama Guard 4 12B
Rifx.Online
160K context · $0.05/M input tokens · $0.05/M output tokens