Models

Grok 4 Fast is xAI's latest multimodal model with SOTA cost-efficiency and a 2M token context window. It comes in two flavors: non-reasoning and reasoning. Read more about the model on xAI's [news po ...

xAI 1.91M context $0.2/M input tokens $0.5/M output tokens

xAI: Grok 4

Grok 4 is xAI's latest reasoning model with a 256k context window. It supports parallel tool calling, structured outputs, and both image and text inputs. Note that reasoning is not exposed, reasoning ...

xAI 250K context $3/M input tokens $15/M output tokens

MoonshotAI: Kimi K2

Kimi K2 Instruct is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32 billion active per forward pass. It is optimized for ...

Rifx.Online 64K context $0.14/M input tokens $2.49/M output tokens

DeepSeek: DeepSeek V3 0324

DeepSeek V3, a 685B-parameter, mixture-of-experts model, is the latest iteration of the flagship chat model family from the DeepSeek team. It succeeds the DeepSeek V3 m ...

DeepSeek 62.5K context $0.27/M input tokens $1.1/M output tokens

FREE

MoonshotAI: Kimi K2 (free)

Kimi K2 Instruct is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32 billion active per forward pass. It is optimized for ...

Rifx.Online 64K context $0 input tokens $0 output tokens

FREE

DeepSeek: R1 0528 (free)

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (R ...

DeepSeek 160K context $0 input tokens $0 output tokens

FREE

tts-1-1106

Rifx.Online $0 input tokens $0 output tokens $0.01/M request tokens

FREE

FunAudioLLM/CosyVoice2-0.5B

Rifx.Online $0 input tokens $0 output tokens $0.01/M request tokens

FREE

FunAudioLLM/SenseVoiceSmall

FunAudioLLM/SenseVoiceSmall ...

Rifx.Online $0 input tokens $0 output tokens

gpt-4.1-mini

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruct ...

OpenAI 1023.02K context $0.4/M input tokens $1.6/M output tokens

free/gpt-4.1-nano

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, a ...

OpenAI 1023.02K context $0.1/M input tokens $0.4/M output tokens

gpt-4.1