Models

A wild 7B parameter model that merges several models using the new task_arithmetic merge method from mergekit. List of merged models: NousResearch/Nous-Capybara-7B-V1.9 [HuggingFaceH4/zephyr-7b-b...

Toppy M 7B
Undi95
4K context $0.07/M input tokens $0.07/M output tokens

A recreation trial of the original MythoMax-L2-13B but with updated models. #merge ...

ReMM SLERP 13B
Undi95
4K context $1.125/M input tokens $1.125/M output tokens

The first image-to-text model from Mistral AI. Its weights were released via torrent, per their tradition: https://x.com/mistralai/status/1833758285167722836 ...

Mistral: Pixtral 12B
MistralAI
4K context $0.1/M input tokens $0.1/M output tokens $0.144/K image tokens

Phi-3.5 models are lightweight, state-of-the-art open models. These models were trained with Phi-3 datasets that include both synthetic data and filtered, publicly available website data, with a ...

Phi-3.5 Mini 128K Instruct
Microsoft Azure
125K context $0.1/M input tokens $0.1/M output tokens

Dynamic model continuously updated to the current version of GPT-4o in ChatGPT. Intended for research and evaluation. Note: This model is currently experimental and not suitable fo ...

OpenAI: ChatGPT-4o
OpenAI
125K context $5/M input tokens $15/M output tokens $0.007/M image tokens

Claude 3.5 Sonnet delivers better-than-Opus capabilities, faster-than-Sonnet speeds, at the same Sonnet prices. Sonnet is particularly good at: Coding: Autonomously writes, edits, and runs code wi...

Anthropic: Claude 3.5 Sonnet (2024-06-20)
Anthropic
195.31K context $3/M input tokens $15/M output tokens $0.005/M image tokens

Llama 3.2 1B is a 1-billion-parameter language model focused on efficiently performing natural language tasks, such as summarization, dialogue, and multilingual text analysis. Its smaller size allows ...

Meta: Llama 3.2 1B Instruct
Meta Llama
128K context $0.01/M input tokens $0.02/M output tokens

QwQ-32B-Preview is an experimental research model developed by the Qwen Team, focused on advancing AI reasoning capabilities. As a preview release, it demonstrates promising anal ...

Qwen: QwQ 32B Preview
Qwen
32K context $0.15/M input tokens $0.6/M output tokens

Gemini 2.0 Flash offers a significantly faster time to first token (TTFT) compared to Gemini 1.5 Flash, while maintaining quality on par with larger models like [Gemini 1.5 ...

Google: Gemini 2.0 Flash Experimental
Google
976.56K context $0.2/M input tokens $0.6/M output tokens

Gemini 2.0 Flash offers a significantly faster time to first token (TTFT) compared to Gemini 1.5 Flash, while maintaining quality on par with larger models like [Gemini 1.5 ...

Google: Gemini 2.0 Flash Experimental (free)
Google
976.56K context $0 input tokens $0 output tokens

DeepSeek R1 is here: Performance on par with OpenAI o1, but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active in an inference pass. Fully ...

DeepSeek: R1 (free)
DeepSeek
160K context $0 input tokens $0 output tokens

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned generative model in 70B (text in/text out). The Llama 3.3 instruction tuned text only model is optimiz ...

Meta: Llama 3.3 70B Instruct (free)
Meta Llama
128K context $0 input tokens $0 output tokens

NVIDIA's Llama 3.1 Nemotron 70B is a language model designed for generating precise and useful responses. Leveraging Llama 3.1 70B architecture and Reinfo ...

NVIDIA: Llama 3.1 Nemotron 70B Instruct (free)
NVIDIA
128K context $0 input tokens $0 output tokens

Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and insects. It is also highly capable of analyzing texts, charts, icons, graphics, and layouts within images. ...

Qwen: Qwen2.5 VL 72B Instruct (free)
Qwen
128K context $0 input tokens $0 output tokens

Gemini 2.0 Flash Thinking Mode is an experimental model that's trained to generate the "thinking process" the model goes through as part of its response. As a result, Thinking Mode is capable of stro ...

Google: Gemini 2.0 Flash Thinking Experimental (free)
Google
39.06K context $0 input tokens $0 output tokens