Free

DeepSeek R1 is here: Performance on par with OpenAI o1, but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active in an inference pass. Fully ...

DeepSeek 160K context $0 input tokens $0 output tokens

FREE

Meta: Llama 3.3 70B Instruct (free)

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned generative model in 70B (text in/text out). The Llama 3.3 instruction tuned text only model is optimiz ...

Meta Llama 128K context $0 input tokens $0 output tokens

FREE

NVIDIA: Llama 3.1 Nemotron 70B Instruct (free)

NVIDIA's Llama 3.1 Nemotron 70B is a language model designed for generating precise and useful responses. Leveraging Llama 3.1 70B architecture and Reinfo ...

NVIDIA 128K context $0 input tokens $0 output tokens

FREE

Qwen: Qwen2.5 VL 72B Instruct (free)

Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and insects. It is also highly capable of analyzing texts, charts, icons, graphics, and layouts within images. ...

Qwen 128K context $0 input tokens $0 output tokens

FREE

Google: Gemini 2.0 Flash Thinking Experimental (free)

Gemini 2.0 Flash Thinking Mode is an experimental model that's trained to generate the "thinking process" the model goes through as part of its response. As a result, Thinking Mode is capable of stro ...

Google 39.06K context $0 input tokens $0 output tokens

FREE

Rogue Rose 103B v0.2 (free)

Rogue Rose demonstrates strong capabilities in roleplaying and storytelling applications, potentially surpassing other models in the 103-120B parameter range. While it occasionally exhibits inconsist ...

Sophosympatheia 4K context $0 input tokens $0 output tokens

FREE

Google: Gemini Pro 2.0 Experimental (free)

Gemini 2.0 Pro Experimental is a bleeding-edge version of the Gemini 2.0 Pro model. Because it's currently experimental, it will be heavily rate-limited by Google. Usage of Gemini is subject to ...

Google 1.91M context $0 input tokens $0 output tokens

FREE

Google: Gemini Flash Lite 2.0 Preview (free)

Gemini Flash Lite 2.0 offers a significantly faster time to first token (TTFT) compared to Gemini Flash 1.5, while maintaining quality on par with larger models like [Gemin ...

Google 976.56K context $0 input tokens $0 output tokens

FREE

DeepSeek: R1 Distill Llama 70B (free)

DeepSeek R1 Distill Llama 70B is a distilled large language model based on Llama-3.3-70B-Instruct, using outputs from DeepSeek R1. The m ...

DeepSeek 128K context $0 input tokens $0 output tokens

FREE

Qwen: Qwen VL Plus (free)

Qwen's Enhanced Large Visual Language Model. Significantly upgraded for detailed recognition capabilities and text recognition abilities, supporting ultra-high pixel resolutions up to millions of pix ...

Qwen 7.32K context $0 input tokens $0 output tokens

FREE

MythoMax 13B (free)

One of the highest performing and most popular fine-tunes of Llama 2 13B, with rich descriptions and roleplay. These are extended-context endpoints for [MythoMax 13B](/gryphe/mythomax-l2-13b ...

Gryphe 8K context $0 input tokens $0 output tokens

FREE

Toppy M 7B (free)

A wild 7B parameter model that merges several models using the new task_arithmetic merge method from mergekit. List of merged models: NousResearch/Nous-Capybara-7B-V1.9, [HuggingFaceH4/zephyr-7b-b...

Undi95 4K context $0 input tokens $0 output tokens

Nous: Hermes 3 405B Instruct (free)

Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coher ...

NousResearch 128K context $0 input tokens $0 output tokens

FREE

Qwen 2 7B Instruct (free)

Qwen2 7B is a transformer-based model that excels in language understanding, multilingual capabilities, coding, mathematics, and reasoning. It features SwiGLU activation, attention QKV bias, and gro ...

Qwen 32K context $0 input tokens $0 output tokens

FREE

Google: Gemma 2 9B (free)

Gemma 2 9B by Google is an advanced, open-source language model that sets a new standard for efficiency and performance in its size class. Designed for a wide variety of tasks, it empowers developer ...

Google 8K context $0 input tokens $0 output tokens

FREE

Google: Gemini Pro 1.5 Experimental

Google's latest multimodal model, supporting image and video in text or chat prompts. Optimized for language tasks including: code generation, text generation, text editing, problem solving...

Google 1.91M context $0 input tokens $0 output tokens $0.003/M image tokens

FREE

Meta: Llama 3.2 11B Vision Instruct (free)

Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed to handle tasks combining visual and textual data. It excels in tasks such as image captioning and visual question answ ...

Meta Llama 128K context $0 input tokens $0 output tokens $0.079/K image tokens

Liquid: LFM 40B MoE (free)

Liquid's 40.3B Mixture of Experts (MoE) model. Liquid Foundation Models (LFMs) are large neural networks built with computational units rooted in dynamic systems. LFMs are general-purpose AI models ...

Liquid 8K context $0 input tokens $0 output tokens

FREE

Meta: Llama 3.2 3B Instruct (free)

Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language processing tasks like dialogue generation, reasoning, and summarization. Designed with ...

Meta Llama 128K context $0 input tokens $0 output tokens

FREE

Meta: Llama 3.2 1B Instruct (free)

Llama 3.2 1B is a 1-billion-parameter language model focused on efficiently performing natural language tasks, such as summarization, dialogue, and multilingual text analysis. Its smaller size allows ...

Meta Llama 128K context $0 input tokens $0 output tokens

Google: Gemini Experimental 1121 (free)

Experimental release (November 21st, 2024) of Gemini. ...

Rifx.Online 8K context $0 input tokens $0 output tokens

Google: LearnLM 1.5 Pro Experimental (free)

An experimental version of Gemini 1.5 Pro from Google. ...

Rifx.Online 8K context $0 input tokens $0 output tokens

ERNIE-Speed-128K

Developer/Company: Baidu Research. Key Capabilities & Use Cases: ERNIE-Speed-128K excels in rapid inference for real-time applications, leveraging enhanced semantic understandin ...

Ernie 128K context $0 input tokens $0 output tokens

ERNIE-Lite-8K:free

Key Capabilities and Use Cases: Designed for resource-constrained environments like mobile and edge devices. Applicable in smart assistants, voice recognition, and localized pro...

Ernie 8K context $0 input tokens $0 output tokens

ERNIE-Tiny-8K

Lightweight Chinese Pre-trained Language Model. Developer/Company: Baidu Team. Overview: ERNIE-Tiny-8K is a lightweight pre-trained language model designed for Chinese NLP tasks ...

Ernie 8K context $0 input tokens $0 output tokens

Meta: Llama 3.2 3B Instruct (free)

Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language processing tasks like dialogue generation, reasoning, and summarization. ...

Rifx.Online 4K context $0 input tokens $0 output tokens

Meta: Llama 3.2 90B Vision Instruct (free)

The Llama 90B Vision model is a top-tier, 90-billion-parameter multimodal model designed for the most challenging visual reasoning and language tasks. It offers unparalleled accuracy in ...

Rifx.Online 4K context $0 input tokens $0 output tokens

Meta: Llama 3.1 70B Instruct (free)

Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flavors. This 70B instruct-tuned version is optimized for high-quality dialogue use cases. It has demonstrate ...

Rifx.Online 8K context $0 input tokens $0 output tokens

Qwen 2 7B Instruct (free)

Qwen2 7B is a transformer-based model that excels in language understanding, multilingual capabilities, coding, mathematics, and reasoning. It features SwiGLU activation, attention QKV ...

Rifx.Online 8K context $0 input tokens $0 output tokens

Google: Gemma 2 9B (free)

Gemma 2 9B by Google is an advanced, open-source language model that sets a new standard for efficiency and performance in its size class. Designed for a wide variety of tasks, it empo ...

Rifx.Online 4K context $0 input tokens $0 output tokens

Mistral: Mistral 7B Instruct (free)

A high-performing, industry-standard 7.3B parameter model, with optimizations for speed and context length. *Mistral 7B Instruct has multiple version variants, and this is intended to ...

Rifx.Online 8K context $0 input tokens $0 output tokens

Phi-3 Mini 128K Instruct (free)

Phi-3 Mini is a powerful 3.8B parameter model designed for advanced language understanding, reasoning, and instruction following. Optimized through supervised fine-tuning and preference ...

Rifx.Online 8K context $0 input tokens $0 output tokens

Phi-3 Medium 128K Instruct (free)

Phi-3 128K Medium is a powerful 14-billion parameter model designed for advanced language understanding, reasoning, and instruction following. Optimized through supervised fine-tuning a ...

Rifx.Online 8K context $0 input tokens $0 output tokens

OpenChat 3.5 7B (free)

OpenChat 7B is a library of open-source language models, fine-tuned with "C-RLFT (Conditioned Reinforcement Learning Fine-Tuning)" - a strategy inspired by offline reinforcement learnin ...

Rifx.Online 8K context $0 input tokens $0 output tokens

Toppy M 7B (free)