Chatbots

Google: Gemini 2.0 Flash Experimental

Gemini 2.0 Flash offers a significantly faster time to first token (TTFT) compared to Gemini 1.5 Flash, while maintaining quality on par with larger models like [Gemini 1.5 ...

Google 976.56K context $0.2/M input tokens $0.6/M output tokens

My 5 AI Predictions for 2025

Rifx.Online
Autonomous Systems , Chatbots , Predictive Analytics
26 Dec, 2024

And a few non-predictions. Predicting (correctly) the future is challenging. Ask about it, to bring in a widely known pop culture icon, Hanna and Barbera, the creators of the Jetsons, who imagi

Inflatebot: Mag Mell R1 12B

Mag Mell is a merge of pre-trained language models created using mergekit, based on Mistral Nemo. It is a great roleplay and storytelling model which combines the best part ...

Rifx.Online 15.63K context $0.9/M input tokens $0.9/M output tokens

FREE

Google: Gemini 2.0 Flash Thinking Experimental (free)

Text image 2 text

Gemini 2.0 Flash Thinking Mode is an experimental model that's trained to generate the "thinking process" the model goes through as part of its response. As a result, Thinking Mode is capable of stro ...

Google 39.06K context $0 input tokens $0 output tokens

50% OFF

EVA Llama 3.33 70b

EVA Llama 3.33 70b is a roleplay and storywriting specialist model. It is a full-parameter finetune of Llama-3.3-70B-Instruct on mixture of ...

Eva unit 01 16K context $4/M input tokens $6/M output tokens

Sao10K: Llama 3.3 Euryale 70B

Euryale L3.3 70B is a model focused on creative roleplay from Sao10k. It is the successor of Euryale L3 70B v2.2. ...

Rifx.Online 7.81K context $1.5/M input tokens $1.5/M output tokens

What are AI Agents: From Virtual Assistants to Intelligent Decision-Makers

Rifx.Online
Chatbots , Autonomous Systems , Machine Learning
15 Dec, 2024

A Ground-Up Guide to Understanding AI Agents. The recent shift from LLM-powered chatbots to what the field now defines as agentic systems or agentic AI can be summarized with a good old s

70% OFF

nova-micro

Amazon Nova Micro 1.0 is a text-only model that delivers the lowest latency responses in the Amazon Nova family of models at a very low cost. With a context length of 128K tokens and optimized for s ...

Amazon 125K context $0.03/M input tokens $0.14/M output tokens $0.053/K image tokens

gemini-exp-1206

Text image 2 text

Experimental release (December 6, 2024) of Gemini. ...

Google 8K context $4/M input tokens $16/M output tokens

Meta: Llama 3.3 70B Instruct

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned generative model in 70B (text in/text out). The Llama 3.3 instruction tuned text only model is optimiz ...

Meta Llama 128K context $0.13/M input tokens $0.4/M output tokens

Magnum v4 72B

This is a series of models designed to replicate the prose quality of the Claude 3 models, specifically Sonnet(https://openrouter.ai/anthropic/claude-3.5-sonnet) and Opus(https://openrouter.ai/anthro ...

Anthracite org 32K context $1.875/M input tokens $2.25/M output tokens

baichuan3-turbo

Baichuan3-Turbo is an advanced artificial intelligence language model designed to provide users with efficient and intelligent natural language processing solutions. Leveraging the latest deep learni ...

Baichuan 31.25K context $1.7/M input tokens $1.7/M output tokens

baichuan4

Baichuan4 Model Introduction: Baichuan4 is a state-of-the-art artificial intelligence language model designed to enhance natural language understanding and generation capabilities. Built on cutti ...

Baichuan 31.25K context $14.3/M input tokens $14.3/M output tokens

moonshot-v1-8k

Moonshot-v1-8k Model Introduction: Moonshot-v1-8k is a large-scale language model developed by Moonshot AI, known for its exceptional natural language processing capabilities. Utilizing advanced ...

Moonshot 7.81K context $1.9/M input tokens $1.9/M output tokens

Amazon: Nova Micro 1.0

Amazon Nova Micro 1.0 is a text-only model that delivers the lowest latency responses in the Amazon Nova family of models at a very low cost. With a context length of 128K tokens and optimized for sp ...

Amazon 125K context $0.035/M input tokens $0.14/M output tokens $0.053/K image tokens

40% OFF

Claude-3-Haiku-20240307

Text image 2 text

Claude 3 Haiku is Anthropic's fastest and most compact model for near-instant responsiveness. Quick and accurate targeted performance. See the launch announcement and benchmark results [here](https ...

Anthropic 195.31K context $0.5/M input tokens $2.5/M output tokens $0.4/K image tokens

Exploring generative AI suggestions for analogous data color schemes

Rifx.Online
Generative AI , Color Vision , Data Science
05 Dec, 2024

Analogous color harmonies refer to at least three colors that are adjacent to each other on the color wheel. The color scheme can create pleasing color combinations but runs the risk of failing

GPT-4o mini

Text image 2 text

GPT-4o mini is OpenAI's newest model after GPT-4 Omni, supporting both text and image inputs with text outputs. As their most advanced small model, it is many multiples more afford ...

OpenAI 125K context $0.15/M input tokens $0.6/M output tokens $0.007/M image tokens

40% OFF

GPT-4o mini

Text image 2 text

GPT-4o mini is OpenAI's newest model after GPT-4 Omni, supporting both text and image inputs with text outputs. As their most advanced small model, it is many multiples more afford ...

OpenAI 125K context $0.15/M input tokens $0.6/M output tokens $0.007/M image tokens

40% OFF

Claude 3.5 Sonnet-20240620

Text image 2 text

Claude 3.5 Sonnet delivers better-than-Opus capabilities, faster-than-Sonnet speeds, at the same Sonnet prices. Sonnet is particularly good at: Coding: Autonomously writes, edits, and runs code w...

Anthropic 195.31K context $3/M input tokens $15/M output tokens $0.005/M image tokens

MythoMax 13B (extended)

One of the highest performing and most popular fine-tunes of Llama 2 13B, with rich descriptions and roleplay. #merge _These are extended-context endpoints for [MythoMax 13B](/gryphe/mythomax-l2-13b ...

Gryphe 8K context $1.125/M input tokens $1.125/M output tokens

FREE

MythoMax 13B (free)

One of the highest performing and most popular fine-tunes of Llama 2 13B, with rich descriptions and roleplay. #merge _These are extended-context endpoints for [MythoMax 13B](/gryphe/mythomax-l2-13b ...

Gryphe 8K context $0 input tokens $0 output tokens

Google: PaLM 2 Code Chat 32k

PaLM 2 fine-tuned for chatbot conversations that help with code-related questions. ...

Google 31.99K context $1/M input tokens $2/M output tokens

01.AI: Yi Large

The Yi Large model was designed by 01.AI with the following usecases in mind: knowledge search, data classification, human-like chat bots, and customer service. It stands out for its multilingual pr ...

01 ai 32K context $3/M input tokens $3/M output tokens

Mistral Large 2411

This is Mistral AI's flagship model, Mistral Large 2 (version mistral-large-2407). It's a proprietary weights-available model and excels at reasoning, code, JSON, chat, and more. Read the launch anno ...

MistralAI 125K context $2/M input tokens $6/M output tokens

Mistral Large 2407

This is Mistral AI's flagship model, Mistral Large 2 (version mistral-large-2407). It's a proprietary weights-available model and excels at reasoning, code, JSON, chat, and more. Read the launch anno ...

MistralAI 125K context $2/M input tokens $6/M output tokens

Perplexity: Llama 3.1 Sonar 70B

Llama 3.1 Sonar is Perplexity's latest model family. It surpasses their earlier Sonar models in cost-efficiency, speed, and performance. This is a normal offline LLM, but the [online version](/perpl ...

Perplexity 128K context $1/M input tokens $1/M output tokens

Perplexity: Llama 3.1 Sonar 8B

Llama 3.1 Sonar is Perplexity's latest model family. It surpasses their earlier Sonar models in cost-efficiency, speed, and performance. This is a normal offline LLM, but the [online version](/perpl ...

Perplexity 128K context $0.2/M input tokens $0.2/M output tokens

OpenAI: GPT-3.5 Turbo 16k (older v1106)

An older GPT-3.5 Turbo model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. Training data: up to Sep 2021. ...

OpenAI 16K context $1/M input tokens $2/M output tokens

Meta: LlamaGuard 2 8B

This safeguard model has 8B parameters and is based on the Llama 3 family. Just like its predecessor, LlamaGuard 1, it can do both prompt and respons ...

Meta Llama 8K context $0.18/M input tokens $0.18/M output tokens

Mistral Small

Cost-efficient, fast, and reliable option for use cases such as translation, summarization, and sentiment analysis. ...

MistralAI 31.25K context $0.2/M input tokens $0.6/M output tokens

Mistral Tiny

This model is currently powered by Mistral-7B-v0.2, and incorporates a "better" fine-tuning than Mistral 7B, inspired by community work. It's best used for larg ...

MistralAI 31.25K context $0.25/M input tokens $0.25/M output tokens

Google: Gemini Pro 1.0

Google's flagship text generation model. Designed to handle natural language tasks, multiturn text and code chat, and code generation. See the benchmarks and prompting guidelines from [Deepmind](htt ...

Google 31.99K context $0.5/M input tokens $1.5/M output tokens $0.003/M image tokens

Goliath 120B

A large LLM created by combining two fine-tuned Llama 70B models into one 120B model. Combines Xwin and Euryale. Credits to @chargoddard for developing the fr...

Alpindale 6K context $9.375/M input tokens $9.375/M output tokens

Nous: Hermes 3 405B Instruct (free)

Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coher ...

NousResearch 128K context $0 input tokens $0 output tokens

WizardLM-2 7B

WizardLM-2 7B is the smaller variant of Microsoft AI's latest Wizard model. It is the fastest and achieves comparable performance with existing 10x larger open-source leading models. It is a finetune ...

Microsoft Azure 31.25K context $0.055/M input tokens $0.055/M output tokens

AI21: Jamba Instruct

The Jamba-Instruct model, introduced by AI21 Labs, is an instruction-tuned variant of their hybrid SSM-Transformer Jamba model, specifically optimized for enterprise applications. 256K Context Win...

Ai21 250K context $0.5/M input tokens $0.7/M output tokens

Llama 3 Euryale 70B v2.1

Euryale 70B v2.1 is a model focused on creative roleplay from Sao10k. Better prompt adherence. Better anatomy / spatial awareness. Adapts much better to unique and...

Rifx.Online 8K context $0.35/M input tokens $0.4/M output tokens

Cohere: Command

Command is an instruction-following conversational model that performs language tasks with high quality, more reliably and with a longer context than our base generative models. Use of this model is ...

Cohere 4K context $0.95/M input tokens $1.9/M output tokens

Cohere: Command R

Command-R is a 35B parameter model that performs conversational language tasks at a higher quality, more reliably, and with a longer context than previous models. It can be used for complex workflows ...

Cohere 125K context $0.475/M input tokens $1.425/M output tokens

Google: Gemma 2 27B

Gemma 2 27B by Google is an open model built from the same research and technology used to create the Gemini models. Gemma models are well-suited for a variety of text generation ...

Google 8K context $0.27/M input tokens $0.27/M output tokens

OpenAI: ChatGPT-4o

Text image 2 text

Dynamic model continuously updated to the current version of GPT-4o in ChatGPT. Intended for research and evaluation. Note: This model is currently experimental and not suitable fo ...

OpenAI 125K context $5/M input tokens $15/M output tokens $0.007/M image tokens

Anthropic: Claude 3.5 Sonnet (2024-06-20)

Text image 2 text

Claude 3.5 Sonnet delivers better-than-Opus capabilities, faster-than-Sonnet speeds, at the same Sonnet prices. Sonnet is particularly good at: Coding: Autonomously writes, edits, and runs code wi...

Anthropic 195.31K context $3/M input tokens $15/M output tokens $0.005/M image tokens

Anthropic: Claude 3.5 Haiku (2024-10-22)

Claude 3.5 Haiku features enhancements across all skill sets including coding, tool use, and reasoning. As the fastest model in the Anthropic lineup, it offers rapid response times suitable for appli ...

Anthropic 195.31K context $1/M input tokens $5/M output tokens

Anthropic: Claude 3 Opus

Text image 2 text

Claude 3 Opus is Anthropic's most powerful model for highly complex tasks. It boasts top-level performance, intelligence, fluency, and understanding. See the launch announcement and benchmark result ...

Anthropic 195.31K context $15/M input tokens $75/M output tokens $0.024/M image tokens

Anthropic: Claude 3 Haiku

Text image 2 text

Claude 3 Haiku is Anthropic's fastest and most compact model for near-instant responsiveness. Quick and accurate targeted performance. See the launch announcement and benchmark results [here](https: ...

Anthropic 195.31K context $0.25/M input tokens $1.25/M output tokens $0.4/K image tokens

Anthropic: Claude 3.5 Haiku

Claude 3.5 Haiku features enhancements across all skill sets including coding, tool use, and reasoning. As the fastest model in the Anthropic lineup, it offers rapid response times suitable for appli ...

Anthropic 195.31K context $1/M input tokens $5/M output tokens

Anthropic: Claude 3.5 Sonnet

Text image 2 text

Claude 3.5 Sonnet delivers better-than-Opus capabilities, faster-than-Sonnet speeds, at the same Sonnet prices. Sonnet is particularly good at: Coding: Autonomously writes, edits, and runs code wi...

Anthropic 195.31K context $3/M input tokens $15/M output tokens $0.005/M image tokens

AI21: Jamba 1.5 Large

Jamba 1.5 Large is part of AI21's new family of open models, offering superior speed, efficiency, and quality. It features a 256K effective context window, the longest among open models, enabling im ...

Ai21 250K context $2/M input tokens $8/M output tokens

Llama 3.1 Euryale 70B v2.2

Euryale L3.1 70B v2.2 is a model focused on creative roleplay from Sao10k. It is the successor of Euryale L3 70B v2.1. ...

Rifx.Online 8K context $0.35/M input tokens $0.4/M output tokens

Nous: Hermes 3 70B Instruct

Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, much better roleplaying, reasoning ...

NousResearch 128K context $0.4/M input tokens $0.4/M output tokens

Nous: Hermes 3 405B Instruct

Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context cohere ...

NousResearch 128K context $1.79/M input tokens $2.49/M output tokens

OpenAI: GPT-4o-mini

Text image 2 text

GPT-4o mini is OpenAI's newest model after GPT-4 Omni, supporting both text and image inputs with text outputs. As their most advanced small model, it is many multiples more afford ...

OpenAI 125K context $0.15/M input tokens $0.6/M output tokens $0.007/M image tokens

Google: Gemini 1.5 Flash-8B

Text image 2 text

Gemini 1.5 Flash-8B is optimized for speed and efficiency, offering enhanced performance in small prompt tasks like chat, transcription, and translation. With reduced latency, it is highly effective ...

Google 976.56K context $0.037/M input tokens $0.15/M output tokens

Inflection: Inflection 3 Productivity

Inflection 3 Productivity is optimized for following instructions. It is better for tasks requiring JSON output or precise adherence to provided guidelines. For emotional intelligence similar to Pi, ...

Inflection 7.81K context $2.5/M input tokens $10/M output tokens

Inflection: Inflection 3 Pi

Inflection 3 Pi powers Inflection's Pi chatbot, including backstory, emotional intelligence, productivity, and safety. It excels in scenarios like customer support, roleplay, and emo ...

Inflection 7.81K context $2.5/M input tokens $10/M output tokens

Qwen2.5 7B Instruct

Qwen2.5 7B is the latest series of Qwen large language models. Qwen2.5 brings the following improvements upon Qwen2: Significantly more knowledge and has greatly improved capabilities in coding an...

Qwen 128K context $0.27/M input tokens $0.27/M output tokens

Rocinante 12B

Rocinante 12B is designed for engaging storytelling and rich prose. Early testers have reported: expanded vocabulary with unique and expressive word choices; enhanced creativity for vivid narrati...

Thedrummer 32K context $0.25/M input tokens $0.5/M output tokens

Qwen2.5 72B Instruct

Qwen2.5 72B is the latest series of Qwen large language models. Qwen2.5 brings the following improvements upon Qwen2: Significantly more knowledge and has greatly improved capabilities in coding a...

Qwen 128K context $0.35/M input tokens $0.4/M output tokens

LangGraph: The Future of Advanced Multi-Agent Workflows

Rifx.Online
Programming , Machine Learning , Chatbots
25 Nov, 2024

The world of artificial intelligence is evolving rapidly, and tools like LangChain and LangGraph are at the forefront of enabling developers to build intelligent systems efficiently. If you’ve hea

Google: Gemini Experimental 1121 (free)

Text image 2 text

Experimental release (November 21st, 2024) of Gemini. ...

Rifx.Online 8K context $0 input tokens $0 output tokens

Google: LearnLM 1.5 Pro Experimental (free)

Text image 2 text

An experimental version of Gemini 1.5 Pro from Google. ...

Rifx.Online 8K context $0 input tokens $0 output tokens

Mistral Large 2411

Mistral Large 2 2411 is an update of Mistral Large 2 released together with Pixtral Large 2411. It is fluent in English, Fren ...

Rifx.Online 125K context $2/M input tokens $6/M output tokens

ERNIE-Bot-4.0

ERNIE Bot Overview. Key Capabilities and Use Cases: Engages in interactive dialogues, answers questions, and assists with creative tasks. Facilitates efficient information ret...

Ernie bot 4.0 8K context $16.44/M input tokens $16.44/M output tokens

ERNIE-Lite-8K:free

Key Capabilities and Use Cases: Designed for resource-constrained environments like mobile and edge devices. Applicable in smart assistants, voice recognition, and localized pro...

Ernie 8K context $0 input tokens $0 output tokens

ERNIE-Bot-turbo

Developer/Company: Baidu. Overview: ERNIE Bot Turbo is an enhanced version of ERNIE Bot, offering expanded capabilities with support for 7K input + 1K output. It includes system ...

Ernie 8K context $1.65/M input tokens $1.65/M output tokens

ERNIE-4.0-8K

Developer/Company: Baidu Research Key Capabilities & Use Cases: ERNIE-4.0-8K is valuable in natural language processing (NLP), applicable to search engines, intelligent custome ...

Ernie 8K context $5.48/M input tokens $16.44/M output tokens

ERNIE-Tiny-8K

Lightweight Chinese Pre-trained Language Model Developer/Company: Baidu Team Overview: ERNIE-Tiny-8K is a lightweight pre-trained language model designed for Chinese NLP tasks ...

Ernie 8K context $0 input tokens $0 output tokens

The Future of ChatGPT Explained: Everything Will Change in the Next 5 Years

Rifx.Online
Chatbots , Artificial General Intelligence , Reasoners
16 Nov, 2024

This Could Take Artificial Intelligence Really, Really Far… OpenAI has laid out a clear vision for the evolution of ChatGPT, recently unveiling a five-step roadmap to reach what they c

GLM-4 AirX

Basic Information: The "GLM-4-AIRX" is an advanced large language model developed by experts in the field of artificial intelligence. It is renowned for its powerful natural language ...

ChatGLM 7.81K context $1.4/M input tokens $1.4/M output tokens

glm-4-flash

GLM-4-Flash Model Introduction. Key Capabilities and Primary Use Cases: Handles multi-turn dialogues, web searches, and tool calls. Supports long text inference with a context...

ChatGLM 125K context $0.01/M input tokens $0.01/M output tokens

glm-4-plus

GLM-4-Plus Model Introduction. Key Capabilities and Primary Use Cases: Language Understanding: Advanced capabilities in language comprehension, instruction following, and lo...

ChatGLM 125K context $7/M input tokens $7/M output tokens

Sorcererlm 8x22b

SorcererLM is an advanced RP and storytelling model, built as a Low-rank 16-bit LoRA fine-tuned on WizardLM-2-8x22B. Advanced reasoning and emotional intelligence for engaging and im...

Raifle 15.63K context $4.5/M input tokens $4.5/M output tokens

Eva Qwen2.5 32B

A roleplaying/storywriting specialist model, full-parameter finetune of Qwen2.5-32B on mixture of synthetic and natural data. It uses Celeste 70B 0.1 data mixture, greatly expanding it ...

Eva unit 01 31.25K context $0.5/M input tokens $0.5/M output tokens

Unslopnemo 12b

UnslopNemo v4.1 is the latest addition from the creator of Rocinante, designed for adventure writing and role-play scenarios. ...

Thedrummer 31.25K context $0.5/M input tokens $0.5/M output tokens

Anthropic: Claude 3.5 Haiku (2024-10-22)

Claude 3.5 Haiku features enhancements across all skill sets including coding, tool use, and reasoning. As the fastest model in the Anthropic lineup, it offers rapid response times suit ...

Rifx.Online 195.31K context $1/M input tokens $5/M output tokens

Anthropic: Claude 3.5 Haiku

Claude 3.5 Haiku offers enhanced capabilities in speed, coding accuracy, and tool use. Engineered to excel in real-time applications, it delivers quick response times that are ...

Anthropic 195.31K context $1/M input tokens $5/M output tokens

Google Releases Gemma — A Lightweight And Open Source Model

Rifx.Online
Natural Language Processing , Programming , Chatbots
29 Oct, 2024

In just a week, the world has witnessed the most groundbreaking AI advancements from two tech giants. OpenAI introduced its jaw-dropping AI video generator, [Sora](https://readmedium.com/3d1638

Magnum v4 72B

This is a series of models designed to replicate the prose quality of the Claude 3 models, specifically Sonnet and Opus. The model is fine-tuned on top of [Qwen2.5 72B]. ...

Anthracite org 32K context $1.875/M input tokens $2.25/M output tokens

xAI: Grok Beta

Grok Beta is xAI's experimental language model with state-of-the-art reasoning capabilities, best for complex and multi-step use cases. It is the successor of [Grok 2](https://x.ai/blo ...

X ai 128K context $5/M input tokens $15/M output tokens

Qwen2.5 7B Instruct

Qwen2.5 7B is the latest series of Qwen large language models. Qwen2.5 brings the following improvements upon Qwen2: Significantly more knowledge and has greatly improved capabilitie...

Qwen 128K context $0.27/M input tokens $0.27/M output tokens

Inflection: Inflection 3 Pi

Inflection 3 Pi powers Inflection's Pi chatbot, including backstory, emotional intelligence, productivity, and safety. It excels in scenarios like customer support, rol ...

Inflection 7.81K context $2.5/M input tokens $10/M output tokens

Inflection: Inflection 3 Productivity

Inflection 3 Productivity is optimized for following instructions. It is better for tasks requiring JSON output or precise adherence to provided guidelines. For emotional intelligence s ...

Inflection 7.81K context $2.5/M input tokens $10/M output tokens

Google: Gemini 1.5 Flash-8B

Text image 2 text

Gemini 1.5 Flash-8B is optimized for speed and efficiency, offering enhanced performance in small prompt tasks like chat, transcription, and translation. With reduced latency, it is hig ...

Google 976.56K context $0.037/M input tokens $0.15/M output tokens

EVA Qwen2.5 14B

A model specializing in RP and creative writing, this model is based on Qwen2.5-14B, fine-tuned with a mixture of synthetic and natural data. It is trained on 1.5M tokens of role-play ...

Eva unit 01 32K context $0.25/M input tokens $0.5/M output tokens

Rocinante 12B

Rocinante 12B is designed for engaging storytelling and rich prose. Early testers have reported: expanded vocabulary with unique and expressive word choices; enhanced creativity for...

Thedrummer 32K context $0.25/M input tokens $0.5/M output tokens

Meta: Llama 3.2 3B Instruct

Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language processing tasks like dialogue generation, reasoning, and summarization. ...

Meta llama 128K context $0.03/M input tokens $0.05/M output tokens

Meta: Llama 3.2 3B Instruct (free)

Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language processing tasks like dialogue generation, reasoning, and summarization. ...

Rifx.Online 4K context $0 input tokens $0 output tokens

Qwen2.5 72B Instruct

Qwen2.5 72B is the latest series of Qwen large language models. Qwen2.5 brings the following improvements upon Qwen2: Significantly more knowledge and has greatly improved capabiliti...

Qwen 128K context $0.35/M input tokens $0.4/M output tokens

Lumimaid v0.2 8B

Lumimaid v0.2 8B is a finetune of Llama 3.1 8B with a "HUGE step up dataset wise" compared to Lumimaid v0.1. Sloppy chats output were purged. Usage ...

Neversleep 128K context $0.188/M input tokens $1.125/M output tokens

Google: Gemini Flash 8B 1.5 Experimental

Text image 2 text

Gemini 1.5 Flash 8B Experimental is an experimental, 8B parameter version of the Gemini 1.5 Flash model. Usage of Gemini is subject to Google's [Gemini Term ...

Google 976.56K context $0 input tokens $0 output tokens

Llama 3.1 Euryale 70B v2.2

Euryale L3.1 70B v2.2 is a model focused on creative roleplay from Sao10k. It is the successor of Euryale L3 70B v2.1. ...

Sao10k 8K context $0.35/M input tokens $0.4/M output tokens

Nous: Hermes 3 70B Instruct

Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, much better roleplay ...

Nousresearch 128K context $0.4/M input tokens $0.4/M output tokens

Nous: Hermes 3 405B Instruct

Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long ...

Nousresearch 128K context $1.79/M input tokens $2.49/M output tokens

OpenAI: ChatGPT-4o

Text image 2 text

Dynamic model continuously updated to the current version of GPT-4o in ChatGPT. Intended for research and evaluation. Note: This model is currently experimental and n ...

Openai 125K context $5/M input tokens $15/M output tokens $0.007/M image tokens

Perplexity: Llama 3.1 Sonar 405B Online

Llama 3.1 Sonar is Perplexity's latest model family. It surpasses their earlier Sonar models in cost-efficiency, speed, and performance. The model is built upon the Llama 3.1 405B and h ...

Perplexity 124.09K context $5/M input tokens $5/M output tokens $0.005/M request tokens

Llama 3 8B Lunaris

Lunaris 8B is a versatile generalist and roleplaying model based on Llama 3. It's a strategic merge of multiple models, designed to balance creativity with improved logic and general kn ...

Sao10k 8K context $2/M input tokens $2/M output tokens

Mistral Nemo 12B Starcannon

Starcannon 12B is a creative roleplay and story writing model, using nothingiisreal/mn-celeste-12b as a base and [intervitens/mini ...

Aetherwiing 11.72K context $2/M input tokens $2/M output tokens

Perplexity: Llama 3.1 Sonar 70B Online

Llama 3.1 Sonar is Perplexity's latest model family. It surpasses their earlier Sonar models in cost-efficiency, speed, and performance. This is the online version of the [offline chat ...

Perplexity 124.09K context $1/M input tokens $1/M output tokens $0.005/M request tokens

Perplexity: Llama 3.1 Sonar 8B Online

Llama 3.1 Sonar is Perplexity's latest model family. It surpasses their earlier Sonar models in cost-efficiency, speed, and performance. This is the online version of the [offline chat ...

Perplexity 124.09K context $0.2/M input tokens $0.2/M output tokens $0.005/M request tokens

Meta: Llama 3.1 70B Instruct

Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flavors. This 70B instruct-tuned version is optimized for high quality dialogue usecases. It has demonstrate ...

Meta llama 128K context $0.3/M input tokens $0.3/M output tokens

Meta: Llama 3.1 70B Instruct (free)

Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flavors. This 70B instruct-tuned version is optimized for high quality dialogue usecases. It has demonstrate ...

Rifx.Online 8K context $0 input tokens $0 output tokens

Google: Gemma 2 27B

Gemma 2 27B by Google is an open model built from the same research and technology used to create the Gemini models. Gemma models are well-suited for a variety of t ...

Google 8K context $0.27/M input tokens $0.27/M output tokens

Anthropic: Claude 3.5 Sonnet (2024-06-20)

Text image 2 text

Claude 3.5 Sonnet delivers better-than-Opus capabilities, faster-than-Sonnet speeds, at the same Sonnet prices. Sonnet is particularly good at: Coding: Autonomously writes, edits, an...

Anthropic 195.31K context $3/M input tokens $15/M output tokens $0.005/M image tokens

Dolphin 2.9.2 Mixtral 8x22B 🐬

Dolphin 2.9 is designed for instruction following, conversational, and coding. This model is a finetune of Mixtral 8x22B Instruct. It features a 64k ...

Cognitivecomputations 64K context $0.9/M input tokens $0.9/M output tokens

Mistral: Mistral 7B Instruct

A high-performing, industry-standard 7.3B parameter model, with optimizations for speed and context length. *Mistral 7B Instruct has multiple version variants, and this is intended to ...

Mistralai 32K context $0.055/M input tokens $0.055/M output tokens

Mistral: Mistral 7B Instruct (free)

A high-performing, industry-standard 7.3B parameter model, with optimizations for speed and context length. *Mistral 7B Instruct has multiple version variants, and this is intended to ...

Rifx.Online 8K context $0 input tokens $0 output tokens

Phi-3 Mini 128K Instruct (free)

Phi-3 Mini is a powerful 3.8B parameter model designed for advanced language understanding, reasoning, and instruction following. Optimized through supervised fine-tuning and preference ...

Rifx.Online 8K context $0 input tokens $0 output tokens

Phi-3 Medium 128K Instruct (free)

Phi-3 128K Medium is a powerful 14-billion parameter model designed for advanced language understanding, reasoning, and instruction following. Optimized through supervised fine-tuning a ...

Rifx.Online 8K context $0 input tokens $0 output tokens

DeepSeek V2.5

DeepSeek-V2.5 is an upgraded version that combines DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct. The new model integrates the general and coding abilities of the two previous version ...

Deepseek 125K context $0.14/M input tokens $0.28/M output tokens

Google: Gemini Flash 1.5

Text image 2 text

Gemini 1.5 Flash is a foundation model that performs well at a variety of multimodal tasks such as visual understanding, classification, summarization, and creating content from image, ...

Google 976.56K context $0.075/M input tokens $0.3/M output tokens $0.04/K image tokens

WizardLM-2 7B

WizardLM-2 7B is the smaller variant of Microsoft AI's latest Wizard model. It is the fastest and achieves comparable performance with existing 10x larger open-source leading models. It ...

Microsoft 31.25K context $0.055/M input tokens $0.055/M output tokens

WizardLM-2 8x22B

WizardLM-2 8x22B is Microsoft AI's most advanced Wizard model. It demonstrates highly competitive performance compared to leading proprietary models, and it consistently outperforms all ...

Microsoft 64K context $0.5/M input tokens $0.5/M output tokens

Google: Gemini Pro 1.5

Text image 2 text

Google's latest multimodal model, supporting image and video in text or chat prompts. Optimized for language tasks including: Code generation, Text generation, Text editing, Prob...

Google 1.91M context $1.25/M input tokens $5/M output tokens $0.003/M image tokens

Anthropic: Claude 3 Haiku

Text image 2 text

Claude 3 Haiku is Anthropic's fastest and most compact model for near-instant responsiveness. Quick and accurate targeted performance. See the launch announcement and benchmark results ...

Anthropic 195.31K context $0.25/M input tokens $1.25/M output tokens $0.4/K image tokens

Anthropic: Claude 3 Opus

Text image 2 text

Claude 3 Opus is Anthropic's most powerful model for highly complex tasks. It boasts top-level performance, intelligence, fluency, and understanding. See the launch announcement and be ...

Anthropic 195.31K context $15/M input tokens $75/M output tokens $0.024/M image tokens

Anthropic: Claude 3 Sonnet

Text image 2 text

Claude 3 Sonnet is an ideal balance of intelligence and speed for enterprise workloads. Maximum utility at a lower price, dependable, balanced for scaled deployments. See the launch an ...

Anthropic 195.31K context $3/M input tokens $15/M output tokens $0.005/M image tokens

Mistral Tiny

This model is currently powered by Mistral-7B-v0.2, and incorporates a "better" fine-tuning than Mistral 7B, inspired by community work. It's best ...

Mistralai 31.25K context $0.25/M input tokens $0.25/M output tokens

Dolphin 2.6 Mixtral 8x7B 🐬

This is a 16k context fine-tune of Mixtral-8x7b. It excels in coding tasks due to extensive training with coding data and is known for its obedience, although ...

Cognitivecomputations 32K context $0.5/M input tokens $0.5/M output tokens

lzlv 70B

A Mythomax/MLewd_13B-style merge of selected 70B models. A multi-model merge of several LLaMA2 70B finetunes for roleplaying and creative work. The goal was to create a model that combi ...

Lizpreciatior 4K context $0.35/M input tokens $0.4/M output tokens

Toppy M 7B

A wild 7B parameter model that merges several models using the new task_arithmetic merge method from mergekit. List of merged models: NousResearch/Nous-Capybara-7B-V1.9 [HuggingFace...

Undi95 4K context $0.07/M input tokens $0.07/M output tokens

Google: PaLM 2 Chat 32k

PaLM 2 is a language model by Google with improved multilingual, reasoning and coding capabilities. ...

Google 31.99K context $1/M input tokens $2/M output tokens

Google: PaLM 2 Code Chat 32k

PaLM 2 fine-tuned for chatbot conversations that help with code-related questions. ...

Google 31.99K context $1/M input tokens $2/M output tokens

OpenAI: GPT-3.5 Turbo Instruct

This model is a variant of GPT-3.5 Turbo tuned for instructional prompts and omitting chat-related optimizations. Training data: up to Sep 2021. ...

Openai 4K context $1.5/M input tokens $2/M output tokens

ReMM SLERP 13B

A recreation trial of the original MythoMax-L2-B13 but with updated models. #merge ...

Undi95 4K context $1.125/M input tokens $1.125/M output tokens

ReMM SLERP 13B (extended)

A recreation trial of the original MythoMax-L2-B13 but with updated models. #merge _These are extended-context endpoints for ReMM SLERP 13B. They may have ...

Undi95 6K context $1.125/M input tokens $1.125/M output tokens