multimodal-model

Google: Gemini Pro Vision 1.0

Google's flagship multimodal model, supporting image and video in text or chat prompts for a text or code response. See the benchmarks and prompting guidelines from [Deepmind](https://deepmind.googl ...

Google | 16K context | $0.5/M input tokens | $1.5/M output tokens | $0.003/M image tokens

Google: Gemini Pro 1.5

Text and image to text

Google's latest multimodal model, supporting image and video in text or chat prompts. Optimized for language tasks including: code generation, text generation, text editing, problem solving...

Google | 1.91M context | $1.25/M input tokens | $5/M output tokens | $0.003/M image tokens

FREE

Google: Gemini Pro 1.5 Experimental

Text and image to text

Google's latest multimodal model, supporting image and video in text or chat prompts. Optimized for language tasks including: code generation, text generation, text editing, problem solving...

Google | 1.91M context | $0 input tokens | $0 output tokens | $0.003/M image tokens
