Google: Gemini Pro Vision 1.0
- 16K Context
- 0.5/M Input Tokens
- 1.5/M Output Tokens
- 0.003/M Image Tokens
- Text and image to text
- 13 Dec, 2023
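The per-million-token rates listed above can be turned into a rough per-request cost estimate. The sketch below is only an illustration: the function name `estimate_cost` and the constant names are hypothetical, the rates are copied from the listing as written, and the listing does not state a currency, so the result is in whatever unit the listing uses. It also assumes image tokens are billed at the listed per-million rate in the same way as text tokens.

```python
# Rates copied from the listing, per one million tokens.
# The currency is not stated in the listing, so results are in the listing's unit.
INPUT_RATE_PER_M = 0.5
OUTPUT_RATE_PER_M = 1.5
IMAGE_RATE_PER_M = 0.003  # assumption: billed per image token, like text tokens

TOKENS_PER_M = 1_000_000


def estimate_cost(input_tokens, output_tokens, image_tokens=0):
    """Return the estimated cost of one request in the listing's price unit."""
    total = (
        input_tokens * INPUT_RATE_PER_M
        + output_tokens * OUTPUT_RATE_PER_M
        + image_tokens * IMAGE_RATE_PER_M
    )
    return total / TOKENS_PER_M
```

For example, `estimate_cost(2000, 500)` charges 2,000 input tokens at 0.5 per million and 500 output tokens at 1.5 per million, which comes to 0.00175 in the listing's unit.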
Google’s flagship multimodal model, supporting image and video in text or chat prompts for a text or code response.
See the benchmarks and prompting guidelines from DeepMind.
Usage of Gemini is subject to Google’s Gemini Terms of Use.
#multimodal