vision

How to Build Your Own OCR Assistant with Streamlit and Llama 3.2-Vision

Rifx.Online
Programming, Technology, Computer Vision
27 Dec, 2024

Learn with example. OCR (Optical Character Recognition) is a tool that helps automate the process of converting images into text. You must have used it on your phone as it is very common no

Multilingual Vision Captioning: A Multi-Model Multimodal Approach to Image and Video Captioning and…

Rifx.Online
Natural Language Processing, Computer Vision, Generative AI
26 Dec, 2024

Using a combination of Meta’s Llama 3.2 11B Vision Instruct, Facebook’s 600M NLLB-200, and LLaVA-Next-Video 7B models to produce multilingual image and video captions, descriptive tags, a

Qwen2-VL: A Vision Language Model That Runs Locally

Rifx.Online
Natural Language Processing, Computer Vision, Technology/Web
15 Dec, 2024

This is an introduction to "Qwen2-VL", a machine learning model that can be used with ailia SDK. You can easily use this model to create AI applications using [ailia SDK](h

Generating structured data from an image with GPT vision and Langchain

Rifx.Online
Programming, Computer Vision, Natural Language Processing
24 Oct, 2024

In today’s world, where visual data is abundant, the ability to extract meaningful information from images is becoming increasingly valuable. Langchain, a powerful framework for building applica