
Elon Musk’s xAI launches Grok 1.5 vision AI model


Elon Musk’s artificial intelligence (AI) company xAI has launched a new AI model called Grok 1.5 Vision. This large language model (LLM) is an enhanced version of the recently released Grok 1.5. With the upgrade, the model gains computer vision, allowing it to accept visual media as input: it can process images and answer questions about them. Notably, the announcement comes just days after OpenAI launched its own computer vision-powered GPT-4 model.

The news was announced by xAI’s official X (formerly Twitter) account. The company also shared a blog post detailing the new AI model and some of its benchmark scores. Because the visual capabilities were added on top of the recently unveiled Grok 1.5 model, most details remain unchanged. It has the same context window of 128,000 tokens, and its general benchmark scores are likely to remain the same as well.

xAI also shared benchmark scores for Grok 1.5 Vision, including results on a benchmark developed by the company itself. xAI calls it RealWorldQA, and it measures “real-world spatial understanding.” The model was also tested on several other benchmarks, such as MMMU, MathVista, and ChartQA. Although Grok performs better than OpenAI’s GPT-4 with Vision and Gemini 1.5 Pro on RealWorldQA, it scores lower on MMMU and ChartQA.

For those unfamiliar, computer vision is a branch of computer science that focuses on enabling computers (and artificial intelligence models) to use images and videos to recognize and understand objects in the real world. Its purpose is to help computers observe and process visual signals the way humans do. With the rise of multimodal AI models, many companies are now focused on developing vision-capable models. Google’s Gemini 1.5 Pro and OpenAI’s GPT-4 with Vision both have this capability.

This technology also offers a wide range of applications. Healthify, an Indian calorie tracking and nutritional feedback platform, recently added a feature called Snap, where users can take pictures of food or dishes, and an AI chatbot powered by GPT-4 with Vision suggests how to make the recipe healthier and how much exercise a person would need to do to burn the excess calories. In the future, artificial intelligence models with computer vision could assist in disease diagnosis, building self-driving cars, and more.



Surja, a dedicated blog writer and explorer of diverse topics, holds a Bachelor's degree in Science (B.Sc.). Her articles cover a wide array of subjects, showcasing her versatility and passion for learning. Whether she's decoding scientific phenomena or sharing insights from her explorations, Surja's blogs reflect a commitment to making complex ideas accessible.