A new AI model developed by researchers at Stanford University demonstrates a significant leap in multimodal reasoning, capable of analyzing and answering complex questions based on both text and images. The system, named ‘M3’, was trained on a massive dataset of paired text and visual information, allowing it to perform tasks like interpreting charts, answering questions about photographs, and generating descriptive captions. Initial benchmarks show M3 outperforming previous state-of-the-art models on several standardized tests for AI comprehension. The researchers emphasize the model’s potential applications in education, accessibility tools, and scientific research, while also noting the ongoing challenges of mitigating biases present in the training data. Read the full article at: https://technologyreview.com/2024/05/15/ai-model-multimodal-reasoning-breakthrough