On August 1, Google quietly launched a new version of Gemini 1.5 Pro that quickly made headlines by surpassing OpenAI's GPT-4o in generative AI benchmarks. The new model, labelled as experimental, has become the top performer on recent benchmark leaderboards.
Benchmarking AI Models
OpenAI has been a benchmark leader in generative AI since GPT-3. Its latest model, GPT-4o, along with Anthropic's Claude 3, had dominated most common benchmarks for the past year. One of the key tests, the LMSYS Chatbot Arena, pits models against one another in blind head-to-head battles judged by user votes, which are aggregated into an overall Elo-style score. GPT-4o previously held a score of 1,286, while Claude 3 scored 1,271.
The previous version of Gemini 1.5 Pro scored 1,261. The latest experimental version (Gemini 1.5 Pro 0801), however, achieved a score of 1,300, edging out both competitors at the top of the leaderboard. While benchmark scores give an indication of relative performance, they don't fully capture the range of capabilities or limitations of an AI model.
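To make those numbers concrete: Arena scores are Elo-style ratings, where every user vote in a head-to-head battle nudges the winner's rating up and the loser's down, with the size of the nudge depending on how surprising the result was. The Python sketch below illustrates the idea; it is a simplification rather than LMSYS's actual pipeline (which fits a Bradley-Terry model over all votes at once), and the vote stream shown is hypothetical.

# Illustrative Elo-style updates from pairwise votes. Not LMSYS's
# actual method; the vote data below is hypothetical.

K = 32  # update step size; a common default in Elo systems

def expected_score(r_a: float, r_b: float) -> float:
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update(ratings: dict, winner: str, loser: str) -> None:
    """Shift both ratings toward the observed outcome of one vote."""
    e_w = expected_score(ratings[winner], ratings[loser])
    ratings[winner] += K * (1 - e_w)
    ratings[loser] -= K * (1 - e_w)

# Hypothetical vote stream: each pair is (winner, loser) in one blind battle.
votes = [("gemini-1.5-pro", "gpt-4o"), ("gpt-4o", "claude-3"),
         ("gemini-1.5-pro", "claude-3"), ("gemini-1.5-pro", "gpt-4o")]

ratings = {"gpt-4o": 1286.0, "claude-3": 1271.0, "gemini-1.5-pro": 1261.0}
for winner, loser in votes:
    update(ratings, winner, loser)

for model, r in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{model}: {r:.0f}")

An upset win (a lower-rated model beating a higher-rated one) produces a larger rating shift than an expected win, which is why a newcomer can climb the leaderboard quickly if it keeps beating the incumbents.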
Community Reaction
The AI community has responded enthusiastically to Gemini 1.5 Pro's release. Social media buzz highlighted the model's performance, with some users describing it as "insanely good" and even surpassing GPT-4o. One Redditor wrote that it "blows 4o out of the water," reflecting the excitement surrounding the new model.
Future Considerations
It remains uncertain whether the experimental version of Gemini 1.5 Pro will become the default model. Given its experimental status, the model could still be altered or withdrawn for safety or alignment reasons.
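For readers who want to experiment while the model is available, it can be selected by name in Google AI Studio or through the google-generativeai Python SDK. The snippet below is a minimal sketch: the identifier gemini-1.5-pro-exp-0801 is the name the experimental model launched under in AI Studio, and, per the caveat above, it may be renamed or withdrawn.

# Minimal sketch using the google-generativeai SDK
# (pip install google-generativeai).
import os
import google.generativeai as genai

# Authenticate with an API key from Google AI Studio.
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# Select the experimental model by name; this identifier may change
# or disappear if the release is altered or withdrawn.
model = genai.GenerativeModel("gemini-1.5-pro-exp-0801")

response = model.generate_content("Explain the LMSYS Chatbot Arena in one sentence.")
print(response.text)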