Google's New Gemini AI Model Beats GPT-4o in Benchmark Tests
This is the first time Google has topped the Chatbot Arena leaderboard.
At its Google I/O developer conference in Mountain View, California, on Tuesday, Google unveiled a series of generative artificial intelligence (AI) products, including the Gemini Live assistant, updates to its Android and Workspaces platforms, and a revamped Search product.
Today Google announced groundbreaking new AI technology at Google I/O.
— Lior⚡ (@AlphaSignalAI) May 14, 2024
The 10 most incredible examples:
These announcements are part of Google's broader strategy to reclaim its position as Silicon Valley's AI leader, following Microsoft's surprising partnership with OpenAI in 2022.
Additionally, Google aims to diversify beyond its core advertising business with new devices and AI-powered tools.
Google I/O Developer's Conference
— Zeeshan Ali Khan🇵🇰 (@zshan_ali5) May 15, 2024
has unveiled a whole new universe of AI. For a better understanding of what the conference brings for technology enthusiasts, follow the thread 🧵 (17 posts)
📷 Image credit: @Google & @Reuters #GoogleIO #Google #GoogleAI pic.twitter.com/i5Tp3uR8iY
Emphasizing the importance of AI, Google CEO Sundar Pichai noted that the term "AI" was mentioned 120 times during the event, as counted by Google's AI platform, Gemini.
This flurry of updates follows OpenAI's recent launch of its latest AI system, GPT-4o, which showcased advanced capabilities like reading human expressions via a phone camera and engaging in fluent, even flirtatious, conversations.
Google is clearly intent on demonstrating that its AI tools are equally proficient in this type of "multimodal" understanding.
In a clear demonstration of the competitive "anything you can do, I can do better" mindset, Google strategically previewed its AI systems running on a phone just before OpenAI's announcement.
You can watch recaps from the Google I/O Conference here.
Get ready to #GoogleIO! 🎪🙌
— Google AI (@GoogleAI) May 14, 2024
Tune in to the livestream at 10am PT to hear about the latest launches, news, and AI updates from Google. → https://t.co/UtopzadtTd
During the keynote, Google demonstrated its vision of integrating AI into users' daily lives, showcasing how its AI products can assist with sharing information, interacting with others, finding objects around the house, making schedules, shopping, and using Android devices.
Google aims for its AI to become an integral part of everything users do.
Pichai introduced several new features powered by its latest AI model, Gemini 1.5 Pro.
Gemini 1.5 Pro, with a 1 million token long context window, is now available with Gemini Advanced in 35+ languages. 🎉 #GoogleIO pic.twitter.com/qzYKax4XRT
— Google (@Google) May 14, 2024
One notable feature, called Ask Photos, enables users to search their photo library for specific insights, such as identifying when their daughter learned to swim or recalling their license plate number from saved images.
Pichai also showcased how Gemini 1.5 Pro can summarise recent emails from a child's school by analysing attachments and extracting key points and action items.
Two new Gemini 1.5 models were unveiled: Gemini 1.5 Flash, a lightweight, fast, and cost-efficient model with multimodal capabilities and a 1M-token context window, scoring 78.9% on MMLU compared to Gemini 1.5 Pro's 81.9%; and an updated Gemini 1.5 Pro, whose context window has been doubled to 2M tokens.
Introducing Gemini 1.5 Flash ⚡
— Google (@Google) May 14, 2024
It’s a lighter-weight model, optimized for tasks where low latency and cost matter most. Starting today, developers can use it with up to 1 million tokens in Google AI Studio and Vertex AI. #GoogleIO pic.twitter.com/I1adecF9UT
This new model is available via a waitlist for select developers through the API.
5. Gemini 1.5 Pro is now available for all.
— Lior⚡ (@AlphaSignalAI) May 14, 2024
They also increased the context window to 2 million tokens. This is the equivalent of 15 books, 1.5 million words. pic.twitter.com/JDqZQEsKhO
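The "15 books" figure is a rough conversion. A back-of-the-envelope check, using the common heuristics of roughly 0.75 English words per token and roughly 100,000 words per book (assumptions, not Google's published figures):

```python
# Rough arithmetic behind the "2M tokens ≈ 1.5M words ≈ 15 books" claim.
# Heuristics (assumptions): ~0.75 words per token, ~100,000 words per book.
tokens = 2_000_000
words = tokens * 0.75      # 1,500,000 words
books = words / 100_000    # 15 books
print(f"{words:,.0f} words, about {books:.0f} books")
```

The numbers line up with the tweet's claim under these assumptions.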
Throughout the presentation, Google executives highlighted other capabilities, such as the latest model's ability to "read" a textbook and transform it into an AI lecture with natural-sounding teachers who can answer questions.
Last May, Pichai announced the company's ambitious plan to reimagine all its products through AI.
However, given the risks associated with new generative AI technology, such as the potential for spreading false information, Google was initially cautious about integrating it into its search engine, which serves over two billion users and generated $175 billion in revenue last year.
At the conference, Pichai unveiled how the company's dedicated work on AI has now been incorporated into its search engine.
Starting this week, United States (US) users will experience a new feature, AI Overviews, previously known as the Search Generative Experience (SGE), which was announced at Google I/O 2023.
This feature generates information summaries above traditional search results, and it will soon be available to users worldwide.
By the end of the year, over a billion people are expected to have access to this technology.
Liz Reid, Google's newly installed head of Search, said:
“What we see with generative AI is that Google can do more of the searching for you. It can take a bunch of the hard work out of searching, so you can focus on the parts you want to do to get things done, or on the parts of exploring that you find exciting."
By the end of the year, AI Overviews will come to over a billion people in Google Search.
— Google (@Google) May 14, 2024
With Search in the Gemini era, Google does more of the work for you — from answering your most complex questions to helping you get things done. #GoogleIO pic.twitter.com/ImfL6E4O46
So how does AI Overviews work?
Google's new experience integrates generative AI with search results to provide AI-generated summaries and answers based on live information.
Powered by the Gemini AI model, this enhancement will present AI Overviews for many queries when the system identifies that generative AI could be helpful.
These AI-generated summaries will appear above traditional search results, pushing them further down the page.
Typically, AI Overviews display a few relevant links per query, but the links become fully visible only after the response is expanded.
Google compares AI Overviews to features like Knowledge Panels or Featured Snippets, and they cannot be completely disabled.
However, Google will introduce a "web" filter in Search to bypass AI responses and show only traditional links.
A major concern about Google's AI-enhanced Search is its impact on websites that rely heavily on search traffic.
AI Overviews could intensify publishers' worries about reduced traffic from Google Search, exacerbating challenges in an industry already strained by conflicts with other tech platforms.
On Google, users will encounter longer summaries on various topics, potentially diminishing the need to visit external websites.
Some estimates suggest that websites could lose up to 25% of their traffic over the next few years due to this change, compounding recent declines caused by Search algorithms.
However, Google asserts that links included in AI Overviews receive more clicks than those presented as traditional search results, a claim Reid repeated in a recent blog post.
The company emphasises its commitment to directing traffic to publishers and creators as AI Overviews reach more users.
Reid added:
“We'll continue to focus on sending valuable traffic to publishers and creators.”
Additionally, Google has announced new features to be tested with Labs participants in Search.
These features include options to refine AI Overviews by simplifying language, enabling multi-step reasoning for complex queries, providing planning capabilities, organising search results with AI, and incorporating video as part of search prompts.
Google hints that these developments are just the beginning of its efforts to reimagine Google Search, with more innovations on the horizon.
Google's latest unveiling also includes Gemini Live, a personalised AI assistant poised to revolutionise user interactions.
Powered by Google's advanced Gemini 1.5 Pro model, Gemini Live offers users the ability to engage with a chatbot through voice commands, with responses delivered in natural-sounding voices.
What sets this apart is the chatbot's adaptability, allowing users to interrupt and ask clarifying questions mid-conversation.
Amar Subramanya, Google's vice president of engineering for Gemini experiences, shared insights into the transformative potential of Gemini Live during an interview with Yahoo Finance.
Subramanya revealed his personal utilisation of Gemini Live for brainstorming sessions and idea exchanges, showcasing the assistant's versatility in aiding creative processes.
Early testers have also explored Gemini Live's capabilities, leveraging it for tasks such as translation with promising results.
Looking ahead, Google plans to integrate camera access into Gemini Live, empowering the assistant to interact with real-world environments and objects—a feature reminiscent of OpenAI's GPT-4o demonstrations.
Subramanya recounted a scenario where he tasked the assistant with sourcing a pineapple upside-down cake recipe for a gathering of 15 people and seamlessly adding the ingredients to his Keep shopping list.
The assistant adeptly adjusted a recipe meant for eight individuals, scaled the proportions accordingly, and efficiently compiled the necessary items for Subramanya's convenience.
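The scaling step is simple proportional arithmetic: multiply each quantity by the ratio of target to original servings. A minimal sketch (the function and ingredient names are hypothetical illustrations, not Gemini's implementation):

```python
# Scale a recipe's ingredient quantities from one serving count to another.
def scale_recipe(ingredients: dict, original_servings: int, target_servings: int) -> dict:
    factor = target_servings / original_servings
    return {name: qty * factor for name, qty in ingredients.items()}

# Recipe written for 8 people, scaled for a gathering of 15.
recipe_for_8 = {"flour_g": 250, "sugar_g": 200, "pineapple_rings": 8}
recipe_for_15 = scale_recipe(recipe_for_8, 8, 15)
print(recipe_for_15)  # each quantity multiplied by 15/8
```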
Additionally on the Android front, Google is extending its assistant's reach to popular apps like Google Messages and Gmail, enhancing user productivity by enabling tasks like inserting Gemini-generated images into messages.
Google's Gemini Nano boasts the ability to identify potential phone scammers during conversations.
This feature operates by detecting specific conversation patterns commonly linked with fraudulent activities.
Detecting scams during calls
— Zeeshan Ali Khan🇵🇰 (@zshan_ali5) May 15, 2024
Google previewed a feature that runs entirely on-device and detects the communication patterns of scammers. During a suspected scam call, it will alert users. #GoogleAI #GoogleIO #Google
9/17
Remarkably, all scam detection processing occurs locally on your device, ensuring privacy as conversations remain confined to your phone without being uploaded to the web.
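Conceptually, this kind of detection flags a call when its transcript matches patterns associated with fraud. A toy illustration of the idea (this is not Gemini Nano's actual method, which uses an on-device model rather than hand-written rules):

```python
# Toy scam-pattern matcher: flag a transcript if it contains phrases
# commonly associated with phone scams. Patterns are illustrative only.
import re

SCAM_PATTERNS = [
    r"wire (the )?money",
    r"gift cards?",
    r"your account (is|has been) compromised",
    r"do not tell (the )?bank",
]

def looks_like_scam(transcript: str) -> bool:
    text = transcript.lower()
    return any(re.search(pattern, text) for pattern in SCAM_PATTERNS)
```

Because a check like this needs only the transcript, it can run locally, consistent with Google's point that audio never leaves the phone.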
Google briefly unveiled Project Astra, a creation of its DeepMind AI lab, poised to revolutionise everyday life by harnessing phone cameras to interpret real-world information.
3. Astra, the future of AI assistants.
— Lior⚡ (@AlphaSignalAI) May 14, 2024
It can interact with the world around it by taking in information, remembering what it sees, processing that information and understanding contextual details. pic.twitter.com/WvzHW99CCy
This endeavour promises to identify objects and even locate misplaced items, hinting at a future integration with augmented reality glasses.
Introducing Project Astra, our vision for the future of AI assistants, available to try live at Google I/O. Super proud of what the team has achieved! 👏🎉
— Alexandre Moufarek (@amoufarek) May 14, 2024
You can watch it in action on a phone and glasses below. Each part was captured in a single take, in real time. #GoogleIO pic.twitter.com/KVYmwZ9oVY
Demis Hassabis, chief executive of DeepMind, detailed in a blog post that select capabilities of Project Astra will be accessible to Gemini chatbot users this year.
For a long time, we’ve been working towards a universal AI agent that can be truly helpful in everyday life. Today at #GoogleIO we showed off our latest progress towards this: Project Astra. Here’s a video of our prototype, captured in real time. pic.twitter.com/TSGDJZVslg
— Demis Hassabis (@demishassabis) May 14, 2024
Powered by Gemini, this project offers real-time support across audio, text, video, and image formats.
Despite being presented as a prototype, Astra's potential was showcased through pre-recorded videos, as it remains unavailable to all users.
Early testers noted higher latency and perceived limitations in emotional intelligence and tone compared to GPT-4o.
This is Project Astra in real-time, live and unscripted. Michael shared this with us as the GPT-4o demos were happening 🤯
— Alexandre Moufarek (@amoufarek) May 15, 2024
Fantastic work from Michael and the team, so happy to see it finally shared and congrats to OpenAI on the impressive demos! https://t.co/ETXADCfxtZ
However, Astra exhibits strong text-to-speech capabilities and potentially superior support for ongoing video and long-context interactions.
Next in line for Google is Veo, its latest AI model designed to produce high-definition videos from simple text inputs, akin to OpenAI's Sora system.
#Google's AI model 'Veo' can be used by creators, says James Manyika, Senior VP, Google, on the growing possibilities of using gen AI. The company unveiled Veo as its most advanced video generation model at the Google I/O 2024 conference. @Google @AshmitTejKumar #GoogleIO #AI #Veo pic.twitter.com/2WcOS1YDNN
— CNBC-TV18 (@CNBCTV18News) May 15, 2024
This technology marks a significant advancement in video generation capabilities, promising creators the ability to preview Veo and join a waitlist for access.
Anticipation mounts as Google plans to integrate Veo's functionalities into YouTube Shorts and other platforms later this year.
Veo, developed by Google DeepMind, boasts impressive features:
- It delivers videos in 1080p resolution.
- Videos can extend beyond a minute, offering flexibility in content creation.
- It supports a diverse array of cinematic and visual styles to suit various preferences.
Veo
— Zeeshan Ali Khan🇵🇰 (@zshan_ali5) May 15, 2024
AI models that can create 1080p video clips around a minute long given a text prompt. #GoogleIO #GoogleAI #Google
16/17
This versatile model can animate images or edit videos based on textual prompts, with support for masked editing, enabling targeted modifications within videos.
Google has enhanced Veo's training data by enriching video captions with additional details.
Furthermore, Veo leverages compressed representations of video, known as latents, to enhance performance, generation speed, and efficiency.
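The efficiency gain from latents comes from generating in a compressed space, so the model operates on far fewer values than raw pixels. A rough illustration of the savings, using hypothetical compression factors (Veo's actual figures are not published in this article):

```python
# Why latent-space generation is cheaper: compare raw pixel count with a
# hypothetical latent tensor at 8x spatial and 4x temporal compression.
frames, height, width, channels = 120, 1080, 1920, 3   # raw 1080p clip
latent_shape = (frames // 4, height // 8, width // 8, channels)

raw_elems = frames * height * width * channels
lat_elems = latent_shape[0] * latent_shape[1] * latent_shape[2] * latent_shape[3]
print(raw_elems // lat_elems)  # how many times fewer values to generate
```

Under these assumed factors, the model processes 256x fewer values per clip, which is where the performance and speed gains come from.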
The 2-hour session brimmed with a wealth of product updates and announcements spanning the Google ecosystem, showcasing enhancements across Search, Workspace, Photos, Android, and beyond.
Notably, Imagen 3, their cutting-edge image generation model, will soon debut in multiple iterations tailored for diverse tasks, from rapid sketching to producing high-resolution images.
Gemma 2 and PaliGemma, two new additions to the Gemma family, also mark a significant stride in open-source models.
PaliGemma, Google's inaugural vision-language open-source model, is available now, while Gemma 2, boasting 27 billion parameters, surpasses its predecessor and launches in June.
Furthermore, the unveiling of Lyria, Google's music-generation tool, adds another dimension to their innovative offerings.
With over 15 project launches and product announcements, the event underscores Google's commitment to advancing technology across various domains.
Google products for I/O 2024 (2)
— Zeeshan Ali Khan🇵🇰 (@zshan_ali5) May 15, 2024
> Tensor Processing Units get a performance boost
> AI in search
> Generative AI upgrades
> Project IDX
> Veo
> Circle to Search #GoogleIO #Google #GoogleAI
2/17
In the eyes of analyst Jacob Bourne from Emarketer, the spotlight on AI at this year's Google developer conference comes as no surprise.
He said:
“By showcasing its latest models and how they'll power existing products with strong consumer reach, Google is demonstrating how it can effectively differentiate itself from rivals."
He views the reception of these new tools as a litmus test for Google's ability to adapt its search product to the evolving landscape of generative AI.
He added:
“To maintain its competitive edge and satisfy investors, Google will need to focus on translating its AI innovations into profitable products and services at scale.”
As the company expands its AI endeavours, it pledges to implement additional safeguards to mitigate potential misuse.
Moreover, Google underscores its commitment to refining the capabilities of its new models through partnerships with experts and institutions.
However, while Google has intensified its focus on AI over the past year, it has encountered notable hurdles along the way.
One such setback occurred last year when the introduction of its generative AI tool, initially named Bard and later rebranded as Gemini, led to a drop in the company's share price.
This decline followed a demo video showcasing the tool's production of factually inaccurate responses to inquiries about the James Webb Space Telescope.
More recently, in February, Google faced criticism on social media for Gemini's depiction of historically inaccurate images, predominantly featuring people of colour instead of individuals of White ethnicity.
In response, the company halted Gemini's capability to generate images of people.
Like other AI tools such as ChatGPT, Gemini draws from extensive datasets available online.
However, experts have consistently cautioned against the limitations and potential pitfalls associated with AI technologies, including inaccuracies, biases, and the dissemination of misinformation.
Speaking of rivalry, ChatGPT emerged as a formidable contender in the tech industry upon its release in late 2022, sparking discussions about its potential threat to Google's dominant search engine, the go-to platform for online information retrieval.
OpenAI is about to go after Google search.
— Lior⚡ (@AlphaSignalAI) May 2, 2024
This could be the most serious threat Google has ever faced.
OpenAI's SSL certificate logs now show they created https://t.co/zwb81WWhoc
Microsoft Bing would allegedly power the service.
This shouldn’t be too surprising,… pic.twitter.com/64D97lxFHp
In response, Google embarked on a determined journey to reclaim its supremacy in the realm of AI.
On a positive note, Oppenheimer analyst Jason Helfstein said in a report:
"Relative to OpenAI's limited product demo the day before, we believe Google demonstrated its strong competitive position, driven by an essentially unlimited R&D budget."
Evercore ISI analyst Mark Mahaney also said in a report:
"In our view, Google delivered in this year's I/O against the mounting hype and doubts. From this I/O, we also noticed a greater emphasis from Google on using gen AI to more tightly connect its services into one holistic experience. And, an emphasis on these new innovations being 'Only On Android'."
However, other tech giants are very close behind.
At its Build conference starting 20 May, Microsoft is expected to unveil enhancements to its AI-driven Copilot for the Microsoft 365 productivity suite.
Meanwhile, Apple is gearing up for its WWDC event on 10 June, where it plans to introduce a new iteration of its Siri voice assistant powered by generative AI.
As the battle for AI supremacy intensifies, who will emerge victorious?
It seems that when one company releases a "groundbreaking" innovation, another is right on its tail.
So only time will tell, not so much who will emerge the winner but who will be left behind.