Google CEO Sundar Pichai speaks at the Google I/O developer conference.
Andrej Sokolow | picture alliance | Getty Images
Tuesday’s announcements follow similar events held by its AI competitors. Earlier this month, Amazon-backed Anthropic announced its first enterprise offering and a free iPhone app. Meanwhile, OpenAI on Monday launched a new AI model and a desktop version of ChatGPT, along with a new user interface.
This is what Google announced.
Google introduced updates to Gemini 1.5 Pro, its artificial intelligence model, which will soon be able to handle even more data; for example, the tool can summarize 1,500 pages of text uploaded by a user.
There’s also a new Gemini 1.5 Flash AI model, which the company says is more cost-effective and designed for smaller tasks like quickly summarizing conversations, captioning images and videos, and extracting data from large documents.
Google CEO Sundar Pichai highlighted improvements in Gemini translations, adding that it will be available to all developers worldwide in 35 languages. Within Gmail, Gemini 1.5 Pro will analyze PDF files and video attachments, providing summaries and more, Pichai said. That means if you missed a long email thread while on vacation, Gemini will be able to summarize it along with any attachments.
The new Gemini updates are also useful for searching Gmail. An example the company gave: If you’ve been comparing quotes from different contractors to fix your roof and want a summary to help you decide whom to hire, Gemini could return three quotes along with the anticipated start dates offered in the different email threads.
Google said Gemini will eventually replace Google Assistant on Android phones, suggesting it will be a more powerful competitor to Apple’s Siri on the iPhone.
Google announced Veo, its latest model for generating high-definition video, and Imagen 3, its highest-quality text-to-image model, promising lifelike images and “fewer distracting visual artifacts than our previous models.”
The tools will be available to select creators on Monday and will come to Vertex AI, Google’s machine learning platform that allows developers to train and deploy AI applications.
The company also introduced “Audio Overviews,” the ability to generate audio discussions based on text input. For example, if a user uploads a lesson plan, the chatbot can summarize it; or, if a user asks for a real-life example of a science concept, it can respond through interactive audio.
Separately, the company also introduced “Music AI Sandbox,” a suite of generative AI tools for creating music and sounds from scratch based on user prompts.
However, generative AI tools such as chatbots and image generators still suffer from accuracy issues.
Google search chief Prabhakar Raghavan told employees last month that competitors “may have a new gizmo that people like to play with, but they still come to Google to verify what they see there because it is the trusted source, and it becomes more critical in this era of generative AI.”
Earlier this year, Google introduced a Gemini-powered image generator. After users discovered historical inaccuracies that went viral online, the company pulled the feature and said it would relaunch it in the coming weeks. The feature has not yet been relaunched.
The tech giant will launch “AI Overviews” in Google Search on Monday in the U.S. AI Overviews show a quick summary of answers to the most complex search questions, according to Liz Reid, head of Google Search. For example, if a user searches for the best way to clean leather boots, the results page may display an “AI Overview” at the top with a multistep cleaning process, synthesized from information gathered across the web.
The company said it plans to introduce assistant-like planning capabilities directly within Search. It explained that users will be able to search for something like “Create a 3-day meal plan for a group that’s easy to prepare” and get a starting point with a wide range of recipes from around the web.
Regarding its progress on “multimodality,” or integrating more images and video into generative AI tools, Google said it will begin testing the ability for users to ask questions through video — for example, filming a problem with a product they own, uploading the clip and asking the search engine to diagnose the issue. In one example, Google showed someone filming a broken record player while asking why it wasn’t working. Google Search identified the turntable model and suggested it might be malfunctioning because the tonearm was not properly balanced.
Another new feature being tested is called “AI Teammate,” which will be integrated into a user’s Google Workspace. It can build a searchable collection of work from messages and email threads, along with PDF files and documents. For example, a prospective founder might ask their AI teammate, “Are we ready to launch?” and the assistant will provide an analysis and summary based on information it can access in Gmail, Google Docs and other Workspace apps.
Project Astra is Google’s latest push toward a universal AI assistant, developed by its DeepMind AI unit. It’s just a prototype for now, but you can think of it as Google’s goal of building its own version of JARVIS, Tony Stark’s all-knowing AI assistant from the Marvel Universe.
In the demo video presented at Google I/O, the assistant — working through video and audio rather than a chatbot interface — was able to help the user remember where they had left their glasses, review code, and answer questions about what a specific part of a speaker is called when the speaker was shown on camera.
Google said that a truly useful chatbot should let users “talk to it naturally and without lag or delay.” The conversation in the demo video took place in real time, with no lag. The demo followed OpenAI’s presentation Monday of a similar audio conversation with ChatGPT.
DeepMind CEO Demis Hassabis said on stage that “reducing the response time to something conversational is a difficult engineering challenge.”
Pichai said he expects Project Astra to launch on Gemini later this year.
Google also announced Trillium, its sixth-generation TPU or Tensor Processing Unit, an integral piece of hardware for running complex AI operations, which will be available to cloud customers in late 2024.
TPUs are not intended to compete with other chips, such as Nvidia’s graphics processing units. Pichai noted during I/O, for example, that Google Cloud will begin offering Nvidia’s Blackwell GPUs in early 2025.
Nvidia said in March that Google will use the Blackwell platform for “various internal deployments and will be one of the first cloud providers to offer Blackwell-powered instances,” and that access to Nvidia’s systems will help Google deliver tools at scale for enterprises and for developers building large language models.
In his speech, Pichai highlighted Google’s “long partnership with Nvidia.” The companies have been working together for more than a decade, and Pichai has said in the past that he expects them to continue doing so a decade from now.