“2001: A Space Odyssey” is over 50 years old now, yet its talking AI, HAL 9000, remains iconic. In contrast, real-world talking AIs like Siri, Alexa, and Google Assistant, while useful, often fall short of seeming truly intelligent. However, this is rapidly changing.
Google is gearing up for its annual I/O conference (livestream starts at 17:00 UTC) and has teased exciting advancements – its Gemini AI has acquired impressive new capabilities.
Gemini is a “multi-modal” AI, meaning it can seamlessly integrate text, audio, and imagery. This enables it to operate on a Pixel phone, using its camera to understand and discuss the environment. Watch the brief demo below:
In case you missed it, OpenAI’s GPT-4o demo showcased similar abilities, with the AI holding real-time conversations about its surroundings. A competitive landscape is emerging, especially as Apple plans to incorporate ChatGPT into iOS 18, while Samsung, Oppo, and OnePlus have adopted Gemini.
What does this mean for Google Assistant? It’s uncertain – Google I/O might mark its transition to Gemini, or they might coexist for a while as Gemini still lacks some functionality (as noted by commenters on the livestream post). However, Google seems poised to announce significant updates to Gemini, potentially filling these gaps.
Another point of interest – the Pixel 8 is expected to gain on-device Gemini Nano (which the Pixel 8 Pro already runs), and the Pixel 9 series will likely bring even greater computational power. How much of the Gemini demo runs on-device versus in the cloud remains an open question.
This year’s Google I/O promises to be unmissable. Beyond AI advancements, Google will share more on Android 15 and likely discuss AI assistant integration, satellite communication, and other innovations that have emerged since I/O 2023.