
OpenAI Unveils Real-Time Video Analysis Feature for ChatGPT

OpenAI has unveiled video capabilities for ChatGPT, letting users point their phone cameras at objects for real-time AI analysis. The feature, first demonstrated in May, allows ChatGPT to observe users and respond interactively in real time. In tests, the mode has handled tasks such as solving math problems, suggesting recipes, narrating stories, and even engaging in interactive play with children.

This development comes on the heels of Google’s demonstration of its AI assistant, powered by the recently developed Gemini 2.0, which also boasts camera-enabled capabilities. Meta has also entered the competition with its own AI that uses phone cameras for visual interaction and communication.

However, the new video feature of ChatGPT is not universally accessible. It is available only to Plus, Team, and Pro subscribers under what OpenAI calls “Advanced Voice Mode with vision.” The Plus subscription is priced at $20 per month, while the Pro tier costs $200 per month.

Kevin Weil, OpenAI’s Chief Product Officer, spoke about the new feature during a live stream, stating, “We’re excited to announce that we’re bringing video to Advanced voice mode so you can bring live video and also live screen sharing into your conversations with ChatGPT.”

This announcement is part of OpenAI’s “12 Days of OpenAI” campaign, which promises 12 consecutive days of announcements. Up to this point, OpenAI has announced the launch of its o1 model for all users, the ChatGPT Pro plan, reinforcement fine-tuning for customized models, the generative video app Sora, an update to its canvas feature, and the release of ChatGPT to Apple devices through the tech giant’s Apple Intelligence feature.

Despite the excitement surrounding the new capabilities, the journey has not been without challenges. The release of these features was postponed following a controversy in which OpenAI was accused of mimicking actress Scarlett Johansson’s voice in Advanced Voice Mode without her consent. The delay also stemmed from the video mode’s dependence on Advanced Voice Mode.

Meanwhile, Google and Meta are not sitting on the sidelines. Google’s Project Astra, with similar features, has been handed to “trusted testers” on Android. Google plans to roll out the feature widely early next year and has far-reaching plans to enable its AI models to execute tasks in real time. Similarly, Meta’s assistant, Meta AI, offers low-latency responses and real-time video understanding, with the company planning to leverage augmented reality for its AI offerings.

ChatGPT Plus users can try out the new video features by tapping the voice icon next to the chat bar and then hitting the video button. Screen sharing requires an additional tap through the three-dot menu.

Enterprise and Edu ChatGPT users will be able to try the new video features starting in January. EU subscribers, however, will have to wait a little longer for access.