In a recent appearance on the Possible podcast, hosted by LinkedIn co-founder Reid Hoffman, Google DeepMind CEO Demis Hassabis said that Google ultimately plans to combine its Gemini artificial intelligence models with its Veo video-generating models to improve the former's understanding of the physical world.
“We’ve always built Gemini, our foundation model, to be multimodal from the beginning,” Hassabis said, “and the reason we did that [is because] we have a vision for this idea of a universal digital assistant, an assistant that actually helps you in the real world.”
The artificial intelligence industry is gradually moving toward “omni” models, if you will: models that can understand and synthesize many forms of media. Google’s latest Gemini models can generate audio in addition to images and text, while OpenAI’s default model in ChatGPT can create images, including, of course, Studio Ghibli-style art. Amazon has also announced plans to launch an “any-to-any” model later this year.
These models require a lot of training data: images, videos, audio, text, and so on. Hassabis implied that Veo’s video data comes mostly from YouTube, a platform Google owns.
“Basically, by watching YouTube videos (a lot of YouTube videos), [Veo 2] can figure out, you know, the physics of the world,” Hassabis said.
Google previously told TechCrunch that its models “may be” trained on “some” YouTube content in accordance with its agreement with YouTube. Reportedly, Google broadened its terms of service last year in part to allow the company to tap more data for training its AI models.