Friday, June 21, 2024

Google unveils AI solutions — from disaster forecast to music creation

Artificial intelligence (AI) refers to computer systems that aim to accomplish tasks that previously required human intelligence and ingenuity. It seems the stuff of science fiction. Can computers compose original music, become video creators, and accurately translate Filipino into Swahili?

Maybe not now, but soon, if Google has anything to say about it. Last Nov. 2, the technology behemoth held an event that revealed its latest AI innovations in disaster response, language inclusivity, and text, music, and video generation to show how this technology can support people in varied areas and industries.

AI and early flood warnings

AI modelling is the backbone of artificial intelligence. This process involves developing machine learning algorithms that mimic logical decision-making by training computer systems on available data.

Trained on historic rainfall records, river level readings, and data on the terrain and elevation of a specific area, Google’s flood forecasting models allow the company to project where floods will occur and how severe they will be.

From its beginnings in a single Indian region in 2018, this initiative has grown exponentially. In 2021, 115 million flood notifications were sent over Google Search and Maps to 23 million people in India and Bangladesh.

The company announced the expansion of this program to eighteen more countries at its recent AI event. Three of the countries are in Latin America and fifteen are in Africa. Unfortunately, there is no date for its roll-out in the Philippines.

However, Google engineering manager Sella Nevo said, “We really are on the verge of scaling up quite rapidly, which will mean a lot of our work will be less on prioritizing and more on just scaling as rapidly as we can to reach everywhere.”

Additionally, Google disclosed that its Wildfire Tracking System is now live in the United States, Canada, Mexico, and Australia.

This system allows the company to identify and track wildfires in real time and, through models trained on satellite images, to predict how they will spread. It is meant to support first responders and to keep civilians updated on the fires.

Thousand languages model

At its event, Google also made public its ambitious 1,000 languages initiative. This project intends to produce an AI model that will support the 1,000 most spoken languages in the world.

It is envisioned to improve the accuracy of Google products such as Translate, YouTube, and Gboard, and to make it easier for people to find relevant content on Search and use technology in their native language.

To ensure the accuracy of the data used to train its models, Google is working directly with local partners from communities that speak the identified dialects and languages to collect representative audio samples.

Johan Schalkwyk, a Google Research Fellow, said, “Google’s mission is to organize the world’s information and make it universally accessible… If we truly want to solve Google’s mission, we need to incorporate the full diversity of languages that are spoken across the world.”

AI for creatives

Even people untrained in art may soon be able to make it with AI. At its event, Google said it is training generative AI models to help people express themselves through writing, music, and imagery.

For instance, Google’s Wordcraft model explores text generation for writing. During its event, Google showcased short stories by professional writers who used Wordcraft to support their work. The model helped the writers set the mood, for example, or keep their characters’ voices consistent.

AudioLM, meanwhile, is a new framework for audio generation. Unlike most audio generation models, this pure audio model is trained without any text or symbolic representation of music. In other words, it can create its own realistic speech or music by riffing off an existing sample of speech or music.

The Google AI event also revealed research on text-to-video generation. This area has been challenging for AI developers because it is difficult for models to maintain both the resolution of each frame and the video’s coherence in time. The latter property refers to the clarity of the storytelling in the video’s sequence of images.

Google currently has two complementary models making headway on these issues.

First, Imagen Video uses diffusion to generate high-resolution individual images that yield high-quality short videos. Second, Phenaki uses a sequence learning technique that generates a series of tokens over time to produce coherent, long-form videos.

To show users the best of both worlds, high-resolution frames and coherence in time, Google’s Imagen Video and Phenaki teams collaborated.

During the event, they exhibited samples of what AI-generated high-resolution videos could look like if these two models were combined.

In the future, users may even test these AI models in Google’s app, AI Test Kitchen. Imagined as the place to learn about, experience, and give feedback on Google’s emerging generative AI technology, this app will feature a rotating selection of AI models for users to try.

“Unlike the technologies in the past, I think the real wins we have seen have been more about user control and user agency than it has been about the fidelity of the images or the pieces that are being generated,” said Douglas Eck, a senior research director at Google Research.

Developing responsibly

While the generative AI models are powerful and can support human expression, there are also risks that Google wants to address before releasing this technology.

For example, the capabilities of generative AI can lead to a loss of trust in media and news if malicious actors use them to spread disinformation and generate biased media.

Additionally, generative AI models can affect the livelihoods of creatives like writers, musicians, and video editors.

To minimize the risks that come with AI’s advancement, Google is following its AI principles, developing tools to detect fake media, and building safety measures into its AI models to guard against the generation of harmful images.

