TikTok’s tools for adding music to short videos helped turn short-form video into a phenomenon. Now Google is giving some YouTube Shorts creators an AI feature called Dream Track that can generate songs, including lyrics, melody, and accompaniment, in the styles of seven artists, including Charlie Puth, Demi Lovato, Sia, and T-Pain.
To whip up a 30-second clip with Dream Track, a creator just has to enter a prompt, such as “a ballad about how opposites attract, upbeat acoustic,” then select which artist the song should be styled after.
The new AI capabilities might help Google lure users from TikTok, where AI tools for adding visual or audio effects are hugely popular. YouTube says it is looking into how artists whose work helped train its music-generating algorithms will receive a cut of future ad revenue generated by videos featuring AI-generated audio. That would represent a test of a novel way for artists to profit from AI built in part on their work.
Dream Track uses an AI algorithm called Lyria developed by Google DeepMind, the unit charged with keeping the company at the cutting edge of AI. YouTube’s global head of music, veteran music mogul Lyor Cohen, who helped launch the careers of artists including Public Enemy, Run-DMC, and the Beastie Boys, told WIRED on Wednesday that he was blown away after hearing a demo of its output at Google DeepMind’s London headquarters in May. “I knew we not only had something unique and special, but something that I believed that the music industry would dig and want to work with,” Cohen says.
Cohen says the seven artists who opted to let Dream Track replicate their styles did so out of a desire to embrace generative AI on their terms. “Our partners, many of whom lived the Napster days, didn’t want to play defense, they wanted to play offense, and they were excited about the possibilities,” he says. In August, YouTube announced that it was creating an incubator to engage with artists on ways of using generative AI.
Lyria has also been used to build a second tool announced today called Music AI, which lets artists in YouTube’s incubator program conjure, remix, and modify tracks in new ways. Demis Hassabis, Google DeepMind CEO, says that this software can automatically convert a song from one genre to another—say from hip hop to country. It can also generate a full instrumental melody and backing track from a whistled tune, and convert abstract text input such as “sunshine” into a musical interpretation.
Hassabis says that this last trick is a good example of the kind of “multimodal” AI capabilities that powerful models increasingly exhibit. The latest version of OpenAI’s ChatGPT can work with audio and images in addition to text. Google DeepMind is developing a powerful AI model of its own, called Gemini, that is rumored to have multimodal capabilities.
The recent proliferation of AI tools capable of creating images, passages of text, and music has sparked protest from some artists and authors who feel that the inclusion of their work in AI systems’ training data without permission or payment is unfair. A growing movement involves blocking companies from scraping web content to feed to generative AI programs or trying to have copyrighted material removed from common datasets.
Some musicians are embracing the AI revolution despite such issues. The artist Grimes told WIRED recently that she plans to open source her musical persona so that anyone can replicate her style with AI.
The musicians involved with YouTube’s latest AI experiments seem less troubled, no doubt because they have some say in how their work is being repurposed—and may see a cut of the spoils in time. “I’m extremely excited and inspired by the realm of musical possibilities that come from allowing the human mind to collaborate with the nonhuman mind,” Charlie Puth said in a statement. “I am open-minded and hopeful that this experiment with Google and YouTube will be a positive and enlightening experience,” Demi Lovato said in another.
Google says it is using a technology called SynthID to add watermarks inaudible to the human ear to music generated using Lyria, so that such tracks can be identified as AI-made. The company says Lyria was trained on “a broad set of music content,” so it will be interesting to see whether it can figure out a way to credit every artist who contributed.