When AI Is Making Music, Where Do the Humans Fit In?

Smile was supposed to be the Beach Boys’ masterpiece. Their 12th album would be a “teenage symphony to God,” Brian Wilson claimed, that would make Pet Sounds sound like a demo. It was slated for release in 1967, a 12-track experimental LP filled with tape edits, spoken word, sound effects, complex vocal arrangements, and even comedy skits. And after more than 50 hours of recording sessions, various legal complications, obsessive tinkering, mental breakdowns, and mounting expectations, it never came out.

A few original Smile tracks have been officially released in the decades since, and Wilson rerecorded the music for a 2004 project, but the official version remains one of the most legendary unreleased albums in rock history. Last month, one fan ventured to answer the questions “What if Wilson finished the job? What if he worked out the technical difficulties, the legal issues, the mental distress? What if Wilson’s human limitations had an AI solution?” To “complete” Smile, Mike LeRoy used AI to generate Wilson-style vocals for the tracks that never featured them. “If you turn off your mind and, perhaps, sing along, it really can feel like Brian sang on these,” LeRoy wrote in his liner notes. “A special feeling I’m blessed to have been a part of making happen.”

The response to AI Smile seems in line with the popular sentiment around AI art at large—a mix of fear, intrigue, and a little shame. “This is aesthetically beautiful and morally confusing,” one person commented on Twitter. It’s also how much of the internet felt about “Heart on My Sleeve,” an AI-generated song made to sound like a collab between Drake and the Weeknd, released by TikTok user @ghostwriter977 in April.

We’ve seen “deepfakes” manipulate the faces of politicians and celebrities. They’ve ranged from mildly funny (Joe Biden praising We Bought a Zoo, Kim Kardashian doing pranks) to mildly concerning (the RNC’s dystopian AI-generated ad showing a “look into the country’s possible future if Joe Biden is re-elected,” Emma Watson reading Mein Kampf). Over the past six months, user-friendly AI generators like ChatGPT and Midjourney have made writing sample book proposals or illustrating watercolor self-portraits as easy as navigating plug-and-play games. Song production has become another AI party trick as programs like Boomy and Voicemod have grown more accessible. Generative AI in music isn’t new, but it certainly gets more airtime when Drake is a casualty.

Regardless of fans’ feelings about the Drakefake, the song had to come down. Universal Music Group had “Heart on My Sleeve” removed from streaming services, TikTok, and YouTube shortly after it was uploaded, citing copyright infringement. The company said that platforms had a “legal and ethical responsibility to prevent the use of their services in ways that harm artists.” On Spotify—The Ringer’s parent company—the song was streamed 629,439 times before it was pulled on April 18.

You can still find rips of “Heart on My Sleeve” online, alongside countless Drakefakes that mold his voice to lyrics about complicated women, needing to focus on himself, and other classically Drake tropes. You can also find a cover of “Somebody That I Used to Know” using Kanye West’s and Playboi Carti’s voices. You can find an approximation of Ariana Grande’s vocals covering Dua Lipa. You can find an AI-generated reggaeton duet by Bad Bunny and Rihanna. You can find an AI-based album featuring “new music” from Oasis. The Drakefakes, though, are by far the most realistic. Drake is imitable to a fault—his signature softboy cadence, a consistent BPM and flow, choruses about texting. Drake goes down easy; he’s the ideal artist to copy.

In an increasingly robotic entertainment landscape, covers and carbon copies are a safe bet. The content has already been tested and approved by the masses. Listeners no longer have to take chances on music they might not enjoy; we don’t have to endure a single second of sonic discomfort. But when redundancy is rewarded and everything is a replica, how does original creativity factor into the artistic equation?

It’s never been a better time to be a generic pop star. Content feeds and mood playlists algorithmically group similar sounds and styles, and if your music doesn’t fall into a group, you might be on your own. TikTok, now a main music discovery platform, rewards consistency, quantity, and following the trends set by the most viral creators. Meanwhile, users across social and music platforms prioritize what feels familiar and aligns with their tastes. The popularity of these deepfakes speaks to the power of that familiarity and sheer content volume. Cherie Hu, founder of the music and tech research company Water & Music, believes Drake’s predictability is an essential part of his strategy and a big part of what makes “Heart on My Sleeve” possible. “Maybe that says more about the state of pop music now, or mainstream music in general,” she said.

The magic of pop music has long been based on formula—redundant hooks, catchy choruses, danceable rhythms, four boyish men singing about wanting to hold a girl’s hand. But efforts to further reduce pop to a simple, lucrative equation have grown over time. Max Martin has been heralded as the most important songwriter and producer of the past few decades. His formula is responsible for the sounds of chart-topping icons like Britney Spears, Taylor Swift, Katy Perry, and every artist who has emulated them since. Martin’s recipe highlights earworms, melodies repeated until they burn into your brain, with no room for improv or flow. Martin doesn’t cook, he bakes, and nailing the measurements is crucial in his quest for the perfect cake. In the streaming era, adhering to formula is doubly important. Creativity under AI isn’t dissimilar to creativity in the age of streaming, or creativity under capitalism and the attention economy. These conditions are all pointing art in the same direction.

“If you want to make money from [streaming], you have to be a lot more generic,” Midia Research music industry analyst Tatiana Cirisano said. “For that reason, popular music has gotten a bit homogeneous. … Streaming rewards consistency more than it rewards being occasionally great.”

With so much content consistently uploaded and shoved in our faces—Spotify itself sees 100,000 new tracks every day—it’s understandable that we find comfort in the familiar. That’s what makes these deepfakes so seductive and potentially dangerous. They can be recognizable and novel at once; you know the sound of Drake’s voice, but I bet you’ve never heard his robot twin sing “Bubbly” by Colbie Caillat. Nolan Gasser, a musicologist and the architect behind Pandora Radio’s analysis and recommendation system, reasons that Drake fans are inclined to like anything that sounds like Drake, even if they know it’s fake. “Your brain knows that it can make predictions about it,” he said. “And when the brain can make predictions, it tends to like things more.”

Consider sampling: Today it’s considered an art form—and it’s a reliable chart driver, with songs like Jack Harlow’s “First Class” recycling past hits—but it wasn’t always accepted, let alone celebrated. It’s gone through phases, once looked down on as dishonest or derivative, then lauded as a collage or a fresh take on vintage art. There were fears that sampling would kill creativity. (In 1997, The New York Times published the headline “Sampling Is (a) Creative or (b) Theft?”) People worried that digital recordings would deprive music of a human touch. Eventually, the industry came around to sampling as another revenue source. “Sampling, in the beginning, was like an outrage,” Gasser said. “And then people started making money off of sampling. And now there are websites that allow you to license samples.” It’s easy to imagine a world where generative AI is treated like sampling, slowly shedding its stigma and questions of ethics or artistic merit, yet still seen as precarious from a legal perspective. Cirisano can see this technology being ushered in as the “next step of sampling,” just another artistic outlet. That said, she also senses that the legal “Whac-A-Mole” that record labels and publishers often play when tracking down uses of their artists’ samples won’t be sustainable for AI.

Pinning down the rights around an artist’s sonic likeness is tricky. As it stands, “transformative” content, like a parody, is considered fair use. However, determining whether a work is “transformative” of its source material versus an outright rip-off can be murky. “It’s also the fact that you’re passing off the personality rights of Drake in his voice,” AI music expert Dr. Martin Clancy said. “And that’s a situation that [we’ve] never really been faced with before.”

Without copyright laws specifically for generative AI in place, labels are trying to carve out rules and protections in real time, as their artists’ voices are copied and manipulated across the internet and the tech evolves by the minute. The case for limiting these deepfakes rests on the assumption that they were trained on recordings of the artists they’re imitating, but there’s no reliable way to prove that, let alone police it. (Last month, UMG asked streaming services like Spotify and Apple Music to prevent AI companies from using its music to train their tech.) One potential starter solution, Cirisano mentioned, would be requiring generative AI models to train solely on pre-authorized datasets, giving labels and publishers the opportunity to license out certain music. “I think trying to just issue a takedown every time somebody puts up an AI-generated Drake track is not going to work,” she said. “I think the music industry is going to have to figure out a way to work with this and get paid accordingly.”

But the people making AI deepfakes aren’t thinking about ethics or legality. They’re doing it because it’s fun. Cirisano said music has become “consumerified” with the advent of TikTok and the creator economy. (For proof, look no further than the sped-up song edits that have swept TikTok over the past 12 months.) People want to interact with a song, add a verse, make it their own. “Deepfake technology feels like the natural next step of this remix culture era, where everyone is constantly iterating on each other’s creations,” she said. “It’s just accelerating trends that have already been happening.”

Whether it’s “good” that someone completed an unfinished Beach Boys project is debatable. And for every doomsday scenario, there are also glimmers of hope for AI’s potential beyond deepfakes. AI can do, it can produce, but it can’t feel or reflect. Even as the tech has become more autonomous, human intervention retains a crucial role.

Cirisano said AI can be a powerful way for artists to engage with fans, especially during a time when user-generated content is so valuable. Hu mentioned how AI could carry out an artist’s legacy, perhaps used by the artist to create music after their voice has aged and changed. (The jury’s still out on using AI to resurrect dead pop stars, though Timbaland did finally get his Biggie collaboration.) As conversations around AI get louder, participating in the discourse—or being at the center of it—seems like a positive.

From a publicity perspective, the Drakefakes might actually work in Drake’s favor. His name made headlines for weeks following the “Heart on My Sleeve” drama. Google Trends shows that searches for “Drake” were trending higher after the deepfake’s release. “This is ultimately a very good thing for Drake. If there’s mass interest and content around this artist’s voice, people are going to be really thirsty for the real thing,” Jesse Kirshbaum, CEO at Nue Agency, said. “He’s in more people’s psyches. He’s basically growth hacking.”

According to Parrot Analytics, Drake’s global talent demand (determined by online consumption, engagement, and searches surrounding Drake) increased by 63.6 percent between April 12 and 18. “Heart on My Sleeve” went viral the weekend of April 16. Luminate Data, meanwhile, shows that Drake’s on-demand streams and album sales (including streaming and digital song sales equivalents) were down 6.5 and 17.6 percent, respectively, during the chart week beginning April 14 and ending April 20 in the U.S.

Once the novelty of fake Drake wears off, though, fans will crave the real thing, and any new music he releases will carry more value. Fans ultimately want the real Drake, his imperfect, authentic self, his human vulnerability. “AI certainly is not pulling from emotion. The AI is not evaluating where the emotional payoff will be. It’s really calculating the probability of where the music would go next,” Gasser said.

The speed and precision with which AI can pop out a Drake song is, on its face, disconcerting. If exploited, the tech could put added pressure on artists to continue ramping up their output at the price of organic creativity. But an optimistic, potentially naive bright side could see artists driven to make music that’s uncopyable, working against the market logic of quantity and uniformity, opting for something truly unique that can’t be rolled out and reproduced on a conveyor belt.

“The artists that will stand out in the future, or the ones that stand on their own outside of this AI-generated arena, will be the ones who are constantly reinventing themselves and are unpredictable, and are inimitable,” Cirisano said. “So I think it will actually push creativity forward, which is a good thing in the end.”

Some artists see AI as a tool for furthering artistic expression or crafting something completely new, something they can imbue with their own humanity and emotion. New research from music distribution company Ditto Music revealed that 59.5 percent of artists are already using AI to create music, while 47 percent are interested in using AI for songwriting in the future. Midia Research’s annual survey of online creators found that half of respondents either somewhat or strongly agreed that AI can be a useful tool for making music. A quarter were neutral. “That kind of shows that these fears about AI might be coming more from the record label and other stakeholder perspective than from the artists themselves,” Cirisano said. WaveAI was founded back in 2017 on the premise that generative AI can help artists write music and lyrics: users set the parameters (like choosing “love” as the song’s topic or G minor as its key), then collaborate with the tech to fill in the blanks, as if they had a sci-fi songwriting partner.

Experimental composer Holly Herndon was building complex AI musical personas years before ChatGPT was a household name. Herndon uses AI models and computer-generated instruments and vocal processes to craft sounds that she could never make with her body alone. “It’s never a bad thing to give an artist more tools to express what is on their mind,” she said. “A few special artists will find ways to unlock something genuinely new.” Her 2019 album, Proto, involved a singing AI named Spawn, an artificial neural network trained to recognize and replicate human voices. Spawn learned to create original music after Herndon and her collaborators fed it audio files of diverse singing voices (including Herndon’s own) and held live performances, which they called “training ceremonies,” where participants sang to it.

In 2021, Herndon released Holly+, a deepfake version of Herndon’s voice that can reinterpret and perform other artists’ songs. All you have to do is upload audio to the website holly.plus, and Herndon’s digital twin will remake it and spit it back out. “I created Holly+ a few years ago to demonstrate how consent could work in a world where it will be impossible to stop someone from creating something with your style and likeness,” Herndon said. “I attempted to decentralize my voice and identity.” Herndon used her personality rights to officially approve songs created with her voice and offered a 50 percent royalty split to producers spawning works with it. Recently, Grimes made the same proposal, announcing that she would offer a 50 percent royalty split to anyone who creates a successful track using her AI-generated vocals. She, too, is a longtime proponent of the tech, having crafted AI lullabies and meditations before debuting her very own generative software, Elf.tech, in May.

Co-composing works with computers has been an enduring fascination in music. In 1956, composer Lejaren Hiller and composer-slash-mathematician Leonard Isaacson made “Illiac Suite,” the first original piece composed by a “supercomputer” that generated random numbers corresponding to features like pitch or rhythm. Since then, inventors have been working toward more autonomous systems with built-in knowledge bases. In 1965, Ray Kurzweil debuted a computer-generated piano piece made by a system that could analyze and use patterns from compositions to create new songs.
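
To make that early mechanism concrete, here is a minimal Python sketch of rule-screened random composition in the spirit of “Illiac Suite.” It illustrates the general idea only, not Hiller and Isaacson’s actual program: random draws are mapped to pitches and durations, and any draw that breaks a simple melodic rule is rejected.

```python
import random

# Toy illustration of rule-screened random composition, loosely in the
# spirit of "Illiac Suite" (not Hiller and Isaacson's actual program).
# Random numbers are mapped to pitches and durations; draws that break
# a simple melodic rule are thrown out and redrawn.

PITCHES = ["C4", "D4", "E4", "F4", "G4", "A4", "B4", "C5"]  # C major scale
DURATIONS = [0.5, 1.0, 2.0]  # eighth, quarter, half note, in beats

def next_note(prev_index):
    """Draw (pitch index, duration) pairs until one passes the leap rule."""
    while True:
        i = random.randrange(len(PITCHES))
        # Rule: reject melodic leaps wider than a fifth (four scale steps).
        if prev_index is None or abs(i - prev_index) <= 4:
            return i, random.choice(DURATIONS)

def compose(length=16):
    melody, prev = [], None
    for _ in range(length):
        prev, dur = next_note(prev)
        melody.append((PITCHES[prev], dur))
    return melody

if __name__ == "__main__":
    for pitch, dur in compose():
        print(pitch, dur, "beats")
```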

Author, composer, and scientist David Cope developed the proto-deepfake in the ’90s with his Experiments in Musical Intelligence. When an artist’s work was fed into Cope’s computer program, the software could recognize patterns and generate new compositions that mirrored the original’s style. In 1997, Cope tasked EMI with composing a piece of music that imitated Bach. An audience found its rendering more convincing than a Bach copycat piece created by a human. Also in the ’90s, David Bowie began writing lyrics alongside a text randomization software dubbed the Verbasizer, which chopped his source sentences into words and recombined them at random. More modern, still semi-retro iterations of AI music include Jukedeck, a website founded in 2012 that let people generate royalty-free music for videos. In 2017, Google came out with NSynth, a neural audio synthesizer later packaged as an open-source hardware instrument and used by artists like Grimes and the experimental pop band Yacht.
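
The cut-up idea behind the Verbasizer is simple enough to sketch. Below is a toy Python version, an illustration of the technique rather than Bowie’s actual software, with invented source lines: it pools the words from a few sentences, shuffles them, and deals them back out as new lyric lines.

```python
import random

# Toy cut-up lyric generator in the spirit of Bowie's Verbasizer
# (an illustration of the technique, not the actual software).
# Words from the source sentences are pooled, shuffled, and dealt
# back out as new lines.

def verbasize(sentences, words_per_line=5, num_lines=4, seed=None):
    rng = random.Random(seed)  # a fixed seed makes a run reproducible
    words = [word for sentence in sentences for word in sentence.split()]
    rng.shuffle(words)
    cutoff = min(len(words), words_per_line * num_lines)
    return [" ".join(words[i:i + words_per_line])
            for i in range(0, cutoff, words_per_line)]

# Hypothetical source lines, invented for the example.
source = [
    "the silence hums beneath the city",
    "a stranger waves from a passing train",
    "every photograph forgets its face",
]
for line in verbasize(source, seed=7):
    print(line)
```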

Claire L. Evans, a tech writer and one-third of Yacht, said her band decided to incorporate the tech into their process because they had the sense that AI would be important and they wanted to understand it. The AI tech of the time wasn’t easy to use—unlike the plug-and-play software musicians now have access to—and they took it on as a creative challenge. The trio combined a variety of AI processes to compose and write lyrics for their 2019 album, Chain Tripping, and to create the associated artwork, videos, typography, and press photos. While the album was successful, to Evans, the project reinstilled the importance of spontaneity and humanity in music.

“We had this sense that we would discover some underlying pattern or a process that could be reproducible, that could help us understand what makes a Yacht song. What’s the code behind this thing that we’ve been doing for years?” Evans said. “But what defines your algorithm are these decisions that you make in the moment. They’re so dependent on all these external factors and are completely dynamic and have to do with your own experiences, the people around you, what feels fun.”

AI will never have the distinct human experiences, relationships, emotions, and musings that make people care about and feel connected to music. For the artist whose job it is to showcase personality and innovate beyond algorithms, AI shouldn’t be a threat. AI can remix and regurgitate, but its “innovation” is limited to data that already exists. Perhaps the more concerning question is what could happen to the artists doing more functional, behind-the-scenes work. The less visible creative forces—session musicians, backing bands, artists who make soundtracks for video games or commercials—are more at risk.

“It is very rare to encounter something inimitable, and AI will make that accomplishment even more precious,” Herndon reflected. “When everyone can spawn AI music, some people will still be more magnetic for various reasons, and others will stand out by rejecting it entirely, or by attempting things that others wouldn’t think of.”

On one hand, AI potentially lowers the barriers for new artists entering the ring. On the other, content overflow puts artists in a worse position because it makes it harder for listeners to find their work, harder for musicians to break through the noise and stand out in a sea of AI. Spotify recently removed 7 percent of the songs created with the AI music-generation service Boomy, equal to “tens of thousands” of tracks, after UMG flagged Boomy for reportedly using artificial bot listeners to boost streaming numbers and then cashing out on the fake “fans.” Boomy, which claims its users have created more than 14 million songs, helps them release their AI-generated songs and albums on streaming services while taking a cut of the royalties. The company offers users the ability to auto-generate music based on inputs like genre, style, and even use case. You can make a lo-fi hip-hop song intended for afternoon naps, or an ambient EDM track to energize you in the morning.

But if Spotify were to, say, create its own AI songs and playlists, that could be a moneymaker for the platform and a further detriment to real artists. “Music is really expensive. It costs Spotify money every time someone plays an album that was written by human beings. It seems that the final manifestation of both of these trends, our willingness or necessity to adapt music to platforms and the platform’s desire to cut costs, is that generative AI will be used to produce plausible-sounding background music to maximize playlists,” Evans said.

Streaming has fundamentally changed how we listen to music. “Through streaming we learned that a significant number of people just want something pleasing to listen to while they are jogging, and I think bots will be able to fulfill that function,” Herndon predicted. “Perhaps AI will serve as a forcing function for us to distinguish between artists who strive to produce art, and people who strive to produce content.” Just this week, UMG entered a partnership with the generative AI music startup Endel that aims to help its artists make “science-backed soundscapes” that “enhance audience wellness.”

The conversation around AI has ramped up for creative workers across the board in recent months. With the current Writers Guild of America strike, one main issue is the union’s push to regulate the use of AI-generated scripts and content. The guild is asking the studios for protections against being replaced by AI or having their work used to train AI models. The studios responded, basically, that they can’t promise anything. There’s clearly ambivalence about innovation from decision-makers, so the future might depend on consumer judgment: whether viewers will give their dollars and eyeballs over to AI programming and make rich execs richer, or whether a collective desire for human stories will prosper. AI could whip up a convincing Friends remake, I bet. It could even make a Friends and Frasier mash-up. It could probably survey data on viewers’ favorite plot points and Frankenstein an enjoyable sitcom episode. But while the modern viewer craves nostalgia and the content they already know and love, the modern viewer also bores easily. People will want new things. How many remakes can you remake? How many Drakes does one culture need? (Although there are people out there on their 50th Friends rewatch.)

Yacht now calls itself a “post-AI band,” letting its pendulum swing in the opposite direction. “We’re more interested in just making extremely human, weird stuff … for the love of it. That’s all we have left,” Evans said. “[With AI], we were able to make music that was really different from anything that we ever made before. It allowed us to think differently and make different kinds of art. And I’m grateful for that moment. But at the same time, the only thing I wanted to do after we were done with that record was jam and make body music, very simple dance music with bass and drums and voice. Because it felt new again.”

Julia Gray is a Brooklyn-based music and culture writer. Her work has appeared in places like The Washington Post, Playboy, Pitchfork, and Stereogum. She makes chaotic tweets at @juliagrayok.
