Soul of the Machine

AI knows too much—and too little—to be a great songwriter.

Mar 19, 2024

The French composer Camille Saint-Saëns, who was born in 1835 and lived until 1921, is believed by many scholars to have been the greatest musical prodigy of all time. Before he turned three, he was recognized to have perfect pitch; he made his concert debut at the age of ten, performing piano concertos by Mozart and Beethoven, and later became one of the most celebrated organists of the nineteenth century. As a composer, he rapidly assimilated all with which he came into contact, drawing the attention of Gioachino Rossini, Franz Liszt, and Hector Berlioz. But for all of his facility, he never developed a truly distinct voice. As Berlioz is alleged to have quipped, “Il sait tout, mais il manque d’inexpérience.” (“He knows everything, but lacks inexperience.”)

I thought of Saint-Saëns, and other musicians like him, when I read earlier this week about Suno, a tech start-up hailed as “ChatGPT for music,” and which, per its web site, is building a future in which “anyone can make great music. Whether you're a shower singer or a charting artist, we break barriers between you and the song you dream of making. No instrument needed, just imagination. From your mind to music.”

I have so many questions. But first, let’s witness Suno in action, via a weirdly fawning essay published this week in Rolling Stone:

“I’m just a soul trapped in this circuitry.” The voice singing those lyrics is raw and plaintive, dipping into blue notes. A lone acoustic guitar chugs behind it, punctuating the vocal phrases with tasteful runs. But there’s no human behind the voice, no hands on that guitar. There is, in fact, no guitar. In the space of 15 seconds, this credible, even moving, blues song was generated by the latest AI model from a startup named Suno. All it took to summon it from the void was a simple text prompt: “solo acoustic Mississippi Delta blues about a sad AI.” To be maximally precise, the song is the work of two AI models in collaboration: Suno’s model creates all the music itself, while calling on OpenAI’s ChatGPT to generate the lyrics and even a title: “Soul of the Machine.”

Reader, the song is neither credible nor moving. But this is beside the point. (You can listen to it for yourself, here.) What interests me is how little the proponents of this technology, and other technologies like it, seem to understand what makes art authentic. For the ability to mimic makes not for original art, but for a party trick. Still, the problems with AI art run much deeper.

In the last few years, as the prospect of machine-generated music has inched closer to reality, the standard riposte among skeptics has largely hewed to the notion that no matter how complex, compelling, or refined AI songs might become, the absence of a human creator will be an obstacle to enjoyment or emotional catharsis. Listeners are keenly connected not only to the song, but to the songwriter. Our emotional response to a work of art has to do, in part, with a kind of triangulation between 1) the emotional life of the creator, 2) the song (or other art object) they’ve made, and 3) our own life. Great songwriters render the personal in such a way that it becomes universal, and vice versa.

I think this argument is correct, if incomplete. For the whole premise of AI making meaningful music—or any art, for that matter—is predicated on a category error: that is, the belief that having a limitless repository of extant art from which to draw will result in the creation of new and authentic songs, plays, or poems. I would argue instead that great artists create their work by responding to or synthesizing their necessarily limited experiences—aesthetic, personal, political, or otherwise—in such a way that a singular point of view emerges. In other words, if AI is going to compete with humans in the realm of art-making, the size of its data sets will not be the determining factor.

Indeed, the prospect of AI making music doesn’t concern me—and I realize I may someday regret having been so cavalier—because it is precisely the limitations of what an artist has touched or been touched by that give rise to his voice. That is to say, it’s our particular adjacencies—say, my encounter as a teenager with Alban Berg’s Piano Sonata, but also with the original version of Dylan’s “If You See Her Say Hello” on a mixtape made by my ex-girlfriend, as well as my longstanding admiration for the poetry of Anne Carson and the prose of W.G. Sebald—that make someone sound like himself. What distinguishes an artist with a point of view from a mere peddler of pastiche is the ability to refract what he has experienced in a singular way.1

The belief that feeding vast data sets to a computer could result in the creation of great art brings me back to figures like Saint-Saëns, the archetypal assimilator, who, with preternatural ease, could write in any style. But for all of their boundless facility, sometimes these spectacularly impressive artists don’t have much to say. I have often wondered if the ability of artists like these to rapidly apprehend musical information, and the inability to speak in an original voice, are two sides of the same coin. It is as if the circuit in the brain which permits prodigious assimilation is almost too porous. What crosses the borderland of the ear and spirit enters too rapidly to be assessed aesthetically or morally, leading to an undigested library of sound. In such cases, no amount of facility can mask the void at the center: the absence of voice, or, at other times, the absence of taste.

Could AI embrace limitations of cultural encounter and knowledge in order to appear more authentic? Maybe. But it’s not only what we’ve encountered, but the way in which we encounter it, that creates an artist’s point of view. Then we must ask: could an AI be programmed not only to synthesize a specific set of cultural artifacts, but also to respond to, and incorporate, the context surrounding those artifacts in a virtual life? Perhaps. But again, even with boundless technical mastery, the work will necessarily lack an authentic aura, because it depends upon code written in the past, rather than a lived process.

By contrast, an artist’s struggle can serve as a portal to revelation. I don’t mean “struggle” in the depression/substance abuse/daddy issues sense of the word, but rather as it pertains to technical mastery. When an artist rubs up against the limits of her craft—the gap between what she hears and what she can express—a new creative realm may reveal itself. Sometimes, the artist powers through the limitation, transcending what she thought she was capable of. At other times, a resourceful artist is able to make use of the limitation as it exists, finding another route to say the thing which seemed unsayable. Another way of saying this: art is an expression of process. If you strip art of its process, it ceases to be art.

I sometimes wonder if my experience of emotional catharsis as a listener is connected to a parallel, epiphanic experience on the part of the creator. In other words, is the chord or lyric that brings me to tears moving not only because of its structural or narrative function, but because its arrival marks a moment of discovery on the part of its creator, a revelation whose aura is encased in the work like an insect in amber? Is not the emotional aura of a work of art (following Walter Benjamin) inextricably linked to the time and place and state of being in which it was created?

If Benjamin believed that the age of mechanical reproduction witnessed the “withering” of the aura of the work of art, and that “by making many reproductions it substitutes a plurality of copies for a unique existence,” what becomes of the aura of those artworks which are now stripped for parts by an AI whose raison d’être is to delude people into the belief that typing the phrase “sad shoe-gaze song about the lack of good fried chicken options in duluth,” and then listening to the result is an act of creativity?

There is something not only delusional but vampiric about the premise and promise of Suno. The notion that users might, with the assistance of AI, “express” themselves by mining a library of music from the past dynamites the delicate relationship between an artist and the living canon that surrounds her. Recent copyright wars notwithstanding, most artists understand that they are engaged in an intertextual conversation, across time and disciplines, with other artists. This unwritten contract suggests that when we make something, we recognize that another artist might respond to it, and in so doing, might repurpose some aspect of the original.

When the composer Dan Trueman used my song, “Baltimore,” as the jumping off point for his composition “Tallboy,” from a larger work called Trio, I perceived his use of those materials not as an act of exploitation, but as one of respect. More than that, I saw it as a piece of a larger, polyphonic dialogue between so many artists, one that speaks, resoundingly, across centuries. But this is a tender contract, one rooted in intimate, if unarticulated exchanges between individual artists. It is a contract deserving of trust.

When we repurpose the work of another, it is as if we are whispering to them, “I will take care of the thing you’ve made.” In the unspoken agreement that accompanies the loan of material, we commit to a deep study of the thing we’ve borrowed. We will learn to see and hear and touch it from as many angles as we can. When it becomes a part of our own work, if it is incorporated with respect, it will gain a new aura, because we will necessarily see it through our own eyes, ears, and experience.

What an AI does, by contrast, is necessarily derivative. It has not asked permission before pillaging the work of others. I am not even speaking here of economic exploitation—I have no idea what the ultimate business model of a company like Suno is, nor do I particularly care—but of the violation of that fragile understanding between artists, the beloved community wherein our voices grow more powerful, more distinct, through our study and reuse of the work of others.

Returning to Suno’s mission statement, which, again, promises to “break barriers between you and the song you dream of making,” I can’t help but think of this as one more blow in late capitalism’s assault on anything and everything that requires time and patience. Suno seeks not only to remove all friction from the creative process, but ultimately, to do away with the creative process altogether. On the one hand, I suppose it’s only natural that automation, and the human labor it seeks to eliminate, would eventually come for the arts. But it strikes me as naive on the part of Suno’s founders to believe that its users will feel any legitimate sense of satisfaction or authorship over songs that bear no trace of their own labor, of the time and place in which they were made.

This brings me to a final point, which has to do with the fact that, as my friend Srinija Srinivasan said in a recent interview,

when we're dealing with an AI, it is in the past. Data science is by definition in the past. Data are things that have already occurred. This can be extremely useful, as it can build the basis for forming a far better shared understanding of our past. But to host a future is different. If we seek a future that goes beyond tinkering around iterations of the past, if we want to come up with anything altogether new, that will not come from AI.

If we are concerned that AI will generate music that will supplant mortal, human artists, then this is more an indictment of our taste and values around art than it is an indication of AI’s creative power. A society that is content with a world suffused with pastiche may yet find satisfaction in derivative, AI-generated music or painting or poems. But if we want art that can show us the world not as it is, but as it might be, to help us navigate the most vexed moral questions of the day, we will have to be content with the power and limits of human expression.

Woman at a Piano (1870) - Giovanni Boldini

1. I’m not claiming to have a distinct voice; I offer some of my own cultural touchpoints as an example of how this phenomenon, in my view, operates.

Andrea La Rose

Mar 20, 2024

Thanks for some necessary reading! My hot take:

I think another issue here is that too many people have fallen for the idea that the brain is simply a fancy computer. All we have to do, then, is build a computer that is just as fancy as the brain! Then we have the best AI — another brain we can rent out, that does what we tell it to do, because it doesn't have any of those pesky emotions, or personalities, or opinions, or any of those annoying traits we find in other people. Plus, we can pay less for AI because we don't have to feed it and it doesn't have snotty-nosed kids to care for. This is the promise of AI — it's the ultimate slave! All of the work, all of the product, all of the profits, and none of the effort.

The brain, of course, is not a fancy computer, and furthermore, it's housed in a body, which is probably more properly seen as an extension of the mind. That's why a lot of people use the term "bodymind" these days; you can't so easily separate the two in reality. What is missing from AI is embodied knowledge — that's where creativity comes from, because everybody has their own bespoke bodymind. AI scans all the products of all the bodyminds, but doesn't work like a bodymind itself. It can produce a simulacrum, but never the thing. AI has no taste, both literally and figuratively. As you point out, it doesn't matter how many artworks it can "consume."

Brad

Mar 27, 2024

Well-written and thought-provoking Gabe, and comments from others as well. Regarding Saint Saens, there’s nothing of his that’s grabbed my passion or imagination, which is unusual. On the other hand Bertrand Chamayou just gave a plug when I talked to him after a performance for S.S.’s Piano Concerto. Will give it a go.

I wonder: would it be possible to write music that would be unassimilatable by AI? What would you do? Never develop themes, constantly change harmonic reference, never get on a rhythmic grid…Hmm…nah

Gabriel Kahane: Words & Music
