Music and Artificial Intelligence: a creative dilemma that deserves clarity

In this article, I would like to analyze the relationship between music and generative artificial intelligence (AI) models, assuming that the reader is not an expert in machine learning and that, like many, he or she has been “bombarded” with news about increasingly powerful and competitive models.

In particular, I am interested in trying to briefly explain what a generative model is (without any formalism or assumed knowledge of mathematics), how it is trained, and what results it can produce. Schematically, the questions I would like to answer are:

  • Can a generative model create new music?
  • What does “new” mean in the context of these artificial intelligence models?
  • What are the possibilities and limitations of these models?

I will avoid formalism to make the article usable by a diverse audience. Still, I will maintain the rigor necessary to avoid drifting toward meaningless ideological positions. Advances in artificial intelligence have led to impressive results that were unimaginable ten years ago, and many artists can benefit greatly from them. However, as is generally the case with technological products, overconfidence and fantasy can fuel unhealthy ideas without a scientific basis.

So, let us wade through dozens of cryptically named innovations to answer the question: can a generative model based on artificial intelligence compose music like a human being?

An artificial intelligence-based robot that composes and plays music

What is a generative model of artificial intelligence (in a nutshell)

A generative model is an artificial intelligence (AI) model that aims to generate new data similar to the input data it was trained on. These models are designed to understand patterns and relationships within the data to create something new. Generative models work by learning the underlying structure of the input data and then using that knowledge to generate new samples.

Before continuing, it is good to clarify what “samples” means. To do so, I will refer to MNIST, a well-known dataset derived from data collected by the U.S. National Institute of Standards and Technology (NIST) and widely used to train early Optical Character Recognition (OCR) models. The dataset consists of 70,000 samples of digits (0 to 9) handwritten by different subjects.

Examples of handwritten digits taken from the MNIST dataset.

As can be seen, each digit appears in roughly 7,000 “different” versions, written by hand by many different people. Although such a dataset was created mainly to train so-called supervised models (i.e., each image has an associated label indicating the digit represented, and the model must learn to map handwritten digits to encoded values), it is also useful for explaining to the uninitiated how a generative model works.

If you look at the row of digits representing “5,” you immediately realize that although the structural elements are the same (e.g., a horizontal stroke at the top, a semicircle at the bottom, etc.), each sample differs. In other words, there is diversity within regularity.

Without going into mathematical discussions, we can say that a generative model uses all the “forms” seen during training to identify their common abstract features. The “generative” stage then relies on these features to randomly draw a sample from the learned probability distribution. In plain terms, it draws lots among potentially infinite representations and produces an output that respects the abstract rules while being completely new (we will return to this point soon).
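To make the idea of “drawing lots” concrete, here is a minimal sketch in Python. It is not how large music models are built: it is a toy first-order Markov chain, and the three training melodies as well as the names `learn_transitions` and `generate` are invented purely for illustration. The model counts which note follows which in the training examples, then draws each next note at random in proportion to those counts.

```python
import random
from collections import Counter, defaultdict

# Toy training set: three short melodies written as note names.
# These melodies are invented for illustration only.
TRAINING_MELODIES = [
    ["C", "D", "E", "C", "E", "G", "E", "C"],
    ["C", "E", "G", "E", "D", "C", "D", "E"],
    ["E", "D", "C", "D", "E", "G", "G", "E"],
]

def learn_transitions(melodies):
    """Count, for every note, how often each other note follows it."""
    counts = defaultdict(Counter)
    for melody in melodies:
        for current, following in zip(melody, melody[1:]):
            counts[current][following] += 1
    return counts

def generate(counts, start, length, rng):
    """Draw a melody by sampling each next note from the learned counts."""
    melody = [start]
    while len(melody) < length:
        options = counts[melody[-1]]
        if not options:  # dead end: this note was never followed by anything
            break
        notes = list(options)
        weights = [options[n] for n in notes]
        melody.append(rng.choices(notes, weights=weights, k=1)[0])
    return melody

rng = random.Random(0)  # fixed seed so the run is repeatable
model = learn_transitions(TRAINING_MELODIES)
new_melody = generate(model, start="C", length=8, rng=rng)
print(" ".join(new_melody))
```

Changing the seed gives a different melody each time, yet every one of them contains only note-to-note steps that already occur in the training melodies.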

One of the main features of generative models is their ability to create realistic and diverse outputs. Examples of recent generative models include GPT-3 (Generative Pre-trained Transformer 3), developed by OpenAI, known for its language generation capabilities and ability to produce human-like text. Another notable example is StyleGAN2, a model developed by NVIDIA to generate high-quality images with realistic detail.

Generative models have many applications, including text generation, image synthesis, and music composition. Their versatility and ability to create new content make them a powerful tool in artificial intelligence. However, before using them uncritically, it is worth knowing their possibilities and limitations.

A generative artificial intelligence model that produces music, text, and images based on abstract rules learned during the training phase (based on millions of human-created examples).

Summing up:

Pros

Generative artificial intelligence models trained on millions of examples can build new material by combining imitated elements, opening up new creative possibilities in music, art, and literature. These models can harmonize different elements, offering new compositional and stylistic perspectives. They respect the syntactic and semantic rules learned during training, which keeps their output consistent and of high quality.

Cons

Generative artificial intelligence models can only create on the basis of what was used in their training, and they lack human imagination and originality. They cannot go beyond what they have previously learned, which limits their ability to create something unique and innovative.

Music and creativity

Musical creativity throughout history has given rise to a wide range of styles, ideas, and innovations that have marked artistic production. Composers have experimented with and combined melodic, harmonic, contrapuntal, instrumental, timbral, rhythmic, and many other elements, creating new musical forms and challenging existing conventions.

Often, artists overcame traditional rules and restrictions to create unique and surprising works that could not be equated with previous ones. For example, harmony has undergone significant changes throughout musical history, some of which have led to the introduction of dissonances and atonality, opening up new horizons of expression for musicians.

Musical creativity has proven to be an unstoppable force that continues to shape and redefine the artistic landscape. It is certainly possible to observe similarities between works by the same composer, or between works written in historically close periods. Still, true revolutions were rarely based on applying the same abstract principles to different themes and melodies.

To better understand what I mean, we can ask ourselves two questions:

  1. Why did Mozart, when he composed Symphony 35, not compose Symphony 40?
  2. How can Schoenberg’s work be reconciled with Bach’s?

I chose these two questions with a specific purpose that will soon become clear. Let us, therefore, try to answer the first one.

Did Mozart compose music by drawing lots?

The answer seems obvious. If the Salzburg genius had “simply” used a die to compose his music, he could have written hundreds of symphonies in just a few months, but he did not. Each choice was based (more or less) on what is usually called “inspiration,” that is, on the flourishing and development of an idea with a definite intentionality.

The opening motif of Symphony No. 40 could indeed be produced by a computer program that generates thousands of combinations (without any need for artificial intelligence), but even so, would that be enough to develop 30 minutes of music? More importantly, what role does the desire to express a particular affect or emotion play?
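The brute-force approach is easy to sketch. The snippet below is purely illustrative, not a real composition tool: it ignores octaves and rhythm, enumerates every four-note sequence over the seven note names of the G natural minor scale, and checks that the first four pitches of the symphony's opening motif (E♭, D, D, E♭) are among them. The names `G_MINOR`, `OPENING`, and `candidates` are assumptions made for this example.

```python
from itertools import product

# Note names of the G natural minor scale (octaves and rhythm are ignored).
G_MINOR = ["G", "A", "Bb", "C", "D", "Eb", "F"]

# First four pitches of the opening motif of Symphony No. 40 (Eb D D Eb).
OPENING = ("Eb", "D", "D", "Eb")

# Every possible four-note sequence over the scale.
candidates = list(product(G_MINOR, repeat=len(OPENING)))

print(len(candidates))        # 7 ** 4 sequences
print(OPENING in candidates)  # the motif is one of them
```

The motif is in the list, but so are 2,400 other sequences, and nothing in the program says which one deserves thirty minutes of development.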

The choice of a minor key, a pressing rhythm, and melodic movements accompanied by a harmony that alternates between tension and temporary relaxation all seem to result from a deliberate decision, not from blind chance. In other words, Mozart’s “generative process” appears much more deterministic than one might imagine, especially when it involves breaking abstract rules to do something never heard before.

At this point, we can address the second question.

A 20th-century Schoenberg and a late Baroque Bach compose and play music in an abstract and surreal setting.

Bach’s (natural) intelligence and Schoenberg’s: when music surprises itself

Johann Sebastian Bach, a leading figure of the Baroque period, was a German composer and musician renowned for his exceptional ability to work with harmonic polyphony. Born into a family of musicians, Bach’s legacy includes a vast collection of compositions that showcase his mastery of complex counterpoint and intricate harmonies. His innovative approach to music theory and composition revolutionized the musical landscape of the time, earning him a reputation as one of the greatest composers ever.

Bach’s deep knowledge of polyphonic textures enabled him to create rich and intricate musical weavings that still fascinate audiences worldwide. His meticulous attention to detail is evident in the Brandenburg Concertos, the Well-Tempered Clavier, the two Passions, the Cantatas, and the Mass in B minor. Through his unparalleled ability to work with harmonic polyphony, Bach created a diverse repertoire that includes sacred choral music, solo instrumental works, and keyboard compositions, leaving an indelible mark on the history of music.

Now let’s jump ahead and seat the Kantor in the theater where Arnold Schoenberg’s Five Pieces for Orchestra, Op. 16, is about to be performed. What could we expect? Perhaps Bach’s manners (beyond question, given the evidence) would prevent an outburst of anger. Still, I am sure that in his heart, the Leipzig master would think either that it was a joke or that he was witnessing the work of a composer interned in an asylum.

Revolutionary music vs. artificial intelligence: a duel that cannot take place

This is precisely the point of view of a generative artificial intelligence model: anything never seen, never read, or never heard is, at best, an extremely improbable phenomenon. Schoenberg’s (brilliant, in my opinion) idea of emancipating successions of unresolved dissonances was a revolution against the status quo, something that had never been heard before. So, if such pieces are not used to train a model, how can the model produce them?

The answer is simply that no good model would produce such samples, and if it did, they would be symptomatic of a malfunction, a poor structure, or faulty training. Put simply: if artificial intelligence amazes us, it is very likely that, at the same time, it will significantly disappoint us (especially the engineers who designed it).
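This can be illustrated with a toy model. The sketch below rests on loudly stated assumptions: a note-to-note frequency table learned from two invented phrases, and a hypothetical helper named `probability`. Under this model, a step that never occurs in the training data has probability zero, so sampling can never produce it.

```python
from collections import Counter, defaultdict

# Two invented training phrases that only move between C, E, G and B.
TRAINING = [
    ["C", "E", "G", "E", "C"],
    ["G", "B", "G", "E", "C"],
]

# Count how often each note is followed by each other note.
counts = defaultdict(Counter)
for phrase in TRAINING:
    for current, following in zip(phrase, phrase[1:]):
        counts[current][following] += 1

def probability(current, following):
    """Estimated probability that `following` comes right after `current`."""
    total = sum(counts[current].values())
    return counts[current][following] / total if total else 0.0

print(probability("G", "E"))   # a step seen in training
print(probability("C", "F#"))  # a tritone leap never seen in training
```

Real models smooth their estimates, so an unseen event is usually extremely improbable rather than strictly impossible, which is exactly the situation described above.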

The history of music in an abstract image: every revolution passes through the vision of something that lies beyond the ordinary rules.

Conclusions

Generative artificial intelligence models can produce good music according to predetermined rules but cannot invent anything genuinely revolutionary. They can be used for various creative purposes (such as writing backing tracks or composing ambient or background music to be reworked at will).

However, genuine compositional art requires transcending abstract rules to create true novelty. Generative artificial intelligence models have specific limitations and possibilities, as their training is based on existing creations. Therefore, it is unlikely, if not impossible, for them to go beyond the rules they have learned and invent new ones.

The fear that artificial intelligence will supplant humans in tasks of pure creativity (such as music, poetry, painting, etc.) is unfounded and, for the time being, has no reason to exist. Man has always been driven toward breaking the rules, and this has enabled the creation of the world’s most important works. As long as cultural revolutions are the beating heart of the advancement of civilization, no “conservative” model can ever take the place of an inherently “progressive” man.

So, let us be surprised by the pure inventions of man and try to make the most of technological tools, which, for their part, deserve all our respect, as they, too, are products of human intellect.

The images featured in this article were generated with artificial intelligence. As you can tell, none of them surprised me with “abstract” choices I have never seen before.

