Computación y Sistemas

Online version ISSN 2007-9737; print version ISSN 1405-5546

Comp. y Sist. vol. 27 no. 2, Ciudad de México, Apr./Jun. 2023. Epub Sep 18, 2023

https://doi.org/10.13053/cys-27-2-4646 

Articles

Automatic Composition of Music Using a Genetic Algorithm, Emotional Musical Theory and Machine Learning

Adriana Lara1 

Giovanni Guzmán2  * 

Natan Vilchis1 

1 Instituto Politécnico Nacional, Escuela Superior de Física y Matemáticas, Mexico. alaral@ipn.mx, contacto@natanvilchis.org.

2 Instituto Politécnico Nacional, Centro de Investigación en Computación, Mexico.


Abstract:

This work proposes a mathematical model for computer-aided music composition as a multi-objective optimization problem. The aim is to create a framework that automatically generates a set of songs with two melodies by combining a genetic algorithm with machine learning. Musical patterns were studied [16, 6, 18, 2] in order to simplify them and apply them in the construction of the optimization model, and recent emotional music theory [11] was used to formulate the optimization problem. Three conflicting objective functions represent the desired characteristics of the melody to be created: (1) song happiness, (2) song minimalism, and (3) song genre. Two of these objectives are designed analytically, fulfilling well-studied features like those in [14, 25, 11]; the third was developed using a machine learning model, as in [5, 8, 27]. The software jSymbolic [15] is used to extract features in real time and to score songs with the machine learning model trained in the present work. The results can be listened to through test examples presented in video format.

Keywords: Music composition; multiobjective optimization; evolutionary music

1 Introduction

Musical composition using artificial intelligence is a great challenge because of the many different characteristics a song contains; in addition, the diverse properties of a song make its study difficult.

Some artificial intelligence tools used for this purpose are deep learning, generative music, quantum computing, and machine learning techniques, among others [21, 5, 8, 27]. In the present work, the problem of computer-assisted musical composition is modeled as a multi-objective optimization problem.

The goal is to generate songs with maximum happiness while simultaneously seeking minimalism in the ornaments and, as a third criterion, songs in two different genres.

For this, Mauro de María's emotional music theory [11], classical music theory [7], and the study of musical patterns [16, 6, 19, 18, 22, 10, 2, 17] were combined with related state-of-the-art works such as MetaCompose [25] and MorpheuS [14], which provide valuable elements for our model. The present work combines existing elements, such as Definition 9 (based on Equation 4 of [25]).

The first four terms of the g3 function of the optimization problem are based on [14], together with a combination and simplification of [16, 6, 19, 18, 22, 10, 2, 17], and with original contributions based on emotional music theory [11] present in the other elements of the optimization problem.

In addition, to evaluate the third objective, a model generated with machine learning, through the extraction of features from songs of different genres, is used.

2 Limitations

There are different limitations in the current work, which are described below:

  • – The songs generated by the present work have at most two notes playing at the same time, because a song has been defined as two melodies playing simultaneously. An example of the song representation is detailed in Section 4 (Basic Concepts).

  • – The parameter p (bar partitions) determines the number of notes within a bar. A larger value of p allows more musical figures to be written in each melody. For example, if p=2, each melody allows musical figures of value 1/2 and 2/2; if p=4, each melody allows musical figures of value 1/4, 2/4, 3/4 and 4/4, and so on.

  • – A larger value of b (number of bars in the song) or p (bar partitions) implies more time in the optimization process; that is, increasing the length of the song increases the time needed to find feasible solutions.

  • – The generated songs are based on the evocation of happy songs (objective g1) and minimalism (objective g2); other emotions and other musical aspects can be taken into account by adding further objectives to the optimization problem.

3 Related Work

Different related works made it possible to build the model shown in the optimization problem section; each of them, and the way in which their ideas were applied, is described as follows:

  • – The emotional music theory elaborated by Mauro de María [11] provides a valuable perspective on the different parts of a musical composition and how each of them influences the evocation of emotions. For the elaboration of the present work, a simplification of Mauro de María's theory was carried out to model happy and minimalist songs. His main ideas were applied mainly in Definitions 4, 23 and 14.

  • – Different related works [16, 6, 19, 18, 22, 10, 2, 17] successfully address the problem of finding musical patterns. In the elaboration of the present work we attempted to use some of these algorithms in the optimization problem; however, we decided to use a simplification of them, given that they are very expensive and the generation of results could take months. These simplifications appear in the first four terms of the g3 function of the optimization problem.

  • – MetaCompose [25] and MorpheuS [14] propose structured music generation. In the case of MorpheuS, a tension pattern must be provided to ensure that the result is adequate. In our work, this tension is reflected in the functions g1, g2 and g3, while the structure of the song is reflected in the constraints of the optimization problem.

4 Basic Concepts and Fundamental Elements of the Model

We have represented a song as a vector x ∈ Z^{b+2bp}, where the first b components denote the chord index for the corresponding bar, the following bp components represent the notes of the harmony, and the last bp components represent the notes of the melody. In this way, each melody x^{(1)} ∈ Z^{bp} has b bars and, in turn, each bar is divided into p partitions, with p ∈ {2, 4, 8, 16}; furthermore, each component satisfies 41 ≤ x_j ≤ 88, for j ∈ {b+1, …, b+2bp}, and is a MIDI note [1], where MIDI note 41 is taken as an elongation of the previous note, while MIDI note 42 is used as a rest. The next score shows the representation of a song corresponding to the vector x_0 ∈ Z^{4+2(4)(8)} given by:

x0=[1,11,11,1,48,48,52,52,62,64,55,42,65,65,69,69,76,77,65,42,53,53,57,57,67,69,60,42,60,60,64,64,76,88,84,42,42,41,41,71,41,83,79,88,42,41,41,81,41,84,83,84,72,77,41,71,77,41,41,41,83,84,41,83,88,41,41,41]T. (1)
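As a quick illustration of this layout, the following sketch splits a song vector into its three parts (the function and variable names here are ours, not from the paper's code):

```python
def split_song(x, b, p):
    """Split a song vector into (chord indices, harmony, melody).

    Layout assumed from the text: the first b entries are chord
    indices, the next b*p are harmony notes, and the last b*p are
    melody notes (MIDI values; 41 = elongation, 42 = rest).
    """
    assert len(x) == b + 2 * b * p
    w = x[:b]                # chord index per bar
    y = x[b:b + b * p]       # harmony notes
    z = x[b + b * p:]        # melody notes
    return w, y, z
```

For x_0 above, b = 4 and p = 8, so the first 4 entries are chord indices and the remaining two blocks of 32 entries are the harmony and the melody.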

That said, we consider a melody to be a vector of MIDI notes [1]. Definitions 15, 16, 17, 18, 19 and 20 were built from classical music ornament theory [23], while the remaining definitions were created from the emotional musical model [11] and adapted to the optimization problem of this work. The necessary definitions for our model are shown below.

Definition 1 (Melody of the j-th bar of a melody). Let x ∈ Z^{bp} be a melody, with b, p ∈ N. The melody of the j-th bar, x^{(j)}, corresponds to the subvector of x given by:

x^{(j)} := x(1 + p(j−1) : jp), j = 1, …, b. (2)

Definition 2 (Chord). The vector x = (x_1, x_2, …, x_q)^T ∈ Z^q is said to be a chord if each x_i is a numbered musical note, for i = 1, …, q, and furthermore x_i ≠ x_j for all i ≠ j.

Definition 3 (Melody with a good start). Let x ∈ Z^k be a melody. The tune x is said to have a good start if the function NS : Z^k → {0, 1} evaluated on x equals one, where NS is defined as follows:

NS(x) := 1, if there is no j ∈ {1, …, k} with x_j = 41; 1, if min{j = 1, …, k : x_j > 41} < min{ℓ = 1, …, k : x_ℓ = 41}; 0, otherwise.

Definition 4 (Total real notes in a melody). Let x ∈ Z^k be a melody. The total number of real notes in the melody is the result of TN(x), where the function TN : Z^k → Z is defined as follows:

TN(x) := number of elements x_j greater than 42, j = 1, …, k.

Definition 5 (Repetition melody of a melody). Let x ∈ Z^k be a melody with at least one x_j > 42. The repetition melody y ∈ Z^k of the melody x is the result of repeatMelody(x), where the function repeatMelody : Z^k → Z^k is described in Algorithm 1.

Algorithm 1 repeatMelody function to get the repetition melody of a melody. 

Definition 6 (Simultaneously pressed notes). Let x, w ∈ Z be two MIDI notes. The notes x, w are said to be simultaneously pressed notes if the result of PSN(x, w) is one, where the function PSN is defined as follows:

PSN(x, w) := 1, if (x > 42) ∧ (w > 42); 0, otherwise.
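Definitions 4 and 6 translate directly into code; a minimal sketch (helper names are ours):

```python
def TN(x):
    """Definition 4: number of real notes (MIDI value > 42);
    41 marks an elongation and 42 a rest, so neither counts."""
    return sum(1 for xj in x if xj > 42)

def PSN(x, w):
    """Definition 6: 1 if both MIDI notes are real notes
    pressed simultaneously, 0 otherwise."""
    return 1 if x > 42 and w > 42 else 0
```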

Definition 7 (Proper note for the melody). Let x be a MIDI note and let w ∈ Z^k be a melody with at least one w_j > 42.

Let ND(w) = {(w_i − 1) mod 12 : w_i > 42} ∪ {(w_i + 1) mod 12 : w_i > 42}. The note x is said to be a proper note for the melody w if the result of MF(x, w) equals one, where the function MF is defined as follows:

MF(x, w) := 1, if x mod 12 ∉ ND(w); 0, otherwise.

Using Definition 5, the following definition is built, which returns a vector where each component is −1, 0 or 1.

Definition 8 (SDR function). Let x ∈ Z^t be a melody with t ≥ 2 and at least one x_j > 42. Let D : Z^t → Z^{t−1} be the first-difference function and let S : Z^{t−1} → Z^{t−1} return the vector of signs of the corresponding elements of its argument. The function SDR : Z^t → Z^{t−1} is defined as follows:

SDR(x) := S(D(repeatMelody(x))).
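A sketch of Definition 8, assuming Algorithm 1 (repeatMelody, not reproduced here) replaces each elongation marker 41 with the previous sounding note so that first differences are well defined; function names are ours:

```python
def repeat_melody(x):
    """Sketch of Algorithm 1 (repeatMelody), under the assumption
    that each elongation marker (41) is replaced by the previous
    sounding note."""
    y, last = [], None
    for xj in x:
        if xj == 41 and last is not None:
            y.append(last)
        else:
            y.append(xj)
            if xj > 42:
                last = xj
    return y

def SDR(x):
    """Definition 8: signs of the first differences of the
    repetition melody (-1 descending, 0 repeated, 1 ascending)."""
    y = repeat_melody(x)
    diffs = [b - a for a, b in zip(y, y[1:])]
    return [(d > 0) - (d < 0) for d in diffs]
```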

Definition 9 (First note of the melody is the root note of the chord). Let x ∈ Z^k be a melody with at least one x_j > 42 and let w ∈ Z^q be a chord. The melody x is said to have as its first note the root note of the chord w if the result of FirstNoteFC(x, w) equals one, where the function FirstNoteFC is described below:

FirstNoteFC(x, w) := 1, if x_r mod 12 = w_1, with r = min{j = 1, …, k : x_j > 42}; 0, otherwise.

Definition 10 (MIDI note on chord). Let x be a MIDI note and let w ∈ Z^q be a chord. The MIDI note x is a note of the chord w if the result of NInChord(x, w) equals one, where the function NInChord is defined as follows:

NInChord(x, w) := 1, if x mod 12 = w_j for some j = 1, …, q; 0, otherwise.

Using Definition 10, the following definition is constructed.

Definition 11 (At least fifty percent of MIDI notes are chord notes). Let x ∈ Z^k be a melody and let w ∈ Z^q be a chord. At least fifty percent of the notes of the melody x are said to be notes of the chord w if the result of FPN(x, w) equals one, where the function FPN is defined as follows:

FPN(x, w) := 1, if ( Σ_{x_j > 42} NInChord(x_j, w) ) / TN(x) ≥ 0.5; 0, otherwise.
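Definitions 10 and 11 can be sketched as follows (function names are ours; the chord is assumed to hold pitch classes 0–11, and the melody to contain at least one real note):

```python
def n_in_chord(x, w):
    """Definition 10: 1 if the MIDI note x, reduced mod 12,
    matches some pitch class of the chord w."""
    return 1 if x % 12 in w else 0

def FPN(x, w):
    """Definition 11: 1 if at least half of the real notes of
    the melody x belong to the chord w."""
    real = [xj for xj in x if xj > 42]  # real notes only
    if not real:
        return 0
    return 1 if sum(n_in_chord(xj, w) for xj in real) / len(real) >= 0.5 else 0
```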

Using Definitions 10 and 4, the following definition is constructed, which allows knowing whether the MIDI notes of a melody are notes of a given chord.

Definition 12 (Most MIDI notes in the melody are chord notes). Let x ∈ Z^k be a melody and let w ∈ Z^q be a chord. The MIDI notes of the melody x are said to be mostly chord notes if the result of MPN(x, w) equals one, where:

MPN(x, w) := 1, if Σ_{x_i > 42} NInChord(x_i, w) ≥ TN(x) − 1; 0, otherwise.

Definition 13 (Beats of a melody). Let x ∈ Z^k. The beats of the melody x are the result of BT(x), where the function BT : Z^k → Z^k is defined below:

BT(x) := y, y ∈ Z^k,

where y_j, for j = 1, …, k, is subject to:

y_j = 1, if x_j > 42; 0, if x_j = 42; −1, if x_j = 41.
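Definition 13 as a one-line sketch (the function name is ours):

```python
def BT(x):
    """Definition 13: beat vector of a melody; 1 for a real note,
    0 for a rest (42), -1 for an elongation (41)."""
    return [1 if xj > 42 else (0 if xj == 42 else -1) for xj in x]
```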

Definition 14 (Total notes of the melody in the C major scale). Let x ∈ Z^k be a melody. The total number of notes of the melody x that are in the C major scale S = {0, 2, 4, 5, 7, 9, 11} is the result of TNS(x), where the function TNS is defined as follows:

TNS(x) := Σ_{x_j > 42, x_j mod 12 ∈ S} 1. (3)

Definition 15 (Neighbor ornament). Let x ∈ Z^3 be a melody and let u, w ∈ Z^q be chords. x is said to be a neighbor (embroidery) ornament for the chords u, w if the result of isNeighbor(x, u, w) equals one, where the function isNeighbor is defined as follows:

isNeighbor(x, u, w) := 1, if (x_1 > 42) ∧ (x_2 > 42) ∧ (x_3 > 42) ∧ (1 ≤ |x_1 − x_2| ≤ 2) ∧ (x_1 = x_3) ∧ (x_1 mod 12 = u_j for some j = 1, …, q) ∧ (x_3 mod 12 = w_ℓ for some ℓ = 1, …, q); 0, otherwise.

Definition 16 (Escape ornament). Let x ∈ Z^3 be a melody and let u, w ∈ Z^q be two chords. x is said to be an escape ornament for the chords u, w if the result of isEscape(x, u, w) equals one, where the function isEscape is defined as follows:

isEscape(x, u, w) := 1, if (x_1 > 42) ∧ (x_2 > 42) ∧ (x_3 > 42) ∧ (1 ≤ |x_1 − x_2| ≤ 2) ∧ (|x_2 − x_3| > 2) ∧ ((x_2 − x_1)(x_3 − x_2) < 0) ∧ (x_1 mod 12 = u_j for some j = 1, …, q) ∧ (x_3 mod 12 = w_ℓ for some ℓ = 1, …, q); 0, otherwise.

Definition 17 (Cambiata ornament). Let x ∈ Z^3 be a melody and let u, w ∈ Z^q be two chords. x is said to be a cambiata ornament for the chords u, w if the result of isCambiata(x, u, w) equals one, where the function isCambiata is defined as follows:

isCambiata(x, u, w) := 1, if (x_1 > 42) ∧ (x_2 > 42) ∧ (x_3 > 42) ∧ (1 ≤ |x_2 − x_3| ≤ 2) ∧ (|x_1 − x_2| > 2) ∧ ((x_2 − x_1)(x_3 − x_2) < 0) ∧ (x_1 mod 12 = u_j for some j = 1, …, q) ∧ (x_3 mod 12 = w_ℓ for some ℓ = 1, …, q); 0, otherwise.

Definition 18 (Passing tone ornament). Let x ∈ Z^3 be a melody and let u, w ∈ Z^q be two chords. x is said to be a passing ornament for the chords u, w if the result of isPassingTone(x, u, w) equals one, where the function isPassingTone is defined as follows:

isPassingTone(x, u, w) := 1, if (x_1 > 42) ∧ (x_2 > 42) ∧ (x_3 > 42) ∧ (1 ≤ |x_2 − x_3| ≤ 2) ∧ (1 ≤ |x_1 − x_2| ≤ 2) ∧ ((x_2 − x_1)(x_3 − x_2) > 0) ∧ (x_1 mod 12 = u_j for some j = 1, …, q) ∧ (x_3 mod 12 = w_ℓ for some ℓ = 1, …, q); 0, otherwise.

Definition 19 (Appoggiatura ornament). Let x ∈ Z^2 be a melody and let w ∈ Z^q be a chord. The melody x is said to be an appoggiatura ornament for the chord w if the result of isAppoggiatura(x, w) equals one, where the function isAppoggiatura is defined as follows:

isAppoggiatura(x, w) := 1, if (x_1 > 42) ∧ (x_2 > 42) ∧ (1 ≤ |x_1 − x_2| ≤ 2) ∧ (x_2 mod 12 = w_j for some j = 1, …, q); 0, otherwise.
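Two of the ornament predicates, Definitions 15 and 19, sketched in code (names are ours; the chord arguments are assumed to hold pitch classes 0–11):

```python
def is_neighbor(x, u, w):
    """Definition 15 (sketch): three real notes where the middle
    note steps away by one or two semitones and returns, with the
    outer notes belonging to chords u and w respectively."""
    x1, x2, x3 = x
    return 1 if (x1 > 42 and x2 > 42 and x3 > 42
                 and 1 <= abs(x1 - x2) <= 2
                 and x1 == x3
                 and x1 % 12 in u
                 and x3 % 12 in w) else 0

def is_appoggiatura(x, w):
    """Definition 19 (sketch): two real notes a step apart whose
    second note lands on a pitch class of the chord w."""
    x1, x2 = x
    return 1 if (x1 > 42 and x2 > 42
                 and 1 <= abs(x1 - x2) <= 2
                 and x2 % 12 in w) else 0
```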

Using Definition 19, Algorithm 2 is defined, which is used to count the total appoggiatura ornaments of a given melody.

Algorithm 2 countAppoggiatura function to obtain the total number of appoggiatura ornaments given a list of L chords. 

Definition 20 (Total anticipation ornaments). Let x, y ∈ Z^{bp} be two melodies. The total number of anticipation ornaments for the melody x and the harmony y is the result of the countAnticipation function, defined as follows:

countAnticipation(x, y) := Σ_{j=1}^{b−1} 1, such that (PSN(x^{(j)}_p, y^{(j+1)}_1) = 1) ∧ (x^{(j)}_p = y^{(j+1)}_1).

Definition 21 (Melody ornaments). Let x, y ∈ Z^{bp} be two melodies, let w ∈ Z^b be a vector containing the chord indices for each bar, and let L be a chord list. The total number of ornaments for the melody x is the result of MOrnaments(w, x, y, L), which uses Definitions 15, 16, 17, 18, 20 and Algorithms 2, 3. The function MOrnaments is defined as follows:

MOrnaments(w, x, y, L) := (1/6)(countThree(w, x, isNeighbor, L) + countThree(w, x, isEscape, L) + countThree(w, x, isCambiata, L) + countThree(w, x, isPassingTone, L) + countAppoggiatura(w, x, L) + countAnticipation(x, y)).

Algorithm 3 countThree function to get the total of three note embellishments given a function J and a list of chords L

Definition 22 (Elongation melody). Let x ∈ Z^k be a melody. x is said to be an elongation melody if elongation(x) equals one, where the elongation function is defined as follows:

elongation(x) := 1, if x_1 > 42 ∧ (x_j = 41, for j = 2, …, k); 0, otherwise.
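Definition 22 sketched in code (the function name follows the definition):

```python
def elongation(x):
    """Definition 22: 1 if the melody is a single real note
    followed only by elongation markers (41), 0 otherwise."""
    return 1 if x[0] > 42 and all(xj == 41 for xj in x[1:]) else 0
```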

Definition 23 (Happiness per bar of the song). Let:

L = ([0,4,7]^T, [2,5,9]^T, [4,7,11]^T, [5,9,0]^T, [7,11,2]^T, [9,0,4]^T, [11,2,5]^T, [0,4,7,10]^T, [2,5,9,0]^T, [4,7,11,2]^T, [5,9,0,3]^T, [7,11,2,5]^T, [9,0,4,7]^T, [11,2,5,8]^T)

be a list with major and seventh chords. Let:

M = [1, 2, 3, 2, 3, 1, 0.5, 1, 2, 3, 2, 3, 1, 0.5]^T

be a vector with the happiness level of each chord in L. Let w ∈ Z^b be a vector containing the index of the corresponding chord for bar j, for j = 1, …, b. The happiness per bar of the song is the result of Happy(w), where the function Happy is defined as follows:

Happy(w) := (1/b) Σ_{j=1}^{b} M_{w_j}.
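Definition 23 sketched in code, assuming the chord indices in w are 1-based, which matches the constraint w_1 = w_b = 1 in the optimization problem:

```python
# Happiness level per chord, from the vector M in Definition 23
# (same order as the chord list L).
M = [1, 2, 3, 2, 3, 1, 0.5, 1, 2, 3, 2, 3, 1, 0.5]

def happy(w):
    """Definition 23: mean happiness of the chords chosen per bar;
    w holds 1-based indices into the chord list L."""
    return sum(M[wj - 1] for wj in w) / len(w)
```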

Definition 24 (Harmony ornaments). Let y ∈ Z^{bp} be a melody, let w ∈ Z^b be a vector containing the chord indices for each bar, and let L be a list of chords. The total number of ornaments for the harmony y is the result of HOrnaments(w, y, L), where the function HOrnaments is defined as follows:

HOrnaments(w, y, L) := (1/5)(countThree(w, y, isNeighbor, L) + countThree(w, y, isEscape, L) + countThree(w, y, isCambiata, L) + countThree(w, y, isPassingTone, L) + countAppoggiatura(w, y, L)).

5 Machine Learning

The third component of the F function was built from a machine learning model. To build the model, 262 songs divided into four genres (classical, film, pop and rock) were used. Each of these songs has two melodies (one for the treble and one for the bass).

Subsequently, 5 features (repeated notes C1, most common pitch class prevalence C2, melodic interval histogram C3, pitch class variety C4 and pitch class distribution C5) were extracted from each of these songs using music21 [9] with the JSymbolic [15] submodule.

Later, K-Means was used to obtain two clusters from the extracted features; with this classification, the songs were divided into two genres: a classical genre, in terms of repetition of notes, and a contemporary genre, in terms of freedom of movement.

Then, an AdaBoost regressor h1(C1, C2, C3, C4, C5) with one estimator was trained to model these two clusters, where 0 indicates that a song corresponds to the first cluster (classical) and 1 to the second cluster (freedom of movement).

Now, since the music21 function to obtain the five features requires a MIDI file or a stream object, a function h2(y, z) = [C1, C2, C3, C4, C5]^T, with y, z ∈ Z^{bp}, was used to convert a song from our representation to a stream object. Since the AdaBoost regressor can yield values greater than one or less than zero, the function g4 was defined as follows:

g4(y, z) = 0.5, if h1(h2(y, z)) < 0 ∨ h1(h2(y, z)) > 1; h1(h2(y, z)), otherwise.
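The safeguard in g4 amounts to replacing out-of-range regressor outputs with a neutral value; a minimal sketch, where `score` stands for h1(h2(y, z)) computed elsewhere (the function name is ours):

```python
def g4_safeguard(score):
    """Replace regressor predictions outside [0, 1] with a neutral
    0.5, as in the definition of g4; in-range scores pass through."""
    return score if 0.0 <= score <= 1.0 else 0.5
```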

6 Optimization Problem

The generation of the desired melodies has been modeled using the following problem:

Minimize F(x) = [g1(x), g2(x), g3(x)]^T, with x ∈ Z^{b+2bp}.

For convenience, the vector x is presented as the concatenation of three vectors w ∈ Z^b and y, z ∈ Z^{bp}, as x = (w, y, z).

For convenience, in some parts of the optimization problem the notation described in Definition 1 is used. To model g1, Definitions 4, 14, 21, 23 and 24 were used.

To model g2, Definitions 4, 21, 22 and 24 were used. To model g3, Definitions 8 and 13 and the function g4 from Section 5 were used. The functions g1, g2 and g3 are minimized simultaneously and are defined as:

g1(w, y, z) = 3 − (TNS(y) + TNS(z)) / (TN(y) + TN(z)) − MOrnaments(z, y) − Happy(w) − HOrnaments(y).

g2(y, z) = 2 + MOrnaments(z, y) + HOrnaments(y) + (TN(y) + TN(z)) / (2bp) − Σ_{j=1}^{bp − p/2 + 1} elongation(z(j : j + p/2 − 1)) − Σ_{j=1}^{bp − p/2 + 1} elongation(y(j : j + p/2 − 1)).

g3(w, y, z) = Σ_{j=2}^{b} ‖BT(y^{(j)}) − BT(y^{(1)})‖ + Σ_{j=2, j even}^{b} ‖BT(z^{(j)}) − BT(z^{(j−1)})‖ + Σ_{j=2}^{b} ‖SDR(y^{(j)}) − SDR(y^{(1)})‖ + Σ_{j=2, j even}^{b} ‖SDR(z^{(j)}) − SDR(z^{(j−1)})‖ + g4(y, z).

In this way, the vector w has as components the indices of the chords of each bar, the vector y has the notes of the harmony, and the vector z has the notes of the melody of the song. Using Definitions 2, 3, 4, 7, 9, 11, 12 and 6, the optimization problem is subject to the following constraints:

b ∈ N, p ∈ {2, 4, 8, 16}.

y_j > 42 for some j ∈ {1, …, bp}; z_ℓ > 42 for some ℓ ∈ {1, …, bp}.

L = (L_1, …, L_r), where L_j is a chord, for j = 1, …, r.

w_1 = 1, w_b = 1.

1 ≤ w_j ≤ r, for j = 2, …, b−1.

41 ≤ y_j ≤ 88, for j = 1, …, bp.

41 ≤ z_j ≤ 88, for j = 1, …, bp.

y_j < z_j, for all y_j, z_j such that PSN(y_j, z_j) = 1.

NS(y) + NS(z) − 2 = 0.

Σ_{j=0}^{b−1} Σ_{k=1}^{p} MF(y_{jp+k}, z^{(j+1)}) − TN(z) = 0.

Σ_{j=1}^{b} FirstNoteFC(y^{(j)}, L_{w_j}) − b = 0.

Σ_{j=1}^{b} FPN(z^{(j)}, L_{w_j}) − b = 0.

Σ_{j=1}^{b} MPN(y^{(j)}, L_{w_j}) − b = 0.

7 Experimental Results

The NSGA-II multi-objective evolutionary algorithm [12] was used to solve the constrained three-objective optimization problem described in the previous section.

For the parameters of the NSGA-II algorithm, a population of 100 individuals was used, with a 2b-point crossover [26] with crossover probability pc = 0.75, and a mutation based on the simulated binary crossover distribution [3, 13] with mutation probability pm = 0.05 and η = 1.
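A generic n-point crossover of the kind referenced above (the paper uses 2b cut points) can be sketched as follows; this is an illustrative implementation under our own naming, not the authors' code:

```python
import random

def multi_point_crossover(a, b_parent, n_points, rng=random):
    """n-point crossover: cut both parents at the same n random
    positions and alternate the segments between the two children."""
    assert len(a) == len(b_parent)
    cuts = sorted(rng.sample(range(1, len(a)), n_points))
    child1, child2 = [], []
    take_a = True
    prev = 0
    for cut in cuts + [len(a)]:
        src1, src2 = (a, b_parent) if take_a else (b_parent, a)
        child1.extend(src1[prev:cut])
        child2.extend(src2[prev:cut])
        take_a = not take_a
        prev = cut
    return child1, child2
```

Each gene of either child comes from exactly one parent at the same position, so the song layout (chord indices, harmony, melody) is preserved.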

For the implementation, C++, Python 3, pymoo [4], scikit-learn [24] and music21 [9] were used. For all the results, the list of chords L presented in Definition 23 was used, where L has the major and seventh chords of the C major scale.

Different tests were carried out varying the parameters b and p. The resulting songs are available online, where some of the solutions found on the Pareto front obtained by the algorithm for each execution are shown; there it is possible to appreciate the difference between greater happiness, greater minimalism, and the genre patterns described by the functions g1, g2 and g3 of the optimization problem.

The same page also includes a video with the Pareto fronts obtained in each execution after the optimization process, rotating the graph for better viewing. It should be noted that the songs obtained on the Pareto fronts of each execution are very interesting and creative, so approaching the problem of musical composition using musical patterns and musical dimensions seems to be a promising way to expand the results and research.

It should be noted that converting each song in the population to a stream object in real time, in order to extract the features using music21 with jSymbolic, is relatively time consuming, taking between 10 and 40 seconds to complete each generation and its corresponding evaluations.

For example, for execution 1, the parameters b = p = 8 and 6,600 generations were used. Very interesting results can be observed; for example, the extreme corresponding to the minimization of the third component of the function F shows a clear harmonic and rhythmic pattern.

The extreme corresponding to the maximization of happiness shows a melodic interplay between the fourth and fifth bars. Finally, in the extreme corresponding to the maximization of minimalism, it can be seen that the previously defined ornaments are largely avoided.

Figure 1 shows the Pareto front obtained for execution 1 after the optimization process with the mentioned parameters and generations.

Fig. 1 Execution 1: Pareto front obtained, different views 

It can be seen that there are high values for the g3 function, which indicates that the genre can be freer with respect to rhythmic and harmonic patterns.

In addition, Figure 1 highlights the conflict between the functions g1, g2 and g3, showing that there is a relationship between happiness, minimalism and the choice of musical genre. For execution 2, the parameters b = p = 8 were used, with 11,200 generations.

Very promising results are shown; for example, with respect to function g3, a clear difference can be verified between the genres of two songs, where one has a pattern in each bar and the other is freer. In Figure 2 it can be seen that there are four songs whose value in the objective function F is minimal for the third function; in addition, these songs have a value between 0.4 and 0.6 for the function g1; that is, their happiness level is low, since the higher the g1 value, the lower the happiness level of the song.

Fig. 2 Execution 2: Pareto front obtained, different views 

For execution 3, the parameters b = 8 and p = 4 with 21,700 generations were used. As can be seen in the videos, the songs differ in genre: one is calmer in the harmonic part and the other has much more movement.

The difference in ornaments between the extremes of happiness and minimalism is also remarkable.

In Figure 3, a less dispersed Pareto front can be seen compared to the previous executions. In addition, the compromise between the functions g1, g2 and g3 can be observed.

Fig. 3 Execution 3: Pareto front obtained, different views 

Figure 4 shows the Parallel Coordinate Plots of each execution, where each line represents a solution and its respective evaluation in the functions g1, g2 and g3.

Fig. 4 Parallel Coordinate Plots of the three executions: a) execution 1, b) execution 2, c) execution 3 

It can be seen that between the functions g1 and g2 there is a conflict: if the value of g1 increases, then the value of g2 decreases and vice versa; in other words, if the level of happiness increases then the level of minimalism decreases and vice versa.

An interesting behavior can be observed between a) and b) of Figure 4, where, despite having the same parameters (b = p = 8), there are different line densities between g2 and g3.

This suggests that songs of different musical genres can have the same levels of happiness and minimalism.

Finally, in c) of Figure 4, there is a remarkable inverse relationship between happiness and minimalism. Moreover, the lines between the functions g2 and g3 have a lower density compared to a) and b).

This suggests that for songs with similar levels of happiness and minimalism, different variations of musical genres are possible.

8 Conclusions

In this work, the automatic musical composition problem was modeled as a multi-objective optimization problem. The aim was to create melodies by establishing a set of compromise solutions (a Pareto front) regarding the level of happiness as the first objective function, simplicity of ornaments as the second, and musical genre as the third. For the creation of the model, emotional music theory and related works were used; promising results were obtained by finding auditorily diverse elements on the Pareto front in each execution.

The Pareto fronts shown in Figures 1, 2 and 3 mainly show that there is a relationship between the functions g1, g2 and g3. In addition, the larger the values of b and p (as in Figures 1 and 2), the greater the dispersion of the points, since the search space expands; in contrast, in Figure 3 the dispersion of the points is smaller and appears to form a path. The dispersion of the points on the Pareto fronts is a good indicator that the functions g1, g2 and g3 are related and also explore different musical aspects.

Figure 4 shows that there is a conflict between the functions g1, g2 and g3; in other words, to compose a song, it must be considered that increasing one musical aspect (happiness, minimalism or musical genre) will affect the values of the other aspects.

We sought to simplify the most important aspects of music, trying to maintain a balance between the creative capacity of the results and the complexity of the problem; for example, the musical patterns were captured by the first four terms of the g3 function.

The use of machine learning shows an improvement compared to using only the first two objectives of the F function; however, feature extraction at run time is costly for the optimization algorithm, since each song has to be converted to a stream object or MIDI file to use jSymbolic.

9 Future Work

The results obtained in the present work are interesting and promising. Based on the lessons learned, three directions are proposed as future work to improve the results:

  • – Change the song representation to one that allows multiple melodies, for example the multidimensional ordered sets used in [17]. This proposal is intended to generalize a song, allowing more of the space of polyphonic songs to be explored and improving the results. However, changing the representation may require solving other problems, such as identifying and separating the notes corresponding to each melody for individual analysis (melody structure, melody patterns, ornaments, emotional level, among others).

  • – Use machine learning to extract rhythmic patterns, melodic patterns, and patterns that evoke some emotion. This proposal is intended to improve the structure of the songs. To make it possible, a pattern detection algorithm that is fast in terms of complexity is needed; it is therefore recommended to study the existing algorithms [16, 6, 18, 2] in depth in order to reduce their complexity.

  • – Study emotional music theory [11] in depth to model each dimension mathematically, along with its interaction in the evocation of emotions.

References

1. Association of Musical Electronics Industry AMEI and MIDI Manufacturers Association MMA (2020). Universal MIDI packet (UMP) format and MIDI 2.0 protocol. [ Links ]

2. Bertin-Mahieux, T. (2013). Large-scale pattern discovery in music. Ph.D. thesis, Columbia University. [ Links ]

3. Blank, J. (2020). pymoo - mutation. [ Links ]

4. Blank, J. (2020). pymoo: Multi-objective optimization in python. [ Links ]

5. Cataltepe, Z., Yaslan, Y., Sonmez, A. (2007). Music genre classification using MIDI and audio features. European Association for Signal Processing, Journal on Advances in Signal Processing, Vol. 2007, No. 1. DOI: 10.1155/2007/36409. [ Links ]

6. Collins, T. E. (2011). Improved methods for pattern discovery in music, with applications in automated stylistic composition. Ph.D. thesis, Faculty of Mathematics, Computing and Technology, The Open University. [ Links ]

7. Cooke, D. (1959). The language of music. Oxford University Press. [ Links ]

8. Cuenca-Rodríguez, M. E., McKay, C. (2021). Exploring musical style in the anonymous and doubtfully attributed mass movements of the Coimbra manuscripts: A statistical and machine learning approach. Journal of New Music Research, Vol. 50, No. 3, pp. 199–219. DOI: 10.1080/09298215.2020.1870505. [ Links ]

9. Cuthbert, M. S. (2022). music21: A toolkit for computer-aided musicology. [ Links ]

10. Dannenberg, R. B., Hu, N. (2003). Pattern discovery techniques for music audio. Conference Proceedings: Third International Conference on Music Information Retrieval, Taylor and Francis, pp. 63–70. [ Links ]

11. de María, M. (2021). 18 dimensiones emocionales, decodificando las emociones de la música desde 18 dimensiones técnicas. [ Links ]

12. Deb, K., Pratap, A., Agarwal, S., Meyarivan, T. (2002). A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, Vol. 6, No. 2, pp. 182–197. DOI: 10.1109/4235.996017. [ Links ]

13. Deb, K., Sindhya, K., Okabe, T. (2007). Self-adaptive simulated binary crossover for real-parameter optimization. Proceedings of the 9th Annual Conference on Genetic and Evolutionary Computation, Association for Computing Machinery, pp. 1187–1194. DOI: 10.1145/1276958.1277190. [ Links ]

14. Herremans, D., Chew, E. (2017). Morpheus: Generating structured music with constrained patterns and tension. IEEE Transactions on Affective Computing, Vol. 10, No. 4, pp. 510–523. DOI: 10.1109/taffc.2017.2737984. [ Links ]

15. McKay, C., Fujinaga, I. (2006). jSymbolic: A feature extractor for MIDI files. Proceedings of the International Conference on Music Information Retrieval. [ Links ]

16. Meredith, D. (2013). COSIATEC and SIATECCompress: Pattern discovery by geometric compression. International Society for Music Information Retrieval Conference, No. 14. [ Links ]

17. Meredith, D. (2014). Compression-based geometric pattern discovery in music. 4th International Workshop on Cognitive Information Processing (CIP), pp. 1–6. DOI: 10.1109/CIP.2014.6844503. [ Links ]

18. Meredith, D. (2019). RECURSIA-RRT: Recursive translatable point-set pattern discovery with removal of redundant translators. Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 485–493. [ Links ]

19. Meredith, D., Lemström, K., Wiggins, G. A. (2002). Algorithms for discovering repeated patterns in multidimensional representations of polyphonic music. Journal of New Music Research, Vol. 31, No. 4, pp. 321–345. DOI: 10.1076/jnmr.31.4.321.14162. [ Links ]

20. Miettinen, K. (2012). Nonlinear multiobjective optimization. Springer Science and Business Media, Vol. 12. DOI: 10.1007/978-1-4615-5563-6. [ Links ]

21. Miranda-Reck, E. (2021). Handbook of artificial intelligence for music: Foundations, advanced approaches, and developments for creativity. Springer Nature. DOI: 10.1007/978-3-030-72116-9. [ Links ]

22. Ren, I., Volk, A., Swierstra, W., Veltkamp, R. C. (2020). A computational evaluation of musical pattern discovery algorithms. DOI: 10.48550/ARXIV.2010.12325. [ Links ]

23. Rodríguez Alvira, J. (2022). Funciones armónicas: Notas de adorno. http://www.teoria.com/es/aprendizaje/funciones/adorno/index.php. [ Links ]

24. Scikit-learn (2022). Machine learning in Python. scikit-learn 1.1.1 documentation. https://scikit-learn.org/stable/. [ Links ]

25. Scirea, M., Togelius, J., Eklund, P., Risi, S. (2017). Affective evolutionary music composition with MetaCompose. Genetic Programming and Evolvable Machines, Vol. 18, No. 4, pp. 433–465. DOI: 10.1007/s10710-017-9307-y. [ Links ]

26. Umbarkar, A. J., Sheth, P. D. (2015). Crossover operators in genetic algorithms: A review. ICTACT Journal on Soft Computing, Vol. 6, No. 1, pp. 1083–1092. DOI: 10.21917/ijsc.2015.0150. [ Links ]

27. Vatolkin, I., McKay, C. (2022). Multi-objective investigation of six feature source types for multi-modal music classification. Transactions of the International Society for Music Information Retrieval, Ubiquity Press, Vol. 5, No. 1, pp. 1–19. DOI: 10.5334/tismir.67. [ Links ]

The Pareto front [20] is the set that gives a solution to a multi-objective problem. For reasons of space, these definitions have been left out of this text.

Received: June 14, 2022; Accepted: September 18, 2022

* Corresponding author: Giovanni Guzmán, e-mail: jguzmanl@cic.ipn.mx

Creative Commons License This is an open-access article distributed under the terms of the Creative Commons Attribution License