Using corpus studies to find the origins of the madrigal

A recurring topic in musicology is the origin of the madrigal. Did it come from the frottola, the motet and chanson, or other Italian traditions? MS Florence, BNC, 164-167 (c. 1520) has four sections, each devoted to a different genre: madrigals, other Italian-texted genres, chansons, and motets. These sections provide evidence of genre classification from the period. We encoded the 82 pieces in the manuscript and used jSymbolic to extract 801 features from each file. We then used Weka to train classifiers to identify the pieces in the different sections. This allowed us to test the claims of earlier scholars as to similarity or difference between the madrigals and the other genres. The classifiers could distinguish the other Italian-texted genres from the madrigals only 72% of the time, compared to 100% of the time for the motets and chansons, suggesting that the madrigals are more similar to other Italian-texted pieces than to the other genres. Features based on rhythm were particularly effective in separating the genres, especially in discriminating madrigals from motets.


Introduction
Einstein's view that "the genesis of the madrigal…is the transformation of the frottola from an accompanied song…into a motet-like polyphonic construction with four parts of equal importance" (1949, p. 21; Rubsamen, 1964, p. 35) has been rejected. Iain Fenlon and James Haar's study of the sources (1988) suggests "less an evolution from one genre to another than an existence of two distinct traditions" (Carter, 1992, p. 87): the frottola is associated with North Italian courts and print sources, while the first madrigals are found in Florentine manuscript sources (Fenlon and Haar, 1988, pp. 14-17). Haar proposed that the new style derived from the motet and chanson (1986, 53, pp. 64-66), while Fenlon, Haar, and Carter emphasized the importance of the chanson (Fenlon and Haar, 1988, p. 7; Carter, 1992, p. 89). Cummings (2004, 12, pp. 53-62) stresses Florentine traditions of carnival song and improvised solo song, as well as the role of the villotta (a popular genre from Northern Italy often found in Florentine sources).
In any case, if the madrigal did not derive from the frottola, what were its stylistic roots? What are its distinguishing features? Unlike the chanson and motet, genres that emerged in the Middle Ages, the madrigal appeared quite suddenly. Zoey Cochran and Cumming connect the emergence of the madrigal with the claim that modern Florentine was the best candidate for a standardized literary Italian, as part of the debate on the "questione della lingua" (Cumming & Cochran, 2018). We suggest that members of the Orti oricellari group (Florentine intellectuals who met at the Rucellai Gardens; Cummings, 2004) commissioned local composers to create a new genre, the madrigal, that set poems in modern Florentine Italian by Petrarch and by local poets in a high style quite different from that of the northern Italian frottola. The earliest surviving madrigals are by Bernardo Pisano and Sebastiano Festa. If they invented a new genre, did they draw some of the musical features from contemporary genres? If so, which genres, and which features?
To investigate this question we chose to focus on the contents of a key manuscript source of the earliest madrigals, copied c. 1520, Florence, Biblioteca Nazionale Centrale, MSS Magl. XIX. 164-167 (Florence 164), identified by Cummings (2004, p. 62) as "a kind of musical manifesto of the 'program' of the Rucellai group." There are no composer attributions in the manuscript, but most of the pieces in the manuscript have concordant sources with attributions, and can be connected to known composers. The manuscript serves as a snapshot of Florentine musical culture of the period: it has four sections that correspond to the gathering structure of the manuscript (Cummings, 2006, pp. 6-7), and each one is devoted to a different musical genre, or group of associated genres (Table 1). Section 2, while varied, is dominated by the villotta and closely related genres such as the zibaldone, the protovillotta, and the canzone di Maggio.

Method
Comparing multiple pieces in one genre to multiple pieces in another is a complex task, and difficult to do by hand. We therefore chose to utilize an automated approach involving feature extraction, machine learning and information gain analysis.
The first step was to digitize Florence 164 using a consistent editorial and encoding workflow; as noted by Cumming et al. (2018), inconsistent digitization practices can produce biased results when employing automated analysis techniques. The music was manually transcribed with Sibelius, using original note values, and including only accidentals found in the MS. A MIDI file was exported for each of the 82 pieces.
Next, we used the open source jSymbolic 2.2 (McKay et al., 2018) software to extract features from each of these MIDI files. The term "features" has different meanings in different disciplines; here, we define a feature as a numerical measurement of a single, precisely defined musical characteristic that can be extracted from a digital score. In this case, features were extracted globally for each piece. jSymbolic 2.2 can extract 1497 feature values measuring a diverse range of characteristics, including: pitch statistics, melody / horizontal intervals, chords / vertical intervals, texture, rhythm, instrumentation, and dynamics. Only 801 of these features were used in this study, however, as features associated with instrumentation, dynamics, tempo, etc. are not relevant to this repertoire.
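To make the notion of a globally extracted feature concrete, the following minimal sketch computes two toy rhythmic measurements from a list of note durations. The input format, the function name, and both feature names are assumptions made for illustration; they are not jSymbolic's feature definitions, and the sample values are not drawn from Florence 164.

```python
from collections import Counter


def rhythmic_features(durations):
    """Return two toy global features for one piece.

    `durations` holds note lengths in quarter-note units, so 2.0 stands
    for a minim (half note). This representation is an assumption of the
    sketch, not the jSymbolic input format.
    """
    if not durations:
        raise ValueError("a piece must contain at least one note")
    counts = Counter(durations)
    total = len(durations)
    return {
        # Share of all notes that are minims.
        "minim_fraction": counts[2.0] / total,
        # Number of different note values that occur at least once.
        "distinct_duration_count": len(counts),
    }


# Hypothetical toy input, invented for this example.
piece = [2.0, 2.0, 2.0, 1.0, 1.0, 4.0, 2.0, 2.0]
features = rhythmic_features(piece)
print(features)
```

Each piece is thereby reduced to one number per feature, which is the form of input that the classification and information gain steps operate on.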
Once the features were extracted, they were used as the input to machine learning and statistical analysis processing. Supervised learning involves building a classification model that can map novel inputs into classes of interest by first training on dedicated training data. We trained such models on all four sections of Florence 164, in order to see how effectively the features could differentiate the genres. As discussed in more detail below, we used classification accuracy as an imperfect but useful proxy for similarity; the more difficult it is to distinguish between pieces belonging to two given genres, the more similar those genres may be said to be, at least with respect to the particular features considered. We used the broad jSymbolic feature catalogue specifically because we wanted to be able to consider as diverse a range of characteristics as possible.
We used a ten-fold cross-validation methodology to carry out classification experiments. This meant segmenting the data into ten pairs of training / testing folds, such that each piece served as a training instance nine times and as a test instance once. This permits one to evaluate the effectiveness of classifiers in a way that minimizes risks of overfitting. The well-known support vector machine (SVM) algorithm was used to train the classifiers, as it performs well on relatively small datasets like Florence 164. More specifically, we used the SMO SVM implementation from the open source Weka (Witten et al., 2016) data mining package, with a linear kernel and default hyperparameters.
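The fold structure described above can be sketched as follows. This is only an illustration of the partitioning logic, written in plain Python rather than with Weka; the function name and the fixed random seed are assumptions, and the sketch does not stratify the folds by genre.

```python
import random


def ten_fold_splits(n_items, n_folds=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds nearly equal groups.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


# 82 items, matching the number of pieces in the manuscript.
splits = list(ten_fold_splits(82))

# Count how often each item is used for testing and for training.
test_counts = [0] * 82
train_counts = [0] * 82
for train, test in splits:
    for i in test:
        test_counts[i] += 1
    for i in train:
        train_counts[i] += 1
```

The two count lists confirm the property stated in the text: every item is tested exactly once and used for training in the nine remaining folds.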
We were also interested in seeing which features separated madrigals from each of the other genres. This can be a difficult thing to do perfectly: groups of features can vary together in subtle ways that may be modelled successfully by classifiers, but which are difficult to express clearly in human interpretable ways. For the sake of simplicity, we used information gain as a simple entropy-based metric for measuring how well features considered individually separate the genres.
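As an illustration of the underlying idea, the sketch below computes the information gain of a single numeric feature with respect to a set of class labels. It assumes one fixed threshold that splits the feature values into two bins; this is a simplification chosen for the example and should not be read as a description of how Weka treats numeric attributes. The data are invented.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting `values` at `threshold`."""
    left = [lab for val, lab in zip(values, labels) if val <= threshold]
    right = [lab for val, lab in zip(values, labels) if val > threshold]
    n = len(labels)
    # Weighted entropy of the two bins; empty bins contribute nothing.
    remainder = sum(
        len(part) / n * entropy(part) for part in (left, right) if part
    )
    return entropy(labels) - remainder


# Hypothetical two-class toy data.
labels = ["a", "a", "b", "b"]
separating = information_gain([0.1, 0.2, 0.8, 0.9], labels, threshold=0.5)
uninformative = information_gain([0.1, 0.9, 0.1, 0.9], labels, threshold=0.5)
print(separating, uninformative)
```

A feature whose values split the two classes cleanly removes all uncertainty about the label, while a feature whose bins each contain the same class mix removes none.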
In particular, we used Weka's InfoGainAttributeEval metric, which outputs a value between zero and one for each feature, with a higher value indicating that the feature is more useful in distinguishing the genres in question.

Results and Discussion
Table 2 shows the classification accuracies resulting from the cross-validation experiments. The first row of results indicates that 65.9% classification accuracy can be attained when considering all four genres as candidates. However, not all genres are equally represented in the dataset, which can potentially bias results; additionally, the classifier may not necessarily be equally effective with respect to all classes. It is therefore useful to examine the confusion matrix for this experiment (Table 3), which provides more detail on how the individual genres performed in this experiment. Table 3 reveals that madrigals could be distinguished from the other genres almost perfectly in this four-genre experiment (only one of the twenty-seven madrigals was misclassified), but that chansons were sometimes confused with other Italian-texted pieces (OITs) and motets. It is also interesting to note that, although only one madrigal was misidentified as an OIT, ten of the nineteen OITs were misclassified as madrigals.
Further insights can be gleaned from pairwise cross-validation experiments, where the classifier only needs to choose between two candidate genres, rather than four, an easier and more specialized kind of classification problem. The additional rows of Table 2 outline the results of such pairwise experiments, which are mostly consistent with the findings from the four-genre experiment. Chansons prove once again to be relatively difficult to distinguish from motets and OITs, but not madrigals. Motets were once again relatively easy to distinguish from OITs, and madrigals could be easily distinguished from motets and chansons, but not OITs.
Since difficulty in accurately discriminating between classes can be considered an imperfect but useful proxy for similarity, we can interpret these results as suggesting that madrigals are closer in terms of musical content to OITs than to motets or chansons, since they are harder to distinguish from one another. It is worth reemphasizing, however, that this classification difficulty is associated more with confusing OITs with madrigals than the reverse.
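An asymmetry of this kind can be read directly from a confusion matrix, as the following minimal sketch shows. The labels and predictions are invented placeholders and do not reproduce the values in Table 3; the function names are likewise assumptions of the example.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels):
    """Count (true, predicted) label pairs."""
    return Counter(zip(true_labels, predicted_labels))


def per_class_recall(matrix, label):
    """Fraction of items with true class `label` that were predicted as `label`."""
    total = sum(count for (true, _), count in matrix.items() if true == label)
    return matrix[(label, label)] / total if total else 0.0


# Hypothetical toy labels for two classes, genre_a and genre_b.
true = ["genre_a"] * 5 + ["genre_b"] * 4
pred = ["genre_a"] * 4 + ["genre_b"] + ["genre_a"] * 2 + ["genre_b"] * 2

matrix = confusion_matrix(true, pred)
a_as_b = matrix[("genre_a", "genre_b")]  # genre_a pieces labelled genre_b
b_as_a = matrix[("genre_b", "genre_a")]  # genre_b pieces labelled genre_a
recall_a = per_class_recall(matrix, "genre_a")
recall_b = per_class_recall(matrix, "genre_b")
print(a_as_b, b_as_a, recall_a, recall_b)
```

Comparing the two off-diagonal counts, or the two per-class recall values, shows whether errors run mainly in one direction, which a single overall accuracy figure cannot reveal.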
There is a caveat that should be mentioned here: the OIT group includes music belonging to several genres, so it is possible that the difficulty in distinguishing madrigals from OITs could be at least partially due to difficulty in training a model that fully encompasses the range of music in this diverse group. However, Table 2 shows that chansons (to a small extent) and motets (to a large extent) could be better separated from OITs than madrigals could, so although the possibility of an occluding influence resulting from the diversity of the OIT sub-genres should not be discounted, this influence, even if present, probably does not account for all the difficulty in separating OITs from madrigals. This would ideally be investigated using more music from each of the OIT sub-genres, but given the limited data available, it is still reasonable to suspect that the cross-validation results do ultimately indicate stronger similarity between madrigals and OITs than between madrigals and either motets or chansons.
As noted above, we used information gain as a rough metric for providing insight into which features were individually statistically most effective in separating the genres. Tables 4 to 6 show the ranked top ten individual features in each of the pairwise comparisons involving madrigals. The jSymbolic manual (http://jmir.sourceforge.net/manuals/jSymbolic_manual/home.html) can be consulted for detailed explanations of the features appearing in these tables; some of the features measure qualities familiar to music theorists, but others are novel.
It is notable that there are a good number of features with high information gains in the madrigals vs. motets and madrigals vs. chansons comparisons. There are important rhythmic differences between the madrigals and motets, as indicated by the fact that the top nine features in Table 4 are all associated with rhythm. Motets tend to use many more long note values and have more varied rhythmic values, while madrigals tend to have long strings of minims (half notes). Table 6 shows that vertical and rest-associated features are most effective individually in separating madrigals from chansons. These features are associated with imitative texture: the chansons have fewer simultaneous pitches, and more partial rests (rests in one to three voices) than do the madrigals.
The features distinguishing madrigals from OITs in Table 5 also include strong rhythmic representation, but are supplemented by features measuring texture, verticality and melody. The features are more varied, and it is harder to associate them with musical characteristics that are easy to describe. There are no individually strong features in the madrigals vs. OITs comparison; indeed, the highest information gain of 0.388 in Table 5 would not be anywhere near the top ten in either of the other two comparisons. This further supports the similarity between madrigals and OITs suggested by the cross-validation experiments, since no individual features exhibit patterns that easily distinguish madrigals from OITs in a broad sense.
Overall, these results suggest many intriguing areas of further investigation, including detailed analyses of how the individual features vary from piece to piece and genre to genre, and a more sophisticated statistical investigation of the discriminatory power of feature groups (as opposed to just individual features).
It is also important to emphasize that statistical feature analyses of these types ultimately require expert musicological confirmation of salience, in order to verify that the statistically discriminative power of any given feature is not just due to statistical coincidences without musical significance. However, the initial exploratory approach employed here is quite useful for highlighting areas for further inquiry.

Conclusion
Our results indicate that the madrigals are closer in musical style to the OITs than they are to either the chansons or the motets, and cast doubt on the claims of Fenlon, Haar, and Carter that the madrigal derived its style from the chanson and motet. The similarities between the madrigals and the OITs support Cummings's emphasis on Italian song traditions, and especially on the villotta (although our particular experiments here did not separate out the villotte from other OITs). Correlation is not equivalent to historical causation, so musical similarities between madrigals and villotte do not necessarily mean that the madrigal "came from" or "evolved out of" the villotta. However, causation normally does involve correlation. If we observe similarities (correlations) between the madrigal and the OITs, it is therefore possible that causation is involved, and that composers of the first madrigals may have taken the villotta as a model or a source for their new genre. Practices of Italian text setting (such as long strings of syllabic minims) could also have had an impact, although jSymbolic does not provide direct insight here, as it does not extract features from the texts, only from the music.
Connections between the villotta and the madrigal are supported by other extra-musical factors as well. Sebastiano Festa wrote both madrigals and villotte; Pesenti, known for his villotte in Florence 164, also wrote one surviving madrigal. The fact that many villotte are mixed in with madrigals in the manuscripts of the later 1520s also suggests that the two were seen as related genres.
In future work we will break down the various genres included in the OITs, add more villotte to our corpus, expand our research to include the madrigals of Verdelot copied in the 1520s, and analyze feature values more deeply. The ability to extract musical features and analyze them statistically has already shed new light on one of the enduring problems of musicology.

Corpus
All the pieces in Florence 164 used in this study are available as MIDI and PDF files at https://zenodo.org/record/4451464#.YAdwE-hKj-g. Pre-extracted feature values are also posted there.