Music surface and musical structure: The role of abstraction in musical processing

Because of its temporal nature, music presents a unique challenge to the perceptual systems. To understand music one must infer underlying musical structure based on a musical surface that is constantly changing. Accordingly, a central component of musical behavior involves the abstraction of underlying musical structure from the musical surface. The following paper discusses the central importance of such abstraction, looking at examples of the role of abstraction based on a variety of underlying representational structures (tonal hierarchies, tonal-metric hierarchies, melodic patterns). These examples support the idea that musical understanding is fundamentally driven by the apprehension of structural patterns, and not by auditory surface information.


Introduction
Because of its ephemeral nature, music represents a challenge to the perceptual systems. Even with reference to other temporal arts (drama, dance, poetry) music is unique. Both drama and dance retain critical visual elements, whether they involve delineating event sequences such as in drama (time provides a framework for a series of actions) or a sequence of movements such as in dance. Even in poetry, in which visual information is minimized, the critical emphasis is on language. Its "structure does not rely solely on the sounds of the words, but rather on a poetic juxtaposition of meanings and connotations" (Stambaugh, 1964, p. 266). Thus, music is unique in its fundamental reliance on the temporal dimension for appreciation of its structure.
Nowhere is this unique nature of audition and music more obvious than in attempts to delineate what defines an auditory object (Brefczynski-Lewis & Lewis, 2017; Kubovy & Van Valkenburg, 2001; O'Callaghan, 2008). Because of the centrality of objects to our experience of the world, the concept of objecthood should translate across perceptual dimensions. Nevertheless, objecthood is in most ways visually oriented (Kubovy & Van Valkenburg, 2001; O'Callaghan, 2008).
One component of understanding auditory objects consists of explaining the mechanism(s) by which such objects might be formed. Most commonly, researchers point to the process of auditory scene analysis (Alain & Bernstein, 2015; Bregman, 1990, 2005; Carlyon, 2004) as a mechanism underlying auditory object formation. Although clearly a central process in organizing auditory experience, this framework is limited in equating auditory objects with auditory sources. Within a musical context, such a relation is at best incomplete, overlooking other musical structures that might be critical in understanding music.
One mechanism for creating musical objects involves the abstraction of underlying structural organizations from the musical surface. Interestingly, abstraction of underlying structure has been investigated only sporadically over the years, both in general (Barsalou, 2005; Posner & Keele, 1968) and within music processing (Deliège, 1996; Deutsch, 1969). According to Barsalou (2005), a critical feature of abstractions is their organization into structured representations. Such ideas lead to the intriguing realization that structured schematic representations could indeed form the basis of musical objects, with such objects consisting of abstracted, schematic musical structures drawn from the musical surface.
Critical to this idea is that abstract musical schematic representations do indeed exist, and that such representations are central to our experience of music. The goal of this paper is to address this issue, reviewing a set of investigations of musical experience in which the primary objects apprehended by individuals are schematic representations abstracted from the musical surface information.

Abstraction in Tonality
Tonality and Key-Finding. In Western music, tonality, or the organization of the chromatic set around a central reference pitch, exists as a central structural principle for musical experience. Classic work by Krumhansl and colleagues (Krumhansl & Cuddy, 2010; Krumhansl & Kessler, 1982; Krumhansl & Shepard, 1979) demonstrated the psychological existence of this theoretical structure, producing the well-known "tonal hierarchy" findings shown in Figure 1. This figure graphs the perceived stability ratings of the chromatic set with reference to a major and minor tonal context. Subsequent work hypothesized that these ratings could be used as an idealized template for the duration of notes in a tonal context, with tonally stable notes occurring with a higher total duration than tonally unstable tones. The result of this work was the Krumhansl-Schmuckler (KS) key-finding algorithm (Krumhansl, 1990; Krumhansl & Schmuckler, 1986; Schmuckler & Tomovski, 2005), which has become one of the preeminent approaches for key-finding in music (Albrecht & Shanahan, 2013; Quinn & White, 2017; Temperley, 1999, 2001, 2008; Temperley & Marvin, 2008).

Figure 1: Tonal hierarchy ratings for major and minor contexts, from Krumhansl & Kessler (1982). Tonic triad members appear in blue, diatonic scale degrees in purple, and non-diatonic scale degrees in lilac.
Why is the KS algorithm relevant to this discussion? Simply put, this algorithm operates by abstracting total note durations from the musical surface, collapsing across temporal ordering of tones. Although the KS algorithm has been criticized by multiple authors on a variety of points (Albrecht & Shanahan, 2013; Quinn & White, 2017; Temperley, 1999, 2001, 2002, 2004), this abstraction process ironically has rarely been the focus of such criticism (but see Butler, 1989, for an exception). Regardless, because the KS algorithm operates by creating a global distribution of durations, it is a prime example of the role of abstraction in music processing, one that is fundamental for perceiving tonality.
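The abstraction step described above can be illustrated with a minimal sketch, assuming notes are supplied as (MIDI pitch, duration) pairs. The profile values are the Krumhansl and Kessler (1982) ratings as they are commonly reported; the function names, the input format, and the use of a plain Pearson correlation over all 24 keys are illustrative assumptions rather than a definitive implementation of the published algorithm.

```python
from math import sqrt

# Krumhansl & Kessler (1982) ratings as commonly reported,
# indexed by semitones above the tonic.
MAJOR_PROFILE = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR_PROFILE = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53,
                 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]
PITCH_NAMES = ["C", "C#", "D", "Eb", "E", "F",
               "F#", "G", "Ab", "A", "Bb", "B"]


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def duration_profile(notes):
    """Collapse (midi_pitch, duration) pairs into 12 pitch-class totals.

    Temporal order is discarded here: this is the abstraction step.
    """
    totals = [0.0] * 12
    for pitch, duration in notes:
        totals[pitch % 12] += duration
    return totals


def find_key(notes):
    """Return (correlation, key_name) for the best match among 24 keys."""
    totals = duration_profile(notes)
    candidates = []
    for tonic in range(12):
        # Rotate so that index 0 holds the candidate tonic's total duration.
        rotated = totals[tonic:] + totals[:tonic]
        name = PITCH_NAMES[tonic]
        candidates.append((pearson(rotated, MAJOR_PROFILE), name + " major"))
        candidates.append((pearson(rotated, MINOR_PROFILE), name + " minor"))
    return max(candidates)


# Hypothetical input: an ascending C major scale with longer triad tones.
scale = [(60, 2), (62, 1), (64, 2), (65, 1), (67, 2), (69, 1), (71, 1), (72, 2)]
best_correlation, best_key = find_key(scale)
```

Because `duration_profile` sums durations by pitch class, any reordering of the same notes yields the same key estimate, which is exactly the collapsing across temporal order noted above.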
Tonal-Metric Processing. Recent years have extended this approach by exploring the co-occurrence of idealized tonal and metric information. Palmer and Krumhansl (1990) demonstrated that there also exists a hierarchy in metric perception, with some metric positions heard as psychologically strong beats within a meter, and other positions heard as weak beats.
Based on a corpus analysis, Prince and Schmuckler (2014) demonstrated the existence of an alignment between tonal and metric hierarchies, with tonally important tones occurring at metrically strong positions, and vice versa. Figure 2 shows these findings, graphing the frequency of co-occurrence of pitch and metric information as a function of time signature and musical mode. Overall, these results are compelling: tonal and metric structures were aligned in the corpus.

Figure 2: Frequency of co-occurrence of pitch and metric information as a function of time signature and musical mode, from Prince and Schmuckler (2014).
Recently, Prince et al. (2020) investigated whether this tonal-metric co-occurrence influenced listeners' percepts of musical passages. In this work, melodies were created in which the tonal-metric hierarchy was either aligned (correlated tonal-metric events) or misaligned (uncorrelated tonal-metric events); Figure 3 presents sample melodies from this study. Across a series of studies, participants rated melodic goodness and metric clarity, and tonal-metric aligned melodies generally received higher ratings than misaligned melodies. As with the previous work, this research is important in demonstrating the critical influence of abstracted hierarchical information on general aspects of musical apprehension.
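One way to make the notion of tonal-metric alignment concrete is a rough sketch that tabulates pitch and metric co-occurrences and correlates tonal stability with metric strength across the notes of a melody. The input format (scale degree in semitones above the tonic, paired with an eighth-note position in a 4/4 bar) and the metric weights are hypothetical placeholders, not the values reported by Palmer and Krumhansl (1990), and the index is not the measure used by Prince and Schmuckler (2014) or Prince et al. (2020).

```python
from collections import Counter
from math import sqrt

# Krumhansl & Kessler (1982) major-key ratings as commonly reported.
MAJOR_PROFILE = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
# Hypothetical metric weights for the 8 eighth-note positions of a 4/4 bar
# (illustrative only; not empirical values).
METRIC_WEIGHTS_4_4 = [4, 1, 2, 1, 3, 1, 2, 1]


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cooccurrence(notes):
    """Count how often each (scale_degree, metric_position) pair occurs."""
    return Counter(notes)


def alignment(notes):
    """Correlate each note's tonal stability with its metric strength."""
    tonal = [MAJOR_PROFILE[degree] for degree, _ in notes]
    metric = [METRIC_WEIGHTS_4_4[position] for _, position in notes]
    return pearson(tonal, metric)


# Hypothetical one-bar melodies: stable tones on strong vs. weak positions.
aligned = [(0, 0), (2, 1), (4, 2), (5, 3), (7, 4), (9, 5), (11, 6), (2, 7)]
misaligned = [(2, 0), (0, 1), (11, 2), (7, 3), (9, 4), (4, 5), (5, 6), (0, 7)]
aligned_score = alignment(aligned)
misaligned_score = alignment(misaligned)
```

In this toy example the misaligned melody places stable tones on weak positions, so its score is negative rather than near zero; an uncorrelated stimulus of the kind described above would instead aim for a score close to zero.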

Abstraction in Melodies
Tonal Melodies. Abstraction also plays a significant role in melodic processing. For example, Vuvan et al. (2014) examined whether the underlying tonal structure of melodies would drive false memory in listeners, causing them to "fill in" tone information that did not occur in the melodies, but that was nevertheless consistent with the tonal structure of these melodies. Across multiple studies, listeners showed evidence of such false memories, incorrectly indicating the occurrence of tonally consistent information that was not present in the original melodies. Interestingly, this false memory effect decreased systematically with the psychological stability of the contexts, with major melodies leading to strong tonal frameworks and evidence of false memories, followed by minor melodies, and then finally atonal melodies, which produced no evidence of false memories. Even more fascinating was a subsequent reanalysis of previously unpublished pilot data from this project by Schmuckler et al. (2020). This project examined an earlier version of the atonal study that inadvertently instantiated a pair of tonal centers when one looked at the tonal implications of the entire set of atonal melodies, but that were not apparent on an individual melodic basis. These pilot data did produce false memories for musical information related to these tonal centers. Once this inadvertent tonal information was removed from the stimuli, however, the evidence for these false memories disappeared.
Relevant to the current framework, these studies demonstrate the importance of abstraction in at least two ways. First is the result that the abstracted tonal structure of individual melodies led to listeners' false remembering. The confusion of abstract tonal information with actually occurring musical surface information represents the main thesis at hand. Second, and even more fascinating, is the demonstration that listeners responded to tonal implications that were only available by aggregating across a set of atonal melodies, but that were not present in any individual melody. Together, these results indicate that abstraction occurs across multiple time scales and stimulus sets, highlighting the potential of this process for influencing musical behavior.
Atonal melodic prototypes. Finally, a recently completed set of experiments examined the abstraction of melodic prototypes, based on hearing distortions of these prototypes. This research was predicated on the work of Posner and Keele (Posner et al., 1967; Posner & Keele, 1968, 1970), who explored abstraction of prototypic visual patterns by training observers to categorize distortions of different prototype patterns, and then testing generalization of this initial learning as a function of the degree of distortion present in the initial learning set. Posner and Keele found that observers who experienced more distorted learning sets showed better generalization of learning. These authors suggested that these more varied learning sets afforded better abstraction of the prototypic patterns, ultimately leading to more robust generalization. In current work by Schmuckler et al. (in preparation), a melodic analogue was produced by creating four different random-note "prototype" melodies, along with a set of distortions varying in their degree of distortion, produced by manipulating the size and frequency of the distorted pitch intervals in these melodies. Figure 4 shows a sample pair of prototypes, along with different levels of distortion for these melodies.
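The stimulus logic described above can be sketched as follows, under loudly stated assumptions: the actual construction rules used by Schmuckler et al. are not specified here, so the pitch range, the melody length, and the two parameters `proportion` (how many notes are altered) and `max_shift` (how far, in semitones, an altered note may move) are hypothetical stand-ins for the "frequency" and "size" of distortion.

```python
import random


def make_prototype(length, rng, low=60, high=72):
    """Draw a random-note melody as a list of MIDI pitches (assumed range)."""
    return [rng.randint(low, high) for _ in range(length)]


def distort(prototype, proportion, max_shift, rng):
    """Return a distorted copy of the prototype.

    proportion: fraction of notes to alter (hypothetical "frequency").
    max_shift: largest alteration in semitones (hypothetical "size").
    Altering a pitch changes the intervals on either side of it.
    """
    n_changed = round(proportion * len(prototype))
    positions = rng.sample(range(len(prototype)), n_changed)
    melody = list(prototype)
    for i in positions:
        # Shifts are never zero, so every chosen note really changes.
        melody[i] += rng.randint(1, max_shift) * rng.choice([-1, 1])
    return melody


rng = random.Random(0)  # fixed seed so the sketch is reproducible
prototype = make_prototype(8, rng)
mild = distort(prototype, proportion=0.25, max_shift=2, rng=rng)
severe = distort(prototype, proportion=0.75, max_shift=5, rng=rng)
```

Raising either parameter moves exemplars further from the prototype, which is one simple way a graded series of distortion levels could be generated.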

Figure 4: Sample melodic prototypes and distortion levels, for categorization and recognition memory.
Different listeners were trained to categorize varying degrees of distortions of these prototypes, and were then tested for generalization of learning using melodies with a higher distortion level than previously experienced. Figure 5 shows categorization accuracy in the initial learning and generalization phases, and reveals that the difference in performance between these phases decreased systematically with increasing distortion level of the initial learning set. Thus, the more difficult the initial learning was, the better able listeners were to generalize their learning to a subsequent set of even more distorted melodies.

It is important to note that although these findings suggest that listeners abstract melodic prototypes, this paradigm does not explicitly test such abstraction. This question was addressed in a subsequent recognition-memory paradigm. Specifically, listeners received multiple blocks of study-test trials. In the study phases, listeners heard 9 melodies, consisting of 3 exemplars of a prototype at each of distortion levels 3, 5, and 7. Following the study phases, listeners were tested with 20 melodies, consisting of the original 9 study melodies, 9 new distortions (3 new exemplars at these same 3 distortion levels), and 2 repetitions of the original prototype, and were asked to say whether each test melody had been presented in the previous study phase. Figure 6 displays the average accuracy for the old distortions, the new distortions, and the prototypes. Although this was clearly a difficult task for listeners, the most critical finding was that recognition accuracy for the previously unheard prototype was significantly lower than that for both the old and new distortions, and was significantly below chance performance. In other words, listeners consistently misidentified the prototype as having been heard in the previous study phase. Such inaccuracies would arise through abstraction of the prototype based on the distortions, producing the mistaken belief that this melody had been previously presented. Interestingly, this finding converges with Vuvan et al. (2014), extending the demonstration of false memory from single tones to entire melodies. More generally, these studies provide another example of the importance of abstraction in musical processing, demonstrating that listeners' memory for musical surface information can be significantly distorted by the abstracted representational structure of these materials.

Discussion
This paper has provided an array of evidence that abstracting schematic structures is a fundamental process driving musical apprehension. This abstraction has been observed across a range of domains, and has included a range of abstracted structures such as tonality, the co-occurrence of tonal and metric hierarchies, contour processes, and so on. In this regard, it is interesting that the process of abstraction has been so neglected in the auditory and music literature, with the notable exception of the voluminous work on statistical learning (e.g., Saffran & Kirkham, 2018).
The recognition of abstraction as a central process of musical processing leads to an array of questions for future work. For instance, what are the mechanisms involved in abstraction? Throughout this paper it has been assumed that simple exposure to musical patterns enables abstraction. This assumption, however, raises the question of the nature of such exposure. For instance, how much exposure is necessary to enable abstraction, and what types of experiences are necessary for such abstraction of patterns out of stimuli? And finally, what types of structure can be abstracted out of such stimuli? There is potentially an infinite set of organizational principles available for abstraction, yet individuals do not show equal facility in abstracting all such structures. As an example, consider 12-tone serial music, which presents a remarkably coherent theoretical structure available for abstraction by listeners. And yet, research with serial patterns has demonstrated that listeners are relatively insensitive to such structure (Krumhansl et al., 1987), suggesting that there is something in such structure that is simply not amenable to the abstraction process. Investigation of such issues, along with a host of related topics, has the potential to provide important additional insights into this fundamental process of musical abstraction.