Does cognitive load differ among sight-singers? An exploratory study using pupillometry and interviews

Sight-singing is challenging for many music students, and the difficulties they experience with this task vary. To explore how cognitive load (CL) might differ among students, we combined two approaches: 1) a quantitative approach using pupil diameter, a psychophysiological indicator of CL, to see whether CL differed as a function of sight-singing achievement and experience; 2) a qualitative approach to learn about students' challenges when sight-singing and to verify how such challenges are reflected in sight-singing scores. We asked 56 post-secondary music students to complete a musical background questionnaire and a sight-singing exercise while an eye-tracker recorded their pupil size. After that, we interviewed them about the difficulties they experienced. The results revealed that CL did not vary significantly across sight-singing performance or musical experience levels. However, we found a non-significant tendency suggesting that students with the highest and the lowest intonation scores both experienced a lower CL, whereas CL was higher for students with average intonation scores. Interviews also revealed that many students experienced information overload while sight-singing, and students who shared this perception obtained, on average, lower sight-singing scores. Future studies should include qualitative data collection to deepen our understanding of learners' experiences.


Introduction
Sight-singing is an essential dimension of aural skills classes included in music programs in higher education. However, sight-singing can be challenging for many students: some have trouble reading music (Asmus, 2004), suffer from a lack of preparation (Anderman, 2011), or experience anxiety (Buonviri, 2014; Fournier et al., 2019). Furthermore, students begin their studies with various musical backgrounds (Buonviri, 2015; Teixeira dos Santos & Puchalski dos Santos, 2020). Consequently, students might experience the tasks their instructors choose differently. Indeed, some students will sight-sing easily, while others will have to put a lot of effort into this task.
Cognitive load (CL) is the relationship between a task's demands and the mental resources available (Wickens & Hollands, 1999). A higher CL can hinder improvisers' creativity (Norgaard et al., 2016), instrumentalists' expressivity, and singers' timing (Çorlu, Maes, et al., 2015). To our knowledge, no study so far has investigated the relationship between sight-singing performance and CL. In particular, it remains unknown whether pupil size, a psychophysiological indicator of CL, fluctuates as a function of sight-singing achievement or musical background.
Advanced musicians usually sight-read better. For example, Kopiez and Lee (2006) found that sight-reading experience acquired before the age of 15 was a strong predictor of sight-reading performance. Also, Arthur et al. (2020) found that sight-reading experts (those able to play a 6th Grade sight-reading exercise from the Australian Examination Board) were more likely to have had formal training for more than 10 years and to have begun learning music before the age of seven. One possible explanation is that experienced musicians can access schemas from their long-term memory and, therefore, process the score more easily (Sheridan et al., 2020). The amount of previous musical experience is also related to better sight-singing performance (Fournier, 2020). Nevertheless, the question of whether sight-singing requires less effort (i.e., imposes a lower CL) for experienced musicians remains unanswered.
This study aimed to determine whether CL varied as a function of musical experience and sight-singing performance. We also wanted to know which challenges, notably those related to mental effort and subjective perception of CL, post-secondary music students experienced while sight-singing and whether these challenges were related to sight-singing performance.
One way to assess CL objectively is to measure pupil size. Variations in pupil diameter are deemed to reflect changes in CL (Beatty, 1982; Einhäuser, 2017). For music reading, pupil size varies depending on task difficulty. For example, pupil size tends to be larger in harder tonalities (Chitalkina et al., 2020) or when reading unusual chord progressions (Hadley et al., 2018).

Participants
After obtaining ethical approval, we recruited 56 music students from three post-secondary institutions in the authors' urban area. All participants had normal or corrected-to-normal vision. As compensation for their participation in the study, they were offered free aural skills tutoring by the first author.
Of that number, 39 were CEGEP students (CEGEP is a 2-year post-secondary program in Quebec between high school and university) and 17 were university students. Participants were between 17 and 67 years old (M = 22.88, Mdn = 19.00, SD = 11.03). They had accumulated between 3 and 26 years of musical experience (M = 10.61, Mdn = 9.00, SD = 5.22). Regarding the main instrument played, 18 reported a harmonic instrument (e.g., piano), while 38 reported a non-harmonic instrument (e.g., trumpet) or voice.

Material
Using Google Forms, participants first completed a custom survey designed to gather information about their musical background. It included open-ended questions about when they began learning music, the instruments they played, their main instrument, their post-secondary education, and their number of years of musical experience. The sight-singing exercise consisted of an 8-measure, medium-difficulty melody, adapted and transposed from École préparatoire de musique de l'Université Laval (1999).
The survey and the melody were presented on a Dell Precision T5810 computer screen. While participants sang, a FOVIO eye-tracker recorded their pupil diameter at a sampling rate of 60 Hz. The eye-tracker was on the desk and centered below the computer screen. Eye-tracking data were processed with the software EyeWorks (EyeTracking Inc., 2019). A Yamaha NP11 electronic piano keyboard was located in front of the participant. The semi-structured interview included questions about the students' difficulties with sight-singing.

Procedure
The experimenter met students individually for a single session in a dimly lit soundproof room. They first completed the questionnaire, which lasted about 15 minutes. After that, the experimenter launched EyeWorks and assisted participants with the calibration of the eye tracker. Instructions for the sight-singing task appeared on the screen, followed by the score. Participants could play the starting pitch on the keyboard and could rehearse mentally for as long as they wanted. Their performance was audio recorded. This segment of the data collection lasted from 5 to 10 minutes. After they completed the task, the first author came back into the room to conduct the interview, which lasted approximately 10 minutes.

Scoring and data preparation
We used five dimensions of sight-singing performance. Three of them were objective: pitch, rhythm, and combined scores. Each note was worth two points: one for pitch, one for rhythm. The combined score was the sum of the two. Two measures were subjective and assessed on a four-point scale: rhythmic fluidity and intonation accuracy. The experimenter rated the recordings, and an aural skills teacher with a Ph.D. in Music Education scored ten sight-singing performances to validate our rating scales. The two sets of scores correlated strongly and significantly for the combined score, r(8) = .987, p < .0001, the rhythmic fluidity score, r(8) = .962, p < .0001, and the intonation accuracy score, r(8) = .871, p = .001. Therefore, we considered the experimenter's scoring valid.
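As a minimal sketch of this inter-rater check, the Python example below computes Pearson's r between two raters and the degrees of freedom reported in parentheses (n - 2, hence r(8) for ten performances). The analyses in this study were run in R; the function name `pearson_r` and the scores below are hypothetical and are not the study's data.

```python
import math

def pearson_r(x, y):
    """Return Pearson's r and its degrees of freedom (n - 2) for two equal-length lists."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two equal-length lists with at least 3 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y), n - 2

# Hypothetical combined scores for ten performances (illustration only).
experimenter = [40, 52, 33, 60, 47, 28, 55, 44, 38, 50]
second_rater = [41, 50, 35, 61, 45, 30, 56, 43, 36, 51]

r, df = pearson_r(experimenter, second_rater)
print(f"r({df}) = {r:.3f}")
```

The p-value is omitted here because it requires the t distribution, which is not part of the Python standard library.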
We averaged pupil diameter across both eyes. Measurements were restricted to the exercise's four central measures because we wanted to capture the cognitive load of the sight-singing task while allowing for the time the pupil takes to adjust. This portion of the sight-singing task lasted about eight seconds.
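The averaging step can be sketched as follows in Python, under stated assumptions: samples arrive as (left, right) diameters in mm at the 60 Hz sampling rate, the window boundaries are given in seconds, and samples with a missing eye are skipped. The function name, the window times, and the handling of missing samples are hypothetical; the study itself processed eye-tracking data with EyeWorks.

```python
SAMPLING_RATE_HZ = 60  # sampling rate reported for the eye-tracker

def mean_pupil_diameter(samples, start_s, end_s, rate_hz=SAMPLING_RATE_HZ):
    """Average left/right pupil diameter (mm) over the window [start_s, end_s).

    `samples` is a list of (left_mm, right_mm) tuples, one per tracker sample.
    A sample with a missing eye (None) is skipped (an assumption, not the
    study's documented procedure). Returns None if no valid sample remains.
    """
    first = int(start_s * rate_hz)
    last = int(end_s * rate_hz)
    per_sample = [
        (left + right) / 2
        for left, right in samples[first:last]
        if left is not None and right is not None
    ]
    if not per_sample:
        return None
    return sum(per_sample) / len(per_sample)

# Hypothetical 16-second recording: 4 s lead-in, 8 s window of interest, 4 s tail.
recording = [(3.0, 3.2)] * 240 + [(3.4, 3.6)] * 480 + [(3.1, 3.1)] * 240
recording[300] = (None, 3.6)  # simulated tracking loss on one sample

window_mean = mean_pupil_diameter(recording, start_s=4, end_s=12)
```

At 60 Hz, an eight-second window contains about 480 samples per participant, so a short stretch of tracking loss can remove a sizeable share of the usable data.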
For analyses involving cognitive load, we excluded three participants whose pupil diameter fell more than 1.5 times the interquartile range outside the quartiles, as well as 11 participants for whom we did not have pupillometric data for the time interval we studied.
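A minimal sketch of a 1.5 × IQR exclusion rule is given below in Python. It assumes the conventional fences (Q1 - 1.5 × IQR and Q3 + 1.5 × IQR) and uses the default quartile method of Python's `statistics.quantiles`, which can differ slightly from the defaults of other software such as R. The function name and the values are hypothetical, not the study's data.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]

# Hypothetical mean pupil diameters in mm (illustration only).
diameters = [3.1, 3.3, 3.4, 3.5, 3.5, 3.6, 3.7, 3.8, 3.9, 5.9]
print(iqr_outliers(diameters))  # prints [5.9]
```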

Analyses
We conducted statistical analyses with RStudio (R Core Team, 2019), using the packages lsr (Navarro, 2015) and car (Fox & Weisberg, 2019). To compare different levels of sight-singing performance, we created low-, average-, and high-performing groups. For rhythm score, pitch score, and combined score, we formed these three groups from quantiles to obtain approximately equal group sizes. We assessed rhythmic fluidity and intonation accuracy with scales ranging from zero to three and used these scores to create the groups to compare. For these variables, we collapsed the two highest scores to obtain three groups as well.
We used a similar process to compare groups based on musical experience. We used quantiles to create groups based on age when participants began learning music and number of years of experience. For the academic level, we compared students from CEGEP and university. For the main instrument, we compared students who played a harmonic instrument with students who played a non-harmonic instrument.
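The quantile-based grouping described above can be sketched as follows in Python. This is an illustration under assumptions: cut points are the sample terciles, scores equal to a cut point go to the lower group, and the function name and scores are hypothetical. The exact cut rules used in the R analyses are not specified here.

```python
import statistics

def tercile_groups(scores):
    """Label each score 'low', 'average', or 'high' using the sample terciles."""
    lower_cut, upper_cut = statistics.quantiles(scores, n=3, method="inclusive")
    labels = []
    for s in scores:
        if s <= lower_cut:
            labels.append("low")
        elif s <= upper_cut:
            labels.append("average")
        else:
            labels.append("high")
    return labels

# Hypothetical combined scores (illustration only).
scores = [12, 18, 25, 31, 36, 40, 44, 47, 52]
groups = tercile_groups(scores)
```

With tied scores, the three groups can only be approximately equal in size, since tied participants must fall on the same side of a cut point.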
With regard to the interviews, we conducted a thematic content analysis (Krippendorff, 2013). We listed every theme related to challenges students faced when performing the sight-singing task and, more generally, when sight-singing in their aural skills classes.
Moreover, we conducted ANOVAs to compare pupil diameter between groups based on the age when participants began learning music (participants who began before eight years old: n = 20; between eight and ten years old: n = 9; after ten years old: n = 13). We also compared pupil diameter based on the number of years of musical experience (participants with less than eight years of experience: n = 17; between eight and eleven years of experience: n = 10; and with more than eleven years of experience: n = 15). These analyses revealed no significant difference between groups. After checking for variance homogeneity, we then conducted t-tests to compare pupil diameter according to study level (CEGEP, n = 28 vs. university, n = 14) and primary instrument type (harmonic, n = 15 vs. non-harmonic, n = 27). No significant differences arose between groups.
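For readers less familiar with the test, the sketch below shows in Python how a one-way ANOVA F statistic and its degrees of freedom are obtained from group data. It is a hand-rolled illustration, not the R code used in this study, and the function name and values are hypothetical. With three groups and 42 participants, the degrees of freedom would be 2 and 39.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    # Between-group sum of squares: group size times squared deviation of its mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared deviations from each group's own mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical pupil diameters (mm) for three groups (illustration only).
early = [3.4, 3.6, 3.5, 3.7]
middle = [3.5, 3.8, 3.6]
late = [3.3, 3.6, 3.7, 3.5]
f_stat, df1, df2 = one_way_anova([early, middle, late])
```

Converting F to a p-value requires the F distribution, which the Python standard library does not provide; a statistics package would normally handle that step.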

Figure 1: Average pupil diameter (in mm) as a function of intonation accuracy. The bold line in each box indicates the median, while red lozenges show the mean for each intonation accuracy level.
Interviews about students' difficulties with the sight-singing task revealed that at least 20 participants had difficulties managing multiple information types simultaneously, indicating that they felt high levels of subjective cognitive load, as Participant 3 reports: "…there was just too much information, I was overloaded, and then I couldn't keep up." More precisely, singing the right pitch while keeping up with the rhythm was also an issue, as described by Participant 16: "I was just messing up with the rhythm, even if it was simple. Then I focused on the pitches, and I knew I was messing up, so I just dropped the ball…so rhythm was nonsense." Making conducting gestures while singing to help maintain the pulse was also hindered by this overload, as Participant 54 suggests: "First time I sight-sing a melody, I can't conduct because it confuses me, it's too much to handle." Other challenges many participants shared included difficulty singing intervals (29 participants), difficulty understanding rhythm figures (25 participants), and a lack of experience with sight-singing (20 participants).

Discussion
In this study, we wanted to verify whether cognitive load (CL) while sight-singing varied between students as a function of sight-singing performance and musical experience. We also wanted to explore which difficulties they experienced during the sight-singing task. In terms of sight-singing performance, we did not find significant pupil diameter differences between participants for rhythm, pitch, and combined scores. We also did not find significant differences based on rhythmic fluidity and intonation accuracy. In terms of musical experience, we also did not find significant differences in pupil diameter based on the number of years of experience, the age when participants began learning music, their level of study, or their primary instrument. These results suggest that cognitive load, as indexed by pupil diameter, does not vary as a function of sight-singing performance and musical experience.
Although not significant, the differences we observed in pupil diameter as a function of intonation accuracy should be explored in more depth in the future: pupil diameter was smaller for the low- and high-performing groups and larger for the average-performing group. This pattern suggests that there might be an ideal CL when sight-singing. A cognitive overload could be detrimental, as other studies have demonstrated (Çorlu, Maes, et al., 2015; Norgaard et al., 2016), but a load that is not sufficiently high could reflect lower engagement in the task, which could explain how a lower load could be linked to poorer results. Pupil diameter differences could be tested with a larger sample to see how they relate to intonation accuracy. Indeed, due to data loss during the short interval for which we measured pupil diameter, we could only conduct our analyses with 42 participants. This problem could have been avoided with a longer task, allowing more time to obtain pupil size data. Future studies could also use sight-singing tasks of various difficulties to manipulate cognitive load and measure its impact on performance.
Interestingly, participants' testimonies suggest that cognitive overload is an issue when sight-singing. It can even be an obstacle to using strategies many authors view as efficient, like using conducting gestures to help maintain the pulse (e.g., Karpinski, 2000). Indeed, some students might not have sufficient cognitive resources to simultaneously read music, sing, understand rhythms, and maintain the pulse with gestures that are potentially not fully automated. From a pedagogical standpoint, this suggests that learners should rehearse some difficulties separately (executing rhythms, keeping a pulse while conducting, singing chords) before they are integrated into more complex sight-singing exercises.
Because we found that many students reported problems managing multiple information sources simultaneously, suggesting a feeling of cognitive overload, we investigated whether sight-singing scores were lower for students who talked about such difficulties. We found that participants who reported a higher subjective CL had lower sight-singing scores for some dimensions only: pitch, combined score, and rhythmic fluidity. A possible explanation is that when the task is too hard, students can still maintain correct rhythmic durations, but pitches might be erroneous and rhythmic fluidity might suffer. Future studies could measure the impact of a higher cognitive load on sight-singing performance, for example, by progressively adding difficulties to a task and observing which sight-singing performance dimensions are affected. Qualitative reports also seem to deepen our understanding of how students experience mental effort, and studies about cognitive load in that context should include them.

Conclusion
Our results suggest that musical background and sight-singing performance are not related to cognitive load when sight-singing. However, participants reported feelings of information overload when singing, which instructors should take into account when designing learning activities, as it can be associated with lower performance.