The Categorical Perception of the Music Scale
a Challenge before the Microtonal Music
Ivan Kostadinov Yanakiev
Educational Centre, Bulgarian Academy of Sciences, Bulgaria
The author expresses his gratitude for the financial support of the Bulgarian Academy of Sciences' programme for support of young scientists (2017-2019), and to the Bulgarian Academy of Sciences' Educational Centre and Prof. D.Sc. Milena Bozhikova for the scientific support and mentoring.
Yanakiev, Ivan K. 2018. "The Categorical Perception of the Music Scale: A Challenge Before the Microtonal Music." Accelerando Belgrade Journal of Music and Dance 3:3
The text examines the phenomenon of categorical perception of musical pitch as defined by John Sloboda (1999), Jane A. and William Siegel (1977), Stefan Koelsch (2012), and William Yost (2013) in their research in the field of music psychology. The paper states the hypothesis that the current system of dividing the octave into twelve equal semitones does not employ the human physical capabilities for defining pitch to their full extent. On the contrary, the reviewed literature testifies to a strong tendency to categorically label and perceive non-equally-tempered intervals of different but similar magnitudes (widths) as the same. This tendency is stronger in professionally trained musicians than in non-musicians. A short historical excursion to the 21-tone 1/6 syntonic comma meantone temperament, recommended by both Leopold (1756) and Wolfgang Amadeus Mozart (1965), is included as an example of a better utilization of musicians' potential to distinguish pitch and intervals, one that had been employed in practice. The text continues with a brief overview of the theory behind generating intonations and temperaments, based on the Equal Division of the Octave (EDO) method. Finally, a short exemplary reference to Kyle Gann's cycle “Hyperchromatica” (2015) is made, along with his personal attitude and commentary on performers' general interest in microtonal music. The paper concludes that the categorical perception of pitch in the context of the twelve-tone equal temperament may be regarded as the main challenge that microtonal music is facing.
Keywords: categorical perception, microtonal music, musical pitch, psychology of music, musical temperament, EDO, Kyle Gann
What we refer to as music is generally an invariant of the result of about a thousand years of theoretical and aesthetic research. The general understanding of music as Music is to a very high degree related to the attitude towards one of its innate qualities, the pitch, and whether or not it is in tune. For the purpose of this text we will not refer to the examples of the post-war avant-garde from 1950 onwards or to its ventures into expanding the understanding of what a musical objet sonore (Schaeffer 1966) can be. For a pitch to be perceived as being in tune, the listener has to be able to compare it to some previous knowledge of the right one. In the Western European tradition, this previous knowledge is directly related to a handful of well-established systems for “properly” dividing the octave.
In our contemporary conditions, the twelve-tone equal temperament is usually tacitly accepted as the norm. There is no explicit document in which the twelve-tone equal temperament is officially (i.e. by any officials, standardization institute etc.) accepted as the norm for establishing the frequencies of the individual tones. By contrast, ISO 16:1975 officially defines the value of the A above middle C as 440 Hz ± 0.5 Hz (see ISO 16:1975). Moreover, any deviations from the chosen system, especially during the educational period of an aspiring musician, are considered erroneous and in some cases, such as competitions, auditions and final examinations, even punishable. In other words, the process of indoctrinating the right intonation is in fact a forceful method of enculturation. This is valid not only for today's ear training, which represents the diatonic scale (pure fifths) as a mapping of the twelve-tone equal tempered scale (tempered fifths), but also for earlier periods. Leopold Mozart's XVIII century treatise Versuch einer gründlichen Violinschule (1756) reads as follows:
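“Auf dem Clavier sind Gis und As, Des und Cis, Fis und Ges, u.s.f. eins. Das macht die Temperatur. Nach dem richtigen Verhältnisse aber sind alle die durch das (b) erniedrigten Töne um ein Komma höher als die durch das (#) erhöheten Noten. Z.B. Des ist höher als Cis; As höher als Gis, Ges höher als Fis, u.s.w. Hier muss das gute Gehör Richter seyn: Und es wäre freilich gut, wenn man die Lehrlinge zu dem Klangmässer (Monochordon) führete” (Mozart 1756, 66, note). [On the keyboard G# and Ab, Db and C#, F# and Gb, and so on, are one and the same. That is the effect of the temperament. According to the correct ratio, however, all notes lowered by a flat are a comma higher than those raised by a sharp. For example, Db is higher than C#, Ab higher than G#, Gb higher than F#, and so on. Here the good ear must be the judge, and it would indeed be good to lead the pupils to the monochord.]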
In this sense Leopold Mozart was advising his students to develop a deeper understanding of the difference between the enharmonic flats and sharps (such as Db and C#) in the context of the 1/6-comma meantone temperament.
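The size of the difference between Db and C# in this temperament can be estimated with a short calculation. The following Python sketch assumes that each fifth is narrowed by one sixth of a syntonic comma (81/80) relative to the pure 3/2 fifth and that pitches are generated from C by a chain of such fifths; the function names are illustrative and not taken from any source.

```python
import math


def cents(ratio: float) -> float:
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(ratio)


PURE_FIFTH = cents(3 / 2)        # about 701.955 cents
SYNTONIC_COMMA = cents(81 / 80)  # about 21.506 cents


def meantone_fifth(comma_fraction: float) -> float:
    """Fifth narrowed by the given fraction of a syntonic comma."""
    return PURE_FIFTH - comma_fraction * SYNTONIC_COMMA


def pitch_above_c(fifths: int, fifth_size: float) -> float:
    """Pitch reached from C by a chain of fifths, folded into one octave."""
    return (fifths * fifth_size) % 1200.0


fifth = meantone_fifth(1 / 6)
c_sharp = pitch_above_c(7, fifth)   # C-G-D-A-E-B-F#-C#
d_flat = pitch_above_c(-5, fifth)   # C-F-Bb-Eb-Ab-Db

print(f"tempered fifth: {fifth:.2f} cents")
print(f"C# = {c_sharp:.2f} cents, Db = {d_flat:.2f} cents above C")
print(f"Db - C# = {d_flat - c_sharp:.2f} cents")
```

Under these assumptions Db lies roughly 19.5 cents above C#, close to one comma, which agrees with the direction Leopold Mozart describes.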
William Yost (2013) described in detail the physical, physiological and neural aspects of hearing in his book “Fundamentals of Hearing. An Introduction”. For frequency discrimination he gave a figure showing the value of ∆f required to just discriminate between two different frequencies (Ibid., 150-151). The data were shown for five different sensation levels (in dB), and his general conclusion was that over a significant range of frequencies the Weber fraction (∆f/f) had a constant value of about 0.002. Therefore, to calculate the minimal difference in frequency ∆f that allows two adjacent tones to be distinguished from one another, we can use this relation. Yost gave an example with 500 Hz: ∆f/500 = 0.002, so ∆f = 1 Hz. Of course, he was talking about pure sinusoidal tones at a sensation level of 40 dB (the average distraction point of concentration). In terms of musical interval, this 1 Hz difference from the 500 Hz tone amounts to 3.459 cents, where one cent is a hundredth of a twelve-tone equally tempered semitone (Ibid.).
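The arithmetic behind this example can be reproduced in a few lines. The Python sketch below assumes the constant Weber fraction of 0.002 quoted above and the standard definition of the cent (1200 times the base-2 logarithm of the frequency ratio); the function names are hypothetical.

```python
import math

WEBER_FRACTION = 0.002  # value quoted from Yost for a wide range of frequencies


def just_noticeable_delta_hz(frequency_hz: float,
                             weber: float = WEBER_FRACTION) -> float:
    """Smallest discriminable frequency difference, assuming a constant Weber fraction."""
    return weber * frequency_hz


def interval_cents(f_low_hz: float, f_high_hz: float) -> float:
    """Width of the interval between two frequencies in cents."""
    return 1200.0 * math.log2(f_high_hz / f_low_hz)


delta = just_noticeable_delta_hz(500.0)
jnd_cents = interval_cents(500.0, 500.0 + delta)

print(f"delta f at 500 Hz: {delta:.1f} Hz")
print(f"as an interval: {jnd_cents:.3f} cents")
```

Because the Weber fraction is treated as constant, the resulting interval, 1200·log2(1.002), is the same for any reference frequency within the range where that assumption holds.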
This observation of Yost's confirms that our physical sensors are capable of recognizing very subtle differences in pitch. The current musical practice, however, does not employ this capability to its full extent. This raises an important question: what could be the reason for labeling everything that does not comply with the twelve-tone equal temperament as “out of tune”, without measuring how much out of tune it is? This attitude of limiting the hearing to the pitch categories of the twelve-tone equal temperament scale does not expand musicians' capabilities of pitch discrimination towards their limits. As we are going to see, this has an effect later in the aesthetic domain (see section “The categorical perception as a challenge”).
Categorical perception of the musical scale
The idea of defining perception in the musical domain as categorical comes by analogy from the linguistic domain. Siegel and Siegel (1977) write that “[…] speech processor consists of specialized linguistic feature detectors that are “tuned” to the phonemic distinction of a language […], and as a result, acoustic variations of the auditory stimulus irrelevant to meaning are filtered out. […] categorical perception, [is] a process whereby continuous acoustic variation is transformed into a discrete set of auditory events […].” The study described in the above-mentioned Siegel and Siegel article “Categorical perception of tonal intervals: musicians can't tell sharp from flat” is a pioneering one, which suggests that categorical perception extends beyond the limits of the linguistic continuum. The authors empirically demonstrated a well-established tendency among musicians with top-rated relative pitch to perceive and discriminate intervals categorically. The subjects were asked to grade 13 different intervals (from 480 c to 720 c in steps of 20 cents; see Example 1). The results suggested that both the perception of the magnitude of the interval (its width, the qualitative characteristic) and the interval labeling (fourth, tritone, fifth, the quantitative characteristic) exhibited strong categorical tendencies. The standard deviation peaked on the intervals that fell in between those of the “standardized” twelve-tone equal temperament scale. The magnitude evaluation, however, showed no intracategorical discrimination of the width, which further supported the categorization tendency (Ibid.).
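To make the stimulus set concrete, the Python sketch below lists the 13 interval widths and, for each, the nearest twelve-tone equal temperament interval together with the deviation from it. The nearest-neighbour assignment is an idealization introduced here for illustration; it is not the participants' response data, and the names used are hypothetical.

```python
# The 13 stimulus widths in cents: 480, 500, ..., 720.
stimuli = list(range(480, 721, 20))

# Labels of the three 12-tone equal temperament categories inside this range.
LABELS = {500: "fourth", 600: "tritone", 700: "fifth"}


def nearest_12edo(width_cents: int) -> tuple[int, int]:
    """Return the nearest multiple of 100 cents and the signed deviation from it."""
    nearest = 100 * round(width_cents / 100)
    return nearest, width_cents - nearest


for width in stimuli:
    category, deviation = nearest_12edo(width)
    print(f"{width} c -> {LABELS[category]} ({category} c), deviation {deviation:+d} c")
```

Only three of the thirteen stimuli (500 c, 600 c and 700 c) coincide with an equal-tempered interval; the remaining ten deviate from the nearest category by 20 or 40 cents.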
The categorical perception of musical pitch has also been discussed by John Sloboda in his book “The Musical Mind” (1999). He supported the thesis that the categorical perception of pitch is strongly influenced by the culture that formed a person's musical understanding. Using jazz as an example, he showed how representatives of one culture may be unable to apprehend another culture's musical scale. “[T]he ‘blue notes’ in the jazz scale came about through the efforts of African musicians (from a culture using pentatonic scale) to assimilate the diatonic scale of North American culture. […] the proposed diatonic scale would be C, D, F, G, and A. Users of the pentatonic scale would have no normal representation for the diatonic E and B […] Africans would have heard them as ‘mistuned’ notes falling somewhere between D and F or A and C […] producing something that would sound unstable and mistuned to the Western ears.” (Ibid., 25).
Sloboda also quoted another study dealing with categorical perception in musicians and non-musicians, by Simeon Locke and Lucia Kellar (“Categorical perception in a non linguistic mode.” Cortex 9, December 1973 (4): 355-369, quoted in Sloboda 2011, 25-27). It examined whether a chord on A would be categorized as minor or major when the frequency of its middle tone (C/C#) was varied. The frequencies of the outer tones were A = 440 Hz and E = 659 Hz; the frequencies of the middle tone lay between 523 Hz (300 c) and 554 Hz (400 c) (see Example 2). The results were that “[a]lmost all chords with middle notes above 546 Hz were heard as A major. Almost all chords with middle notes under 540 Hz were heard as A minor. The evidence suggests a categorical boundary at about 543 Hz” (Ibid., 25). (If we place the minor third at position 0 and the major third at position 1, the 543 Hz boundary lies at about 0.64. The third at this position has a width of 364.1 cents.)
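The widths quoted for this experiment can be checked by converting the frequencies to cents above the 440 Hz root. The Python sketch below does this; the normalized position it prints treats the span from 300 to 400 cents as linear, which is an assumption of this sketch, and the function name is hypothetical.

```python
import math

ROOT_HZ = 440.0  # the A of the chord


def cents_above_root(frequency_hz: float, root_hz: float = ROOT_HZ) -> float:
    """Interval between the root and the given frequency, in cents."""
    return 1200.0 * math.log2(frequency_hz / root_hz)


minor_third = cents_above_root(523.0)  # lower end of the tested range
major_third = cents_above_root(554.0)  # upper end of the tested range
boundary = cents_above_root(543.0)     # reported categorical boundary

# Position of the boundary with the equal-tempered minor third (300 c) at 0
# and the equal-tempered major third (400 c) at 1.
position = (boundary - 300.0) / 100.0

print(f"523 Hz: {minor_third:.1f} c, 554 Hz: {major_third:.1f} c")
print(f"543 Hz boundary: {boundary:.1f} c, normalized position {position:.2f}")
```

The rounded frequencies 523 Hz and 554 Hz fall about one cent short of the exact equal-tempered thirds, which is why the endpoints print as slightly under 300 and 400 cents.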
The non-musician results differed a little but also showed traits of categorical perception. As a reflection on this study, Sloboda (Idem., 27) pointed out the three milestones of the categorical perception of pitch:
A practical example of those milestones would be an attempt by a violinist trained in the classical tradition to dive into the performance of microchromatic Arabic or Eastern Asian music (Raga, Gamelan). Finding some of the Maqam subdivisions (Bayati, Husam, Saba) on the fingerboard of the fretless instrument would be a difficult task for the chromatically trained player. However, both the existence of the diatonic scale alongside the pentatonic musical paradigm of the African musicians and the existence of an irregular subsemitonic scale alongside the twelve-tone equal temperament are enough to testify that the human hearing apparatus is capable of perceiving and reproducing microchromatically subdivided sound systems.
In the studies on the categorical perception of musical pitch mentioned above, the results obtained with professional musicians as participants confirmed that classically trained musicians are taught to perceive music in the categories of the twelve-tone equal temperament. The existence of divisions of the octave that differ from the twelve-tone equal temperament suggests the hypothesis that introducing a reference interval will, over time, generate a new category with a certain qualitative magnitude around its central element. This hypothesis will be reviewed further at the end of the next section.
Pitch and the Brain
In the section “Towards a New Theory of Musical Psychology” of Stefan Koelsch's book Brain and Music (2012), and especially in the chapters “Musical Syntax” and “Musical semantics”, we can find a very well organized theory of how our previous knowledge affects our perception. According to his research the processing of musical content was divided into eight hierarchically stacked layers. These layers were further subdivided into specialized procedures, which were required to transform physical stimuli into psychological effects. The first three layers were “Music perception”, “Syntactic processing” and “Musical meaning”. The “interval analysis” and “structure building” processes were parts of the “Music perception” layer, which generated potentials used as a source for the next layer of “Syntactic processing”. Here the processes of “formation of musical expectancy” and “structure building” (now on a syntactic level) define the essential proportions of and the relations within the scale, that is, the modal structure. The results of the analysis of the syntactic structures were transferred onwards to the next level of “Musical meaning” analysis for processing the “symbolic meaning” and “intra-musical meaning” of the stimuli. Koelsch explained that each level and its processes were characterized by specific brain activity in a specific region. When he examined Event Related Potentials (ERP) with Electroencephalography (EEG), the tendency was to find correlates in the electric potentials of the brain cortex. The “Music perception” level processes at the fastest rate, with a latency below 9 ms for the FFR (Frequency-Following Response: a number of studies have recently investigated decoding of frequency information in the auditory brainstem using the FFR; the FFR can be elicited preattentively, and is thought to originate mainly from the inferior colliculus. The research findings confirmed that the correlation between the FFRs and the properties of the acoustic information is modulated by musical training.). The ERAN (Early Right Anterior Negativity) appears with a latency of about 220 ms when expectancies in harmony are not met and at about 100 ms when melodic line expectancies are not met, and the N5 (500 ms) accompanies the processing of intra-musical meaning (Koelsch 2012, 89-185).
If we agree that it is reasonable to follow Stefan Koelsch's theory, we can read further to find that the data he presents for the first three layers of processes fully support the empirical studies presented in Sloboda's book. In his chapter 9.8 “Effects on musical training” (Koelsch 2012), Koelsch even says “Both long term and short term training modulate music-syntactic processing, as shown by effect of musical training on the ERAN, the LPC/P600 and the P3. […] ERAN is larger in musicians […] and in amateur musicians compared to non-musicians.” He continues in the next paragraph “This is in line with behavioral studies showing that musicians respond faster and more acutely to music-structural irregularities […] The ERAN is presumably larger in musicians because musicians have (as an effect of the musical training) more specific representations of music-syntactic regularities and are, therefore, more sensitive to violation of these regularities […]” (Ibid., 149-151).
In the spirit of Koelsch's theory we may then briefly review the goal of pitch recognition training as an attempt to refine the boundaries between the single discrete tone categories and especially the interval categories. Its final destination would be to tune the pitch recognition matrix of a musician to the modern phenomenon of the twelve-tone equal tempered scale.
In this sense any division of the scale is learned. If we take for example the 1/6-comma meantone temperament in its 21-tone version (suggested both by Leopold Mozart (1756) and Wolfgang Mozart (1965, 1-11)), whose intonation was highly praised by XVIII century musicians, we can see that they not only mastered the different intervals, but also taught students to recognize them. Nowadays this interest has declined and the general acceptance of the twelve-tone equal temperament has become sufficient. (When all the intervals are equally out of tune compared to the perfect Pythagorean ones and the sound environment is overtaken by this general sonority, the reference of the pure-sounding fifth is to a great extent gradually lost in the sounds of the past.)
These conclusions, based on the above excerpts from Stefan Koelsch's book, confirm the previously expressed hypothesis that in order to generate a new category in the pitch perception domain a new reference interval should be introduced. The 21-tone 1/6-comma meantone temperament, of which Mozart was an advocate, is a practical example of the validity of this hypothesis.
The current microtonal state of the art
In the second half of the XX century microtonal music was revived. Harry Partch, Kyle Gann, Lou Harrison and Joseph Monzo, to mention just some of the composers and theorists, employed microtonal intervals in their compositions and also gave plenty of theoretical background for how differently they treated the subject of musical pitch. Most of them developed their own scales and some even their own instruments (like Partch).
Scales and divisions of the octave in use
Example 3. Chromatic and diatonic mapping of 55 EDO, 31 EDO, 7 EDF
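As a rough illustration of how an Equal Division of the Octave is generated and how a diatonic scale can be mapped onto it, the Python sketch below divides the 1200-cent octave into N equal steps, picks the step count closest to a pure 3/2 fifth, and builds a seven-note scale from the chain of fifths F-C-G-D-A-E-B. This chain-of-fifths mapping is an assumption made for illustration and is not necessarily the construction used in Example 3; only the EDO cases are covered, and all names are hypothetical.

```python
import math


def step_cents(divisions: int) -> float:
    """Size of one step of an N-EDO scale in cents."""
    return 1200.0 / divisions


def best_fifth_steps(divisions: int) -> int:
    """Number of EDO steps closest to a pure 3/2 fifth."""
    return round(divisions * math.log2(3 / 2))


def diatonic_steps(divisions: int) -> list[int]:
    """Step numbers of a major scale on C built from the fifths F-C-G-D-A-E-B."""
    fifth = best_fifth_steps(divisions)
    return sorted((k * fifth) % divisions for k in range(-1, 6))


for n in (12, 31, 55):
    fifth = best_fifth_steps(n)
    print(f"{n} EDO: step {step_cents(n):.2f} c, "
          f"fifth {fifth} steps = {fifth * step_cents(n):.2f} c, "
          f"diatonic steps {diatonic_steps(n)}")
```

In this sketch the fifth of 55 EDO comes out at about 698.2 cents, within a fraction of a cent of the 1/6-comma meantone fifth, while the fifth of 31 EDO is about 696.8 cents, close to that of 1/4-comma meantone.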
The categorical perception as a challenge
Having reviewed the psycho-physiological background of categorical perception and the general outlines of the theory underlying the modern ventures into the field of microtonal music, we can safely say that the main issue facing composers, performers and to some extent listeners is categorical perception. The main problem that arises is that the intervals in most of the microtonal systems in use do not correspond to the widths of the intervals in the 12EDO (twelve-tone equal temperament) scale. Here a general concept of three levels of understanding will be defined:
As we previously understood from Siegel and Siegel (op. cit.) and Sloboda (op. cit.), even the trained ear of the musician will not mind the intracategorical differences, so what should we say about untrained ears, which do not even show strong categorical discrimination of intervals? We may state that we are perfectly safe in presenting microtonal music to the general audience, although listeners might not notice its special features in the way they are intended to sound.
For a player to step into the realm of microtonal music, he must be able to produce such intervals. Electronic keyboard instruments make this possible to some extent by mapping the new intervals to existing keys. However, this generates new issues related to the technical side of music performance. For acoustic fretless instruments, and to some extent for some wind instruments, it would require additional ear training and sometimes also the development of additional playing techniques. In this sense there are numerous reported instrument extensions and modifications: Geoff Smith's Fluid Piano, Fokker's organ, Harry Partch's 43-tone instruments (diamond marimba, cloud-chamber bowls, eucal blossom, bamboo marimbas “Boo I” and “Boo II”, quadrangularis reversum etc.), the microchromatic guitar by Tolgahan Çoğulu, and electronic instruments such as the 106 EDO Linnstrument, H-Pi Instruments' microtonal Tonal Plexus, Roli's Seaboard and Wilson's Microzone u-648. Nevertheless, in all cases a deliberate period of both ear training and instrumental practice would be required.
For composers the subject becomes even more complicated, because it requires not only a period of intensive ear training but also a deeper theoretical and empirical immersion in different microtonal scales, which in turn requires a wider set of knowledge in music theory, music acoustics and mathematics.
A short example from the music of the microtonal composer Kyle Gann (USA) will be introduced. If we intend to listen to, review or research any of the seventeen parts of his cycle Hyperchromatica (Gann 2015-17), a piece for 3 microtonally tuned pianos which uses 13-limit just intonation with 33 notes per octave, we will soon be facing the challenge of the categorical perception of pitch. Perhaps this is the reason why Gann chose to specifically designate his piece “for three microtonally tuned Disklaviers (computer-driven pianos)” (Ibid.). By eliminating the human factor of the interpreter he had the freedom to treat music according to his understanding. The dedication of the piece speaks for itself: “Dedicated to all those musical performers who have ignored my music and inspired me to become self-sufficient” (Ibid.).
The categorical perception of pitch in our contemporary twelve-tone equal temperament modality is surely the main challenge that microtonal music is facing. In terms of the physical limits of perception, we generally possess the apparatus required to discriminate very narrow intervals down to about 3.459 cents (a step of roughly this size corresponds to 346 EDO). However, the psychological studies confirmed that there is a strong tendency among musicians not to use their discrimination ability to its full limits, due to training in the context of the twelve-tone equal temperament. Accordingly, the most important difficulty that stands as a challenge before microtonal music is one on a categorical level: to find a proper way to refine the boundaries of the intervals and to define new ones. This would allow the idea of microchromatics to move beyond its current functional and even mystical status of merely refining the expressive traits of one's intonation during performance. However, deeper research is required to establish whether a specially devised ear training method would enhance the perception of intervals, or at least the reception of microtonal music in general.
NOTE: All of the examples were created by the author. This text outlines the starting point of the author's project “Intonation and Temperament Systems in the XX-XXI century – Theory and Practice”, which foresees more extensive research in this section of the musical domain.