Chapter 12: Speech and Music Perception

Vocal anatomy:
Vocal folds (vocal cords) are located within the larynx. Air is vibrated as it passes
through them and is then “shaped” by the vocal tract (throat, tongue, teeth and lips).
Note also: the human larynx is “descended” compared to that of other apes, which
allows for articulation of linguistic sounds.
Studying the speech signal: Sound spectrograms
A sound spectrogram visually depicts the energy levels at different frequencies for
speech sounds. Darker areas indicate greater energy. Note: widespread energy across
15 kHz for “I”; low-frequency energy for “miii”; more widespread energy for the
hissing “sss”; then high-frequency energy for “y”, which then drops to lower
frequencies for “ooouuu”.
Neurogram: a sound spectrogram can also be understood as depicting activity along the
basilar membrane (BM) over time, moving from the base (high frequency) to the apex
(low frequency).
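The idea of a spectrogram (energy at each frequency, tracked over time) can be sketched in a few lines. This is a minimal illustration, not the method used to produce the figure in the slides: it assumes NumPy, and the function name `spectrogram`, the frame settings, and the two-tone test signal standing in for speech are all hypothetical choices.

```python
import numpy as np

def spectrogram(signal, sample_rate, frame_len=256, hop=128):
    """Short-time Fourier transform: energy per frequency for each time frame."""
    window = np.hanning(frame_len)                      # taper each frame to reduce leakage
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2    # shape: (n_frames, frame_len // 2 + 1)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    times = (np.arange(n_frames) * hop + frame_len / 2) / sample_rate
    return times, freqs, power

# Hypothetical stand-in for a speech signal: a low tone followed by a high tone.
sample_rate = 8000
t = np.arange(4000) / sample_rate                       # 0.5 s per tone
signal = np.concatenate([np.sin(2 * np.pi * 300 * t),
                         np.sin(2 * np.pi * 3000 * t)])

times, freqs, power = spectrogram(signal, sample_rate)
peak_freqs = freqs[power.argmax(axis=1)]                # strongest frequency in each frame
```

Plotting `power` with time on one axis and frequency on the other, with larger values drawn darker, gives the kind of picture described above. Here `peak_freqs` starts near 300 Hz and ends at 3000 Hz, mirroring a shift from low-frequency to high-frequency energy.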
Invariance Problem with phonemic signals
Phonemes are the basic units of sound in a language (/b/, /d/, /ch/, etc.). However, their
acoustic signal is not invariant, so how do we identify them? To answer, we must first
review the brain regions active in speech perception.
Brain areas involved in speech processing
The primary auditory cortex (PAC) does not respond preferentially to speech sounds,
but Wernicke’s area (WA, left hemisphere) does. Damage to WA produces receptive
aphasia (inability to understand speech). Damage to Broca’s area (BA) produces
production aphasia (inability to produce speech).
Motor areas of the brain are also activated in speech perception.
Analogous structures in the right hemisphere (RH) process the emotional content of
speech and identify the speaker from cadence, intonation patterns, etc., rather than
the meaning of speech. Damage to the right-hemisphere analogue of WA can produce
phonagnosia: inability to identify the speaker, while still being able to understand
speech.
Auditory pathways from PAC
Similar to the “what” and “where” pathways in the visual system, two pathways
extend from the PAC. The red, ventral pathway deals more with extracting
meaning from sounds, including phonemic sounds. The blue, dorsal pathway
deals more with processing rhythm and intonation patterns. Note: the RH is
shown in the image; a similar arrangement is also found in the left hemisphere (LH).
Back to identifying phonemes
How can we identify phonemes if they are not associated with a consistent
acoustic signal?
Two contextual factors:
Motor information: We use motor information (movements of the face and
mouth) to disambiguate.
-- Recall that the motor cortex is active in speech perception.
-- McGurk effect: motor information affects which phoneme is perceived.
2nd contextual factor: Semantic context (what makes sense?).
EX: phonemic restoration effect. If a sound is removed from a sentence, people still
claim to have heard a sound consistent with the meaning of the sentence.
The *eel was on the axle (wh) / shoe (h) / orange (p) / table (m): listeners report
hearing the sound in parentheses, depending on the final word.
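The logic of phonemic restoration (the heard fragment narrows the candidates, semantic context picks among them) can be shown with a toy lookup. This is only an illustrative sketch, not a model of actual perception: the names `LEXICON`, `ASSOCIATIONS`, and `restore`, and the simplified phoneme symbols, are assumptions made up for this example.

```python
# Hypothetical mini-lexicon: each word as a sequence of simplified phoneme symbols.
LEXICON = {
    "wheel": ("w", "iy", "l"),
    "heel":  ("h", "iy", "l"),
    "peel":  ("p", "iy", "l"),
    "meal":  ("m", "iy", "l"),
}

# Hypothetical semantic associations between a context word and candidate words.
ASSOCIATIONS = {
    "axle":   {"wheel"},
    "shoe":   {"heel"},
    "orange": {"peel"},
    "table":  {"meal"},
}

def restore(heard, context_word):
    """Return the word consistent with both the heard sounds and the context.

    `heard` is a phoneme sequence in which "*" marks the removed sound.
    """
    # Step 1: acoustic match. Keep words that agree with every sound actually heard.
    matches = [word for word, phonemes in LEXICON.items()
               if len(phonemes) == len(heard)
               and all(h == "*" or h == p for h, p in zip(heard, phonemes))]
    # Step 2: semantic context. Keep only words that make sense with the context word.
    fits = [word for word in matches if word in ASSOCIATIONS.get(context_word, set())]
    return fits[0] if fits else None

masked = ("*", "iy", "l")               # "*eel" with the first sound removed
heard_with_axle = restore(masked, "axle")
heard_with_table = restore(masked, "table")
```

The same masked input yields "wheel" with the context "axle" and "meal" with the context "table", which is the pattern in the example sentences: the acoustic signal alone is ambiguous, and the context settles it.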
Music Perception
Melody: the central aspect of music. A structured organization of pitches that creates an
identifiable pattern.
Composed of absolute intervals, relative intervals, and pitch contour.
Absolute intervals: the distance between pitches, which in the leftmost example (in
Hungarian!) is one whole step.
Relative intervals: the relationship among intervals in a group of notes. In distortion 1
(middle example), absolute intervals have been doubled (two whole steps now separate
each note), but relative intervals remain constant (1:1 across all notes, since all
notes are separated by two whole steps).
Pitch contour: the pattern of rising and falling pitches in a group of notes. In the example
below, the pitch contour is always 1<2<3. Note that in distortion 3 (rightmost example),
pitch contour is preserved but absolute and relative intervals have been distorted.
Results: Most could still identify the melody in distortion 1 (relative intervals and pitch
contour present) and distortion 3 (only pitch contour present). When none were present,
most could not identify the melody.
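The three components of melody can be computed directly from a list of pitches. The sketch below is illustrative only: it assumes pitches are written as semitone numbers (so a whole step is 2), and the three note sequences are hypothetical stand-ins for the examples in the figure, not the actual notes shown there.

```python
from fractions import Fraction

def absolute_intervals(pitches):
    """Distance in semitones between each pair of successive notes."""
    return [b - a for a, b in zip(pitches, pitches[1:])]

def relative_intervals(pitches):
    """Each interval as a ratio to the first interval (assumes the first is nonzero)."""
    intervals = absolute_intervals(pitches)
    return [Fraction(iv, intervals[0]) for iv in intervals]

def pitch_contour(pitches):
    """+1 for a rising step, -1 for a falling step, 0 for a repeated note."""
    return [(iv > 0) - (iv < 0) for iv in absolute_intervals(pitches)]

# Hypothetical three-note rising melody and two distortions of it.
original     = [60, 62, 64]   # notes one whole step apart
distortion_1 = [60, 64, 68]   # absolute intervals doubled
distortion_3 = [60, 61, 66]   # still rising, but intervals are uneven

same_relative_1 = relative_intervals(distortion_1) == relative_intervals(original)
same_contour_1  = pitch_contour(distortion_1) == pitch_contour(original)
same_relative_3 = relative_intervals(distortion_3) == relative_intervals(original)
same_contour_3  = pitch_contour(distortion_3) == pitch_contour(original)
```

Under these assumptions, distortion 1 changes the absolute intervals but keeps both the relative intervals and the contour, while distortion 3 keeps only the contour, matching the two conditions in which most listeners could still identify the melody.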
Absolute (“perfect”) pitch
Ability to correctly identify any note played. This ability seems to be
more common in young children and then tends to diminish. One
hypothesis is that it might be detrimental to language development:
language perception requires ignoring “absolute” tonal qualities in favor of
relative or contextual ones.
Brain areas associated with music perception
Areas both anterior and posterior to the PAC are important for music
perception. The RH anterior superior temporal region is sensitive to pitch
variation and pitch chroma (similarity between the same pitches at
different octaves, e.g. middle vs. high C). The superior posterior temporal
region is more sensitive to pitch height (difference between high and middle
C). The left anterior temporal region is especially sensitive to the rhythmic
qualities of melody.
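The distinction between pitch chroma and pitch height can be made concrete with note numbers. This is a minimal sketch assuming MIDI-style numbering under the common convention that middle C is note 60 (C4); the function names are hypothetical, and "high C" is taken here simply as the C one octave above middle C.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pitch_chroma(midi_note):
    """Note name regardless of octave: what middle C and high C share."""
    return NOTE_NAMES[midi_note % 12]

def pitch_height(midi_note):
    """Octave number (C4 = 60 convention): what separates middle C from high C."""
    return midi_note // 12 - 1

middle_c = 60   # assumed: C4
high_c = 72     # assumed: the C one octave above middle C

same_chroma = pitch_chroma(middle_c) == pitch_chroma(high_c)
height_difference = pitch_height(high_c) - pitch_height(middle_c)
```

Both notes have chroma "C", so `same_chroma` is true, while their heights differ by one octave. The two functions split a single note number into the two qualities that the anterior and posterior temporal regions are described as being sensitive to.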
Factors affecting emotional quality of music
Tempo and ... tend to have ... effects on ...
