Chapter 10: Perception of sound

If a tree falls in the
woods and there is no
one around, does it
make a sound?
3 requirements for
sound
1. Vibrating body:
something to create
mobile pressure
changes
Vibrating bodies
Vibrating bodies create pressure changes capable
of propagating from the source. It’s the pressure
change that serves as the auditory signal
3 requirements for sound
2. An elastic medium:
A substance capable of
propagating pressure
changes. Usually this is
air (but not always).
3 requirements for sound
3. Receptive organ: something
to translate physical pressure
changes into a perceptual
experience – usually ears.
Difference between physical
energy and sound (perceptual
experience)
Sound pressure wave
Physical properties and perceptual experience
Frequency = cycles per second; Hz
Range of human frequency perception
Note: peak sensitivity around 3.5 kHz; full range roughly 20-20,000 Hz;
upper limit drops with age.
Sound pressure wave
Amplitude: height of wave; measured in dB
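As a rough illustration of how amplitude maps onto the dB scale, the sketch below uses the standard sound pressure level formula, 20 * log10(p / p0). The reference pressure of 20 micropascals is the conventional dB SPL reference and is not stated on the slide; the function names are illustrative only.

```python
import math

# Assumption: conventional dB SPL reference pressure (20 micropascals).
P_REF = 20e-6

def db_spl(pressure_pa):
    """Convert an RMS sound pressure in pascals to decibels SPL."""
    return 20 * math.log10(pressure_pa / P_REF)

def pressure_from_db(db):
    """Inverse conversion: decibels SPL back to pressure in pascals."""
    return P_REF * 10 ** (db / 20)

# Each tenfold increase in pressure adds 20 dB.
for p in (20e-6, 20e-5, 20e-4):
    print(f"{p:.6f} Pa is about {db_spl(p):.0f} dB SPL")
```

Because the scale is logarithmic, equal steps in dB correspond to equal ratios of pressure, not equal differences.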
Sound pressure wave
Overtones: no pressure wave
occurs in isolation – overtones
are other frequencies that
occur along with the fundamental
frequency (the frequency that
accounts for pitch perception)
and that affect the “character” of
sound perception: timbre. For
most musical instruments,
overtones are harmonics
(multiples of the fundamental).
Note: on graphs instruments
are not playing exactly the
same fundamental
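Since harmonics are described above as multiples of the fundamental, a minimal sketch can list them for a given note. The 220 Hz fundamental is an arbitrary example value, and the function name is hypothetical.

```python
def harmonic_series(fundamental_hz, count):
    """Return the first `count` harmonics (integer multiples) of a fundamental.

    The first entry is the fundamental itself; the remaining entries are
    the harmonic overtones.
    """
    return [fundamental_hz * n for n in range(1, count + 1)]

# Example: a 220 Hz fundamental (arbitrary choice for illustration).
print(harmonic_series(220, 4))  # [220, 440, 660, 880]
```

Two instruments playing the same fundamental share these frequencies; what differs is the relative strength of each harmonic, which is what timbre reflects.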
Behavior of sound waves
While sound pressure waves are reflected and
absorbed variously by different surfaces, like light
waves, they can also travel around and through
surfaces, unlike light waves. This makes them
much more difficult to block out completely, hence the
ability to hear something even when it is not seen.
Echoes: reflected sound - different environments have
different echo characteristics or acoustics, hence the
sound quality of the environment varies. Generally
speaking the harder surfaces tend to reflect more
sound, while more porous surfaces tend to absorb
more sound.
Speed of sound: the speed with which a pressure wave
travels through the medium is determined by the
elasticity (stiffness) and density of the medium. Air is
the most typical medium for sound, and in air the speed
of sound is about 340 meters per second. But sound waves
actually travel faster through water, ground, and even
steel.
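To make the comparison between media concrete, the sketch below computes how long a sound takes to cover one kilometer. The air value comes from the text above; the water and steel speeds are approximate reference values assumed for illustration and vary with temperature and composition.

```python
# Speeds in meters per second. Air is from the notes; water and steel
# are approximate assumed values, not taken from the original slides.
SPEED_M_PER_S = {
    "air": 340.0,
    "water": 1480.0,
    "steel": 5900.0,
}

def travel_time_s(distance_m, medium):
    """Time in seconds for a sound wave to cover distance_m in the medium."""
    return distance_m / SPEED_M_PER_S[medium]

for medium in SPEED_M_PER_S:
    print(f"{medium}: {travel_time_s(1000, medium):.2f} s per kilometer")
```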
Impedance: the degree to which the medium resists
the propagation of the sound wave. Liquids and solids
tend to propagate sound waves faster than air, but they
also have higher impedance, so a wave passing from air
into such a medium loses much of its amplitude at the
boundary.
Receptive organ: Ear
• Ear: 3 major parts; outer, middle, inner
Receptive organ: Ear
1) The outer ear: structures
a) Pinna: fleshy, cartilaginous, structure which extrudes from head
Pinna is important for helping to funnel sound further into ear, and as gross sound
localizer.
b) Auditory canal: tube structure which directs sound inward to middle ear. The
canal has a resonance frequency of around 3,000 Hz, which means that it tends to
vibrate along with frequencies near 3,000 Hz and therefore amplify those sounds.
Interestingly enough, only a modest number of speech sounds are in
the range of 3,000 Hz; most are in the range of 1,000-2,000 Hz. High-pitched
screams, however, are typically around this frequency.
Middle ear
Tympanic membrane: eardrum (sometimes included with outer
ear) thin oval shaped membrane which vibrates in response to
incoming wave. Tympanic membrane is highly sensitive, but can
often absorb punctures and continue functioning. Main job is to
vibrate ossicles.
Middle ear
a) ossicles (malleus, incus, stapes). the
tiny bones of the middle ear which
vibrate in response to vibrating of
tympanic membrane.
Major purpose is to amplify the sound
wave to help reduce effects of increased
impedance of cochlear fluid.
Impedance matching device: about 4dB
recovered from hinge design of ossicles,
about 23dB from “funneling” from
tympanic membrane to oval window
b) oval window: connected to stapes,
vibrates in response to stapes and
propagates sound wave to inner ear.
Acoustic reflex: loud, low sounds trigger
stiffening of middle ear muscles, restricting
movement of the ossicles. Not effective for high
pitches.
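The impedance-matching figures quoted for the ossicles above (about 4 dB from the hinge design and about 23 dB from funneling) can be added and converted to a pressure ratio. This is a minimal sketch assuming the standard relation between decibels and pressure ratio, ratio = 10^(dB/20); the constant and function names are illustrative.

```python
def db_to_pressure_ratio(db):
    """Convert a gain in decibels to the equivalent pressure ratio."""
    return 10 ** (db / 20)

HINGE_DB = 4    # from the notes: hinge (lever) design of the ossicles
FUNNEL_DB = 23  # from the notes: tympanic membrane to oval window funneling

# Gains expressed in dB add; the corresponding pressure ratios multiply.
total_db = HINGE_DB + FUNNEL_DB
print(f"total gain: {total_db} dB")
print(f"pressure ratio: about {db_to_pressure_ratio(total_db):.1f}x")
```

On these figures the middle ear recovers roughly 27 dB, which corresponds to a pressure increase of a little over twentyfold at the oval window.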
Inner ear
• Composed of semi-circular canals (vestibular sense – body posture,
balance, etc.) and cochlea. Cochlea is main structure for auditory
info processing
Cochlea
Three main structures:
1) Vestibular canal: topmost
section of cochlea
2) Tympanic canal: bottom most
section of cochlea
3) Cochlear duct: middle canal of
cochlea, filled with different
type of fluid than tympanic
and vestibular canals. Mixing
of fluids can impair hearing.
Also: Round window: small
elastic structure covering a
small opening between
tympanic canal and middle
ear. This structure helps to
equalize pressure from
propagated wave started at
oval window.
Cochlea
Basilar membrane:
membrane separating
tympanic canal from
cochlear duct.
Organ of Corti: auditory
receptor organ which rests
on basilar membrane inside
cochlear duct. Is to ear
what retina is to eye.
Tectorial membrane: the
membrane that extends up
from Reissner's membrane
(the diagonal membrane
which separates the
vestibular canal from the
cochlear duct) and arches
over and contacts some of
the Organ of Corti hair cells.
Cochlea
Organ of Corti hair cells:
there are two types:
inner and outer. Inner
cells are fewer in number
(4,500) and are situated
near where the tectorial
membrane attaches to
Reissner's membrane.
Inner cells are not directly
connected to tectorial
membrane.
Outer cells are greater in
number (15,500),
situated more centrally
on Organ of Corti, and
are connected to tectorial
membrane. However,
outers have very limited
connections to auditory
nerve (95% of auditory
nerve connected to IHC)
Action in Cochlea
Wave enters from the piston-like action of stapes moving in and out of oval
window. Wave travels through cochlear fluid and displaces basilar membrane in
cochlear duct. The waving motion of basilar membrane causes tectorial
membrane to displace in opposite direction of basilar membrane and get "pulled
and tugged" by connections to outer hair cells. This "pulling and tugging" action
amplifies the movement of fluid in cochlear duct which causes displacement of
inner hair cells, which have many direct connections to auditory nerve.
Theories of pitch perception: Temporal theory
This theory (also called frequency theory) states that the entire
basilar membrane vibrates in consonance with the frequency of the
wave entering the cochlea. This idea was subsequently proved
incorrect as it was found that the differences in the width and
thickness along the length of basilar membrane made it physically
impossible for it to vibrate as frequency theory predicts. However,
it was found that individual auditory nerve fibers could match low
frequency vibrations, and that groups of fibers could volley to match
frequencies up to about 4,000 Hz.
Theories of pitch perception: Place theory
First proposed by Hermann von Helmholtz, who noted that the basilar membrane was narrow at the
base and wider at the apex. Helmholtz believed that this meant that the basilar membrane was
composed of separate fibers which resonated at different frequencies along the basilar membrane,
like a piano keyboard.
Place theory found support in studies by von Békésy, who constructed a replica of the basilar
membrane to study the behavior of the waves inside the cochlea. Békésy found that waves of
different frequencies peaked at different places along the basilar membrane, with high frequencies
nearer the base and low frequencies nearer the apex.
However, Békésy also found that localizing the place of maximal stimulation was much more
precise for high than for low frequencies.
Duplicity theory
A combination of frequency and place operate to
explain the range of human pitch perception -- and
varying sensitivities to pitch.
20 to 500 Hz -- frequency coding only
500 to 4,000 Hz -- frequency and place coding
4,000 to 20,000 Hz -- place coding only
Note that humans have greatest sensitivity to
frequencies from around 1,000 - 3,000 Hz, the range
which comprises most of human speech.
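The three ranges above can be written as a small lookup. This is only a sketch: the slide's ranges share their endpoints, so assigning exactly 500 Hz and 4,000 Hz to the combined range is an assumption made here, and the function name is hypothetical.

```python
def pitch_coding(frequency_hz):
    """Return the coding mechanism the notes assign to a frequency.

    Assumption: the shared endpoints (500 Hz and 4,000 Hz) are placed in
    the combined frequency-and-place range.
    """
    if not 20 <= frequency_hz <= 20000:
        return "outside nominal hearing range"
    if frequency_hz < 500:
        return "frequency coding only"
    if frequency_hz <= 4000:
        return "frequency and place coding"
    return "place coding only"

for f in (100, 1000, 3000, 10000):
    print(f, "Hz:", pitch_coding(f))
```

The 1,000 to 3,000 Hz band of greatest sensitivity falls entirely inside the range where both mechanisms operate.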
Auditory nerve
Made up of about 30,000 individual fibers, mostly emanating from IHC.
Nerve fibers differ in spontaneous activity (baseline firing rate)
depending on where they make contact with IHC.
[Figure: fibers with high, medium, and low spontaneous activity contacting an IHC; OHC also labeled]
Frequency tuned auditory nerve fibers
Suppose we present different frequencies at minimal dB level to
individual nerve fibers.
Frequency preference corresponds to location on basilar membrane
But what about loudness perception?
Loudness perception: Where does fiber connect to IHC?
Graph B shows two fibers
from the same location on the basilar
membrane (therefore same
frequency preference) but with
different spontaneous activity
levels. Each is presented with
its preferred frequency at
different dB levels.
The fiber with higher spontaneous
activity responds to lower
intensity (lower threshold)
but has a lower saturation
point. The fiber with lower
spontaneous activity (darker line)
responds “later” but
saturates later as well.
Auditory processing beyond cochlea
At left and right cochlear nuclei auditory processing is monaural, but past
that point (superior olives, inferior colliculi, etc.) processing becomes binaural. Thus,
“two-eared” cues for sound localization can be exploited.
Sound localization: Cue 1 – interaural time differences
[Figure: L ear and R ear inputs to a binaural cell via a direct line and a delay line]
• Inter-aural time differences:
the difference in arrival time of
sound wave at two ears. Sound
arrives at nearer ear first
(when not perfectly at midline).
Probably coded by binaural
cells with variable time delays
(delay lines) built into inputs
from nearer ear (a).
It appears that time
differences are more effective
cue for lower frequencies,
while amplitude differences
are more effective for higher
frequencies.
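The size of the interaural time difference can be estimated with a simplified geometric model: the extra path to the far ear is taken as the ear separation times the sine of the source angle. This is a sketch under stated assumptions. The 0.18 m ear separation is an assumed typical value, the model ignores the curved path around the head, and the names are illustrative.

```python
import math

SPEED_OF_SOUND = 340.0  # m/s in air, the value used in these notes
EAR_SEPARATION = 0.18   # m, assumed typical distance between the ears

def itd_seconds(azimuth_deg):
    """Approximate interaural time difference for a distant source.

    azimuth_deg is the angle from the midline: 0 is straight ahead,
    90 is directly opposite one ear. Straight-line path model only.
    """
    extra_path_m = EAR_SEPARATION * math.sin(math.radians(azimuth_deg))
    return extra_path_m / SPEED_OF_SOUND

for azimuth in (0, 30, 90):
    print(f"{azimuth} degrees: {itd_seconds(azimuth) * 1e6:.0f} microseconds")
```

Even at its maximum the difference is well under a millisecond, which gives a sense of the timing precision the binaural cells need.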
Sound localization: Cue 2 – interaural intensity differences
The difference in
loudness at the two ears
created by shadowing
effects of head and
pinnas, as well as
differing distances of
sound producing source
from two ears.
Shadowing effect is far
less for lower
frequencies, whose wavelengths are
often large enough to go around
head unblocked.
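One way to see why low frequencies escape the head shadow is to compare wavelength (speed of sound divided by frequency) with the width of the head. A minimal sketch follows, assuming a head width of about 0.18 m, which is not given in the notes; the example frequencies are arbitrary.

```python
SPEED_OF_SOUND = 340.0  # m/s in air, the value used in these notes
HEAD_WIDTH = 0.18       # m, assumed typical head width

def wavelength_m(frequency_hz):
    """Wavelength in meters of a sound of the given frequency in air."""
    return SPEED_OF_SOUND / frequency_hz

for f in (200, 2000, 8000):
    wl = wavelength_m(f)
    relation = "longer than" if wl > HEAD_WIDTH else "shorter than"
    print(f"{f} Hz: wavelength {wl:.3f} m, {relation} the head width")
```

Waves much longer than the head bend around it with little loss, so the intensity difference between the ears becomes a useful cue only at higher frequencies.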
Auditory cortex
• Tonotopic organization with magnification of mid-range
frequencies. Beginning of processing for more meaningful
and categorical (speech vs. dog bark) aspects of audition.
