Information Density and Word Order

Why are some word orders more
common than others?
• In the majority of languages (with dominant
word order) subjects precede objects
• (SOV, SVO) > VSO > (VOS, OVS) > OSV
Why are some word orders more
common than others?
• Genetically encoded bias?
• Single common ancestor (SOV)?
• General linguistic principles
– Theme-first
– Verb-object bonding
– Animate-first
• Great, but why do these principles work?
Uniform information density hypothesis
• Constant information transmission rate
– Slower for unexpected, high entropy content
– Faster for predictable, low entropy content
• The basic word order of a language influences
the average transmission rate
• Thus languages that are closer to the UID ideal
will be more common compared to others
further away from it
Word-order model
• Simple world with
– 13 objects (O)
• 5 people
• 8 food/drink items
– 2 relations (R)
• eat/drink
• Events in this world consist of one relation and
two objects
– (o1, r, o2)
• And appear with a certain probability P
Word-order model
• Base entropy (the initial state of the observer
before words are spoken)
• After each word, observers adjust their
expectations for the following ones, reaching
an entropy of zero after the third word of the utterance
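The entropy trajectory described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' model: the four-event world and its probabilities are invented here (the slides use 13 objects and 2 relations), and the names `EVENTS`, `ROLE_INDEX`, `entropy` and `entropy_trajectory` are hypothetical.

```python
import math

# Hypothetical toy distribution over (agent, relation, patient) events.
# These probabilities are made up for illustration only.
EVENTS = {
    ("alice", "eat", "apple"): 0.4,
    ("alice", "drink", "water"): 0.1,
    ("bob", "eat", "apple"): 0.2,
    ("bob", "drink", "water"): 0.3,
}

# Position of each role inside an (agent, relation, patient) tuple.
ROLE_INDEX = {"S": 0, "V": 1, "O": 2}


def entropy(probs):
    """Shannon entropy, in bits, of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def entropy_trajectory(event, order, events=EVENTS):
    """Return [H0, H1, H2, H3] for `event` spoken in word order `order`.

    Hn is the entropy over the events still consistent with the first
    n words; H0 is the base entropy before anything is said.
    """
    trajectory = []
    consistent = dict(events)
    for n in range(4):
        total = sum(consistent.values())
        trajectory.append(entropy(p / total for p in consistent.values()))
        if n < 3:
            # Hearing word n rules out every event that disagrees on that role.
            idx = ROLE_INDEX[order[n]]
            consistent = {e: p for e, p in consistent.items()
                          if e[idx] == event[idx]}
    return trajectory
```

Calling `entropy_trajectory(("alice", "eat", "apple"), "SVO")` and then the same event with `"VSO"` shows how the same event yields different trajectories under different word orders.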
Word-order model
• Each event has an information profile
I1 = H0 − H1, I2 = H1 − H2, I3 = H2
• Where Hn is the entropy remaining after the n-th word of each event (H3 = 0)
• UID suggests a straight line from base entropy
to zero entropy such that each word conveys
1/3 of the total information
Word-order model
• UID deviation score
• Deviation of toy-world events from the “ideal
information profile” according to UID
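The slide does not preserve the formula for the deviation score, so the sketch below rests on an assumption: the score is taken to be 3/4 times the summed absolute difference between each word's share of the total information and the ideal share of 1/3. With that scaling, 0 means perfectly uniform and 1 means all information sits in a single word. The function names are hypothetical.

```python
def information_profile(h0, h1, h2):
    """Information carried by each of the three words.

    h0 is the base entropy, h1 and h2 the entropies after the first and
    second word; entropy after the third word is taken to be zero.
    """
    return (h0 - h1, h1 - h2, h2)


def uid_deviation(profile):
    """Assumed deviation score: 0.75 * sum_i |I_i / H0 - 1/3|.

    The 0.75 factor is an assumption that rescales the score to [0, 1].
    """
    total = sum(profile)  # the three word-wise amounts add up to H0
    return 0.75 * sum(abs(i / total - 1 / 3) for i in profile)
```

Under this assumed definition, a profile such as `(1.0, 1.0, 1.0)` scores 0, while `(3.0, 0.0, 0.0)`, where the first word carries everything, scores 1.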
Corpus study
• Child-directed speech (English and Japanese)
• Utterances involving singly transitive verbs
• Ignored adjectives, plurality, tense, etc.
• English: VSO (0.38), SVO (0.41), VOS (0.48),
SOV (0.64), OSV (0.78), OVS (0.79)
• Japanese: SVO (0.66), VSO (0.71), SOV (0.72),
VOS (0.72), OSV (0.82), OVS (0.83)
• Languages must be optimal with respect to the
frequencies of events in the real world
• Judgement tasks for pairs of sentences (which
one is more probable?)
• VSO (0.17), SVO (0.18), VOS (0.20), SOV (0.23),
OVS (0.23), OSV (0.24).
• Object-first word orders are rare
• Object-first word orders have the least uniform
information density (first word carries too much information)
• SOV is not as compatible with the UID as it is
frequent in real languages – perhaps due to other
important factors beside UID
• The theme-first (TFP) and animate-first (AFP) principles
favor SOV, SVO (highest ranked in the results) and VSO –
perhaps UID provides some justification for at least some
word-order principles
• Findings are consistent with a weaker hypothesis:
word order is optimal with respect to how often
speakers choose to discuss events (not how often
these events really occur)
• UID may not explain all of the word-order
rankings, but it does explain several aspects of
the empirical distribution of word orders
A Noisy Channel Account of
Crosslinguistic Word Order Variation
• In 96.3% of studied languages S precedes O
• SVO (English) and SOV (Japanese) are more
prevalent than VSO
• People construct sentences from an agent
perspective – why SVO/SOV then?
• Innate universal grammar – independent of
communicative or performance factors
• Communicative-based explanation
• SOV is the default for human language
– Preference for S to precede O
– Preference for the V to appear at the end of the sentence
• SVO arises from SOV as a result of
communication/memory pressures that
sometimes outweigh the second preference
Shannon’s communication theory
• Comprehension and production operate via a
noisy channel
• Speakers are under constraints to choose
utterances that will ensure maximal meaning
recoverability by the listener
• When does word order affect how easily meaning
can be recovered?
– The girl kicks the ball. (people should adhere to SOV)
– The girl kicks the boy.
(potential confusion, perhaps resolved by the position of
the noun relative to the verb)
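One way to make the reversibility argument concrete is a small sketch in which noise deletes one noun and we ask whether the degraded utterance could equally well have come from the role-reversed event. This is an illustrative simplification, not the study's model: the single-deletion noise, the animacy set `ANIMATE`, and all function names are assumptions introduced here.

```python
# Hypothetical set of nouns that can plausibly act as agents.
ANIMATE = {"girl", "boy"}


def utterance(agent, action, patient, order):
    """Arrange the three words according to a word order such as 'SOV'."""
    words = {"S": agent, "V": action, "O": patient}
    return tuple(words[role] for role in order)


def degraded_forms(agent, action, patient, order):
    """All strings a listener might receive if noise deletes exactly one noun."""
    full = utterance(agent, action, patient, order)
    return {tuple(w for w in full if w != lost) for lost in (agent, patient)}


def is_confusable(agent, action, patient, order, animate=ANIMATE):
    """True if some degraded form also fits the role-reversed event."""
    if patient not in animate:
        # Non-reversible event: the reversed meaning is implausible.
        return False
    intended = degraded_forms(agent, action, patient, order)
    swapped = degraded_forms(patient, action, agent, order)
    return bool(intended & swapped)
```

In this sketch, `is_confusable("girl", "kicks", "boy", "SOV")` is true because "girl kicks" and "boy kicks" are possible remnants of both meanings, whereas with `"SVO"` the surviving noun's side of the verb keeps the two meanings apart. For "The girl kicks the ball" the function returns false under either order.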
• Study investigates whether gestured word order
across languages (English: SVO; Japanese and Korean: SOV) depends on the semantic reversibility of the event
– Initial bias to SOV
– Initial bias to native language
– Communicative or memory pressures
• English
– Shift to SVO (second and third factors)
• Japanese & Korean
– Shift to SVO (only due to the third factor)
• Brief silent animations of
intransitive/transitive events
– First verbally described the animations
– Then hand-gestured the meanings of the events
• Verbal and gesture responses were coded for
the relative position of the agent, action, and patient
Experiment 1
• Animate/inanimate patients (reversible or
non-reversible sentences)
• More SVO word orders should be produced if the event is reversible
• Results – uniformly SVO for verbal responses
– Gestured S before O for animate patients
– Gestured V before O for human patients (as
– Overwhelmingly gestured SOV for non-reversible events
Experiment 1&2 – Japanese/Korean
• English participants’ results can be explained
without resorting to noisy-channel hypothesis
– Participants may shift from SOV to native (SVO)
due to increased ambiguity in reversible events
• Thus, tested participants with an SOV native language
– Expected shift to SVO in reversible events
• Experiment 2 – used more complex structures
The old woman says that the fireman kicks the girl
Experiment 1&2 – Japanese/Korean
• If participants use native word-order (SOV)
– Then they should gesture both levels of
embedded events with the same order:
S1 [S2O2V2] V1
• In case of reversible events SOV creates
maximal potential confusion
– Then they should gesture using SVO:
S1 V1 [S2V2O2]
Experiment 1&2 – Japanese/Korean
• Exp 1 results – native language word-order
– J&K speakers verbalized patient before action (100%)
– Gestured patient before action in both animate and
inanimate patients
• Exp 2 results – shift to SVO
– J speakers never verbalized SVO; K speakers rarely
– Both J&K speakers almost always gestured top-level verb in
2nd position between the top-level subject and the
embedded subject
– In the embedded clause patients were gestured before the
action almost always, but more often in non-reversible
events (both for J&K speakers)
• Results predicted by noisy-channel but not by the
combination of SOV default and native-language order
Experiment 3
• Alternative explanation of previous results
– Minimizing syntactic dependency distances
– Number of words between a syntactic head (verb)
and its dependents (subject and object)
– Shorter dependencies are easier
• Shift from SOV to SVO given that SVO allows
for shorter dependency distances
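The dependency-distance idea can be made concrete by counting the words that intervene between the verb and each of its dependents. The sketch below is only an illustration: the function name, the example word lists, and the placement of the modifier before its noun are assumptions, and dependencies between a noun and its modifiers are ignored.

```python
def intervening_words(words, head, dependents):
    """Total number of words between `head` and each of its dependents."""
    head_pos = words.index(head)
    return sum(abs(words.index(d) - head_pos) - 1 for d in dependents)


# Simple three-word event in both orders.
sov = ["girl", "ball", "kick"]
svo = ["girl", "kick", "ball"]

# Same event with one patient modifier (assumed to precede its noun).
sov_long = ["girl", "striped", "ball", "kick"]
svo_long = ["girl", "kick", "striped", "ball"]

for label, words in [("SOV", sov), ("SVO", svo),
                     ("SOV + modifier", sov_long), ("SVO + modifier", svo_long)]:
    print(label, intervening_words(words, "kick", ["girl", "ball"]))
```

In SOV the subject is separated from the verb by the whole patient description, which is why this account predicts more SVO as the patient description grows.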
Experiment 3 - method
• Animations of a boy and a girl interacting with one of a set
of objects:
– Circle/star/heart which was either
– Spotted/striped (surface); in a box/pail (container);
wearing a top/witch’s hat (headwear)
– Giving/putting/intransitive event
• Participants were to gesture each event and the features of
the object
• If participants are sensitive to the distance between agent and verb,
SVO gesture order should be more frequent for longer patient descriptions
• No such shift predicted by noisy channel – patient is not a
possible agent of the verb, adding modifiers will not affect
the recoverability of who is doing what to whom
Experiment 3 - results
• Gestured patient before action for most of
• Verbalized action before patient for most of
• Even with long productions, participants still gestured
patient before action, consistent with the
noisy-channel hypothesis and not with the
dependency-distance hypothesis
• English speakers have a strong SOV preference for nonreversible events even when the inanimate patient has
up to 3 features to be gestured
• SOV seems to be the preferred word order in human language
• For reversible events the preference for SOV
disappears in favor of SVO
• Although SOV-natives gesture SOV in simple events,
they revert to SVO for more complex ones
• This shift to SVO occurs in order to maximize meaning recoverability
• Case marking is often used in SOV
– Mitigates the confusability of subject and object, helping to
retain the default SOV
• If no case marking is used, then SVO shift
• Large majority of SOV languages are case marked, whereas few of
SVO are
• Used location in space as possible case marking in the experiments
– Of the case-marked gestures most had SOV order
• Animacy-dependent case marking
– Many languages mark only animate direct objects
• Non-SVO languages have more word-order flexibility than SVO languages
– They contain other mechanisms for disambiguation
– So languages with fixed word order are mostly SVO
• No need for sophisticated innate machinery to
explain word-order variation
• Many aspects of crosslinguistic word-order
variance are easily explained by
communicative or memory pressures
