Slide - Max Planck Institute for Informatics

Semantic Search:
from Names and Phrases
to Entities and Relations
Gerhard Weikum
Max Planck Institute for Informatics
& Saarland University
http://www.mpi-inf.mpg.de/~weikum/
Acknowledgements
Big Picture: Opportunities Now !
Very Large Knowledge Bases
Entity Linkage
Web of Data
Web of Users & Contents
KB Population
Disambiguation
Semantic Info Extraction
Docs
Semantic Authoring
Big Picture: Opportunities Now !
This talk: How Do We Search this World of Knowledge, Data, and Text (and cope with ambiguity)
for Knowledge Harvesting see talks at College de France and at VLDB School in Kunming
Very Large Knowledge Bases
Entity Linkage
Disambiguation
Docs
Semantic Info Extraction
Semantic Authoring
Web of Data
Web of Users & Contents
KB Population
Web of Data: RDF, Tables, Microdata
30 billion SPO triples (RDF) and growing
SUMO
ReadTheWeb
Cyc
YAGO
TextRunner/
ReVerb
ConceptNet 5
BabelNet
WikiTaxonomy/
WikiNet
http://richard.cyganiak.de/2007/10/lod/lod-datasets_2011-09-19_colored.png
Web of Data: RDF, Tables, Microdata
30 billion SPO triples (RDF) and growing
DBpedia:
• 4M entities in 250 classes
• 500M facts for 6000 properties
• live updates
YAGO:
• 10M entities in 350K classes
• 120M facts for 100 relations
• 100 languages
• 95% accuracy
Freebase:
• 25M entities in 2000 topics
• 100M facts for 4000 properties
• powers Google knowledge graph
Example triples:
Ennio_Morricone type composer
Ennio_Morricone type GrammyAwardWinner
composer subclassOf musician
Ennio_Morricone bornIn Rome
Rome locatedIn Italy
Ennio_Morricone created Ecstasy_of_Gold
Ennio_Morricone wroteMusicFor The_Good,_the_Bad,_and_the_Ugly
Sergio_Leone directed The_Good,_the_Bad,_and_the_Ugly
http://richard.cyganiak.de/2007/10/lod/lod-datasets_2011-09-19_colored.png
Linked RDF Triples on the Web
yago/wordnet: Artist109812338
yago/wordnet:Actor109765278
yago/wikicategory:ItalianComposer
imdb.com/name/nm0910607/
dbpedia.org/resource/Ennio_Morricone
imdb.com/title/tt0361748/
dbpedia.org/resource/Rome
rdf.freebase.com/ns/en.rome
data.nytimes.com/51688803696189142301
geonames.org/3169070/roma
500 million links
N 41° 54' 10'' E 12° 29' 2''
Embedding (RDF) Microdata in HTML Pages
<html … May 2, 2011
Supported by RDFa
and microformats
like schema.org
May 2, 2011
Maestro Morricone will perform
on the stage of the Smetana Hall
to conduct the Czech National
Symphony Orchestra and Choir.
The concert will feature both
Classical compositions and
soundtracks such as
the Ecstasy of Gold.
In programme two concerts for
July 14th and 15th.
<div typeof=event:music>
<span id="Maestro_Morricone">
Maestro Morricone
<a rel="sameAs"
resource="dbpedia/Ennio_Morricone"/>
</span>
…
<span property = "event:location" >
Smetana Hall </span>
…
<span property="rdf:type"
resource="yago:performance">
The concert </span> will feature
…
<span property="event:date"
content="14-07-2011"></span>
July 1
</div>
Outline

Opportunities Now
Semantic Search Today
Entity Name Disambiguation
Question Answering
Disambiguation Reloaded
Wrap-Up
Semantic Search Today (1)
Semantic Search Today (2)
Select ?x Where {
?x type composer [western movie] .
?x wasBornIn ?y . ?y locatedIn Europe . }
Semantic Search Today (2)
Select ?x Where {
?x type composer .
?x participatedIn ?y .
?y type western_film . }
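A structured query like the ones above is a conjunction of triple patterns that share variables. The following minimal Python sketch shows how such patterns can be joined over a handful of SPO facts taken from the earlier YAGO example slide. The helper names `match` and `solve` are illustrative, the query is adapted to the facts available here (`bornIn`, `locatedIn Italy`), and a real engine would use indexes rather than nested scans.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]

# Facts copied from the example triples on the "Web of Data" slide.
FACTS: List[Triple] = [
    ("Ennio_Morricone", "type", "composer"),
    ("Ennio_Morricone", "bornIn", "Rome"),
    ("Rome", "locatedIn", "Italy"),
    ("Sergio_Leone", "directed", "The_Good,_the_Bad,_and_the_Ugly"),
]


def match(pattern: Triple, fact: Triple, binding: Binding) -> Optional[Binding]:
    """Unify one triple pattern with one fact; return the extended binding or None."""
    extended = dict(binding)
    for term, value in zip(pattern, fact):
        if term.startswith("?"):              # variable: bind it or check consistency
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:                   # constant: must match exactly
            return None
    return extended


def solve(patterns: List[Triple], facts: List[Triple],
          binding: Optional[Binding] = None) -> Iterator[Binding]:
    """Join the patterns left to right by backtracking over the facts."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    for fact in facts:
        extended = match(patterns[0], fact, binding)
        if extended is not None:
            yield from solve(patterns[1:], facts, extended)


# Hypothetical query in the style of the slide: composers born in a place located in Italy.
QUERY: List[Triple] = [
    ("?x", "type", "composer"),
    ("?x", "bornIn", "?y"),
    ("?y", "locatedIn", "Italy"),
]
answers = [b["?x"] for b in solve(QUERY, FACTS)]
print(answers)
```

The sketch only covers exact matching; the bracketed keyword conditions on the slide (e.g. `[western movie]`) would need an additional text-matching step that is not modeled here.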
Semantic Search Today (3)
Semantic Search Today (4)
Key problem in
semantic search:
diversity and ambiguity
of names and phrases !
Outline

Opportunities Now

Semantic Search Today
Entity Name Disambiguation
Question Answering
Disambiguation Reloaded
Wrap-Up
Three Different NLP Problems
Harry fought with you know who. He defeats the dark lord.
Candidate entities: Dirty Harry, Harry Potter, Prince Harry of England, The Who (band), Lord Voldemort
Three NLP tasks:
1) named-entity detection: segment & label by HMM or CRF
(e.g. Stanford NER tagger)
2) co-reference resolution: link to preceding NP
(trained classifier over linguistic features)
3) named-entity disambiguation:
map each mention (name) to canonical entity (entry in KB)
Named Entity Disambiguation
Mentions (surface names), in text:
Sergio talked to Ennio about Eli's role in the Ecstasy scene. This sequence on the graveyard was a highlight in Sergio's trilogy of western films.
?
Entities (meanings):
Eli (bible), Eli Wallach, Benny Goodman, Benny Andersson, Ecstasy (drug), Ecstasy of Gold, Star Wars Trilogy, Lord of the Rings, Dollars Trilogy
KB:
Sergio means Sergio_Leone
Sergio means Serge_Gainsbourg
Ennio means Ennio_Antonelli
Ennio means Ennio_Morricone
Eli means Eli_(bible)
Eli means ExtremeLightInfrastructure
Eli means Eli_Wallach
Ecstasy means Ecstasy_(drug)
Ecstasy means Ecstasy_of_Gold
trilogy means Star_Wars_Trilogy
trilogy means Lord_of_the_Rings
trilogy means Dollars_Trilogy
Mention-Entity Graph
weighted undirected graph with two types of nodes
Mentions, in text:
Sergio talked to Ennio about Eli's role in the Ecstasy scene. This sequence on the graveyard was a highlight in Sergio's trilogy of western films.
Candidate entities:
Eli (bible), Eli Wallach, Ecstasy (drug), Ecstasy of Gold, Star Wars, Lord of the Rings, Dollars Trilogy
Contexts as bag-of-words or language model: words, bigrams, phrases
Popularity (m,e):
• freq(e|m)
• length(e)
• #links(e)
Similarity (m,e):
• cos/Dice/KL (context(m), context(e))
KB+Stats
Mention-Entity Graph
weighted undirected graph with two types of nodes
Mentions, in text:
Sergio talked to Ennio about Eli's role in the Ecstasy scene. This sequence on the graveyard was a highlight in Sergio's trilogy of western films.
Candidate entities:
Eli (bible), Eli Wallach, Ecstasy (drug), Ecstasy of Gold, Star Wars, Lord of the Rings, Dollars Trilogy
joint mapping
Popularity (m,e):
• freq(e|m)
• length(e)
• #links(e)
Similarity (m,e):
• cos/Dice/KL (context(m), context(e))
KB+Stats
Mention-Entity Graph
weighted undirected graph with two types of nodes
Mentions, in text:
Sergio talked to Ennio about Eli's role in the Ecstasy scene. This sequence on the graveyard was a highlight in Sergio's trilogy of western films.
Candidate entities:
Eli (bible), Eli Wallach, Ecstasy (drug), Ecstasy of Gold, Star Wars, Lord of the Rings, Dollars Trilogy
Popularity (m,e):
• freq(m,e|m)
• length(e)
• #links(e)
Similarity (m,e):
• cos/Dice/KL (context(m), context(e))
Coherence (e,e'):
• dist(types)
• overlap(links)
• overlap (anchor words)
KB+Stats
Mention-Entity Graph
weighted undirected graph with two types of nodes
Mentions, in text:
Sergio talked to Ennio about Eli's role in the Ecstasy scene. This sequence on the graveyard was a highlight in Sergio's trilogy of western films.
Candidate entities:
Eli (bible), Eli Wallach, Ecstasy (drug), Ecstasy of Gold, Star Wars, Lord of the Rings, Dollars Trilogy
Types of selected entities:
Eli Wallach: American Jews, film actors, artists, Academy Award winners
Ecstasy of Gold: Metallica songs, Ennio Morricone songs, artifacts, soundtrack music
Dollars Trilogy: spaghetti westerns, film trilogies, movies, artifacts
Popularity (m,e):
• freq(m,e|m)
• length(e)
• #links(e)
Similarity (m,e):
• cos/Dice/KL (context(m), context(e))
Coherence (e,e'):
• dist(types)
• overlap(links)
• overlap (anchor words)
KB+Stats
Mention-Entity Graph
weighted undirected graph with two types of nodes
Mentions, in text:
Sergio talked to Ennio about Eli's role in the Ecstasy scene. This sequence on the graveyard was a highlight in Sergio's trilogy of western films.
Candidate entities:
Eli (bible), Eli Wallach, Ecstasy (drug), Ecstasy of Gold, Star Wars, Lord of the Rings, Dollars Trilogy
Links of selected entities:
Eli Wallach: http://.../wiki/Dollars_Trilogy, http://.../wiki/The_Good,_the_Bad, _, http://.../wiki/Clint_Eastwood, http://.../wiki/Honorary_Academy_A
Ecstasy of Gold: http://.../wiki/The_Good,_the_Bad,_t, http://.../wiki/Metallica, http://.../wiki/Bellagio_(casino), http://.../wiki/Ennio_Morricone
Dollars Trilogy: http://.../wiki/Sergio_Leone, http://.../wiki/The_Good,_the_Bad,_, http://.../wiki/For_a_Few_Dollars_M, http://.../wiki/Ennio_Morricone
Popularity (m,e):
• freq(m,e|m)
• length(e)
• #links(e)
Similarity (m,e):
• cos/Dice/KL (context(m), context(e))
Coherence (e,e'):
• dist(types)
• overlap(links)
• overlap (anchor words)
KB+Stats
Mention-Entity Graph
weighted undirected graph with two types of nodes
Mentions, in text:
Sergio talked to Ennio about Eli's role in the Ecstasy scene. This sequence on the graveyard was a highlight in Sergio's trilogy of western films.
Candidate entities:
Eli (bible), Eli Wallach, Ecstasy (drug), Ecstasy of Gold, Star Wars, Lord of the Rings, Dollars Trilogy
Keyphrases (anchor words) of selected entities:
Eli Wallach: The Magnificent Seven; The Good, the Bad, and the Ugly; Clint Eastwood; University of Texas at Austin
Ecstasy of Gold: Metallica on Morricone tribute; Bellagio water fountain show; Yo-Yo Ma; Ennio Morricone composition
Dollars Trilogy: For a Few Dollars More; The Good, the Bad, and the Ugly; Man with No Name trilogy; soundtrack by Ennio Morricone
Popularity (m,e):
• freq(m,e|m)
• length(e)
• #links(e)
Similarity (m,e):
• cos/Dice/KL (context(m), context(e))
Coherence (e,e'):
• dist(types)
• overlap(links)
• overlap (anchor words)
KB+Stats
Joint Mapping
[Figure: mention-entity graph with weighted edges]
• Build mention-entity graph or joint-inference factor graph
from knowledge and statistics in KB
• Compute high-likelihood mapping (ML or MAP) or
dense subgraph such that:
each m is connected to exactly one e (or at most one e)
Coherence Graph Algorithm
[J. Hoffart et al.: EMNLP‘11]
[Figure: mention-entity graph with weighted edges and weighted degrees of entity nodes]
• Compute dense subgraph to
maximize min weighted degree among entity nodes
such that:
each m is connected to exactly one e (or at most one e)
• Greedy approximation:
iteratively remove weakest entity and its edges
• Keep alternative solutions, then use local/randomized search
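The greedy approximation above can be illustrated with a simplified Python sketch. It repeatedly removes the entity node with the smallest weighted degree, unless that entity is the last remaining candidate of some mention. The edge weights below are made-up toy values, the function name `greedy_disambiguate` is hypothetical, and the sketch omits the steps of keeping alternative solutions and running a local or randomized search afterwards. It also assumes that each entity is a candidate for only one mention.

```python
from typing import Dict, FrozenSet, Set, Tuple

MentionEdges = Dict[Tuple[str, str], float]   # (mention, entity) -> weight
EntityEdges = Dict[FrozenSet[str], float]     # {entity, entity} -> coherence weight


def greedy_disambiguate(me: MentionEdges, ee: EntityEdges) -> Dict[str, str]:
    """Remove the weakest entity node until every mention has one candidate left."""
    candidates: Dict[str, Set[str]] = {}
    for mention, entity in me:
        candidates.setdefault(mention, set()).add(entity)
    alive: Set[str] = set().union(*candidates.values())

    def degree(entity: str) -> float:
        # Weighted degree: mention-entity edges plus coherence edges to surviving entities.
        to_mentions = sum(w for (_, e), w in me.items() if e == entity)
        to_entities = sum(w for pair, w in ee.items() if entity in pair and pair <= alive)
        return to_mentions + to_entities

    while True:
        # An entity may be removed only if no mention would lose its last candidate.
        removable = [e for e in alive
                     if all(len(c) > 1 for c in candidates.values() if e in c)]
        if not removable:
            break
        weakest = min(removable, key=lambda e: (degree(e), e))
        alive.discard(weakest)
        for c in candidates.values():
            c.discard(weakest)
    return {m: next(iter(c)) for m, c in candidates.items()}


# Toy weights (assumptions for illustration only, not the numbers from the slide).
ME: MentionEdges = {
    ("Eli", "Eli_(bible)"): 0.3, ("Eli", "Eli_Wallach"): 0.5,
    ("Ecstasy", "Ecstasy_(drug)"): 0.6, ("Ecstasy", "Ecstasy_of_Gold"): 0.4,
    ("trilogy", "Star_Wars_Trilogy"): 0.5, ("trilogy", "Dollars_Trilogy"): 0.3,
}
EE: EntityEdges = {
    frozenset({"Eli_Wallach", "Ecstasy_of_Gold"}): 0.8,
    frozenset({"Eli_Wallach", "Dollars_Trilogy"}): 0.9,
    frozenset({"Ecstasy_of_Gold", "Dollars_Trilogy"}): 0.9,
}
mapping = greedy_disambiguate(ME, EE)
print(mapping)
```

In this toy input, the popular but incoherent candidates (Ecstasy the drug, Star Wars) are removed because their weighted degree is low, even though their mention-entity weights are higher.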
Mention-Entity Popularity Weights
[Milne/Witten 2008, Spitkovsky/Chang 2012]
• Need dictionary with entities‘ names:
• full names: Arnold Alois Schwarzenegger, Los Angeles, Microsoft Corp.
• short names: Arnold, Arnie, Mr. Schwarzenegger, New York, Microsoft, …
• nicknames & aliases: Terminator, City of Angels, Evil Empire, …
• acronyms: LA, UCLA, MS, MSFT
• role names: the Austrian action hero, Californian governor, CEO of MS, …
…
plus gender info (useful for resolving pronouns in context):
Bill and Melinda met at MS. They fell in love and he kissed her.
• Collect hyperlink anchor-text / link-target pairs from
• Wikipedia redirects
• Wikipedia links between articles
• Interwiki links between Wikipedia editions
• Web links pointing to Wikipedia articles
…
• Build statistics to estimate P[entity | name]
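As a rough illustration of the last step, P[entity | name] can be estimated as a relative frequency over collected (anchor text, link target) pairs. The sketch below assumes the pairs have already been harvested; the pair counts and the function name `build_prior` are made up for the example.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def build_prior(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, float]]:
    """Estimate P[entity | name] from (anchor text, link target) pairs."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for name, entity in pairs:
        counts[name][entity] += 1
    return {
        name: {entity: c / sum(ctr.values()) for entity, c in ctr.items()}
        for name, ctr in counts.items()
    }


# Toy anchor-text data (assumed counts, for illustration only).
PAIRS = (
    [("LA", "Los_Angeles")] * 3
    + [("LA", "Louisiana")]
    + [("Arnie", "Arnold_Schwarzenegger")]
)
prior = build_prior(PAIRS)
print(prior["LA"])   # relative frequencies of the link targets for the name "LA"
```

A real dictionary would also normalize names (case, punctuation) and smooth the estimates for rare names; both are left out here.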
Mention-Entity Similarity Edges
Precompute characteristic keyphrases q for each entity e:
anchor texts or noun phrases in e page with high PMI:
$$\mathrm{weight}(q, e) = \log \frac{\mathrm{freq}(q, e)}{\mathrm{freq}(q)\,\mathrm{freq}(e)}$$
Example keyphrase: „Metallica tribute to Ennio Morricone“
Match keyphrase q of candidate e in context of mention m:
$$\mathrm{score}(q \mid e) \sim \frac{\#\,\text{matching words}}{\text{length of cover}(q)} \left( \frac{\sum_{w \in \mathrm{cover}(q)} \mathrm{weight}(w \mid e)}{\sum_{w \in q} \mathrm{weight}(w \mid e)} \right)^{1+\gamma}$$
First factor: extent of partial matches. Second factor: weight of matched words.
Example context: The Ecstasy piece was covered by Metallica on the Morricone tribute album.
Compute overall similarity of context(m) and candidate e:
$$\mathrm{score}(e \mid m) \sim \sum_{q \,\in\, \mathrm{keyphrases}(e)\ \text{in}\ \mathrm{context}(m)} \mathrm{score}(q)\ \mathrm{dist}(\mathrm{cover}(q), m)$$
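The partial-match score for a single keyphrase can be sketched in Python as follows. The cover of q is taken to be the shortest token window in the context that contains all keyphrase words occurring there. The uniform word weights, the default exponent of 2, the simple tokenization, and the function names are assumptions made for this example only.

```python
from typing import Dict, List, Optional, Set, Tuple


def shortest_cover(tokens: List[str], wanted: Set[str]) -> Optional[Tuple[int, int]]:
    """Shortest window (start, end) containing every wanted word that occurs in tokens."""
    present = wanted & set(tokens)
    if not present:
        return None
    best: Optional[Tuple[int, int]] = None
    for start, tok in enumerate(tokens):
        if tok not in present:
            continue
        seen: Set[str] = set()
        for end in range(start, len(tokens)):
            if tokens[end] in present:
                seen.add(tokens[end])
            if seen == present:
                if best is None or end - start < best[1] - best[0]:
                    best = (start, end)
                break
    return best


def keyphrase_score(keyphrase: str, context: str,
                    weight: Dict[str, float], exponent: float = 2.0) -> float:
    """(matching words / cover length) * (matched weight / total weight) ** exponent."""
    q = set(keyphrase.lower().split())
    tokens = [t.strip(".,") for t in context.lower().split()]
    cover = shortest_cover(tokens, q)
    if cover is None:
        return 0.0
    window = tokens[cover[0]:cover[1] + 1]
    matched = q & set(window)
    extent = len(matched) / len(window)
    total = sum(weight.get(w, 1.0) for w in q)      # missing weights default to 1.0
    hit = sum(weight.get(w, 1.0) for w in matched)
    return extent * (hit / total) ** exponent


score = keyphrase_score(
    "Metallica tribute to Ennio Morricone",
    "The Ecstasy piece was covered by Metallica on the Morricone tribute album.",
    weight={},   # uniform weights, an assumption for the example
)
print(round(score, 3))
```

With uniform weights, three of the five keyphrase words match inside a cover of five tokens ("metallica on the morricone tribute"), so the score is 0.6 times 0.6 squared.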
Entity-Entity Coherence Edges
Precompute overlap of incoming links for entities e1 and e2:
$$\text{mw-coh}(e_1, e_2) \sim 1 - \frac{\log \max(|in(e_1)|, |in(e_2)|) - \log |in(e_1) \cap in(e_2)|}{\log |E| - \log \min(|in(e_1)|, |in(e_2)|)}$$
Alternatively compute overlap of anchor texts for e1 and e2:
$$\text{ngram-coh}(e_1, e_2) \sim \frac{|ngrams(e_1) \cap ngrams(e_2)|}{|ngrams(e_1) \cup ngrams(e_2)|}$$
or overlap of keyphrases, or similarity of bag-of-words, or …
Optionally combine with type distance of e1 and e2
(e.g., Jaccard index for type instances)
For special types of e1 and e2 (locations, people, etc.)
use spatial or temporal distance
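Both coherence measures are simple set computations. The Python sketch below takes sets of in-linking pages (or n-grams) as input; returning 0 for disjoint link sets and clamping negative values to 0 are assumptions of this sketch, as are the function names and the toy data.

```python
import math
from typing import Set


def mw_coherence(in1: Set[str], in2: Set[str], num_entities: int) -> float:
    """Link-overlap coherence of two entities, given their sets of in-linking pages."""
    common = in1 & in2
    if not common:
        return 0.0                              # assumption: no shared links means no coherence
    larger = max(len(in1), len(in2))
    smaller = min(len(in1), len(in2))
    denominator = math.log(num_entities) - math.log(smaller)
    if denominator <= 0:
        return 0.0
    value = 1 - (math.log(larger) - math.log(len(common))) / denominator
    return max(value, 0.0)                      # assumption: clamp negative values


def ngram_coherence(ngrams1: Set[str], ngrams2: Set[str]) -> float:
    """Jaccard overlap of the anchor-text n-grams of two entities."""
    union = ngrams1 | ngrams2
    return len(ngrams1 & ngrams2) / len(union) if union else 0.0


# Toy data: 4 in-links each, 2 shared, in a collection of 16 entities.
coh = mw_coherence({"a", "b", "c", "d"}, {"c", "d", "e", "f"}, num_entities=16)
jac = ngram_coherence({"ecstasy", "gold", "morricone"}, {"gold", "morricone", "leone"})
print(coh, jac)
```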
AIDA: Accurate Online Disambiguation
http://www.mpi-inf.mpg.de/yago-naga/aida/
AIDA: Very Difficult Example
http://www.mpi-inf.mpg.de/yago-naga/aida/
Some NED Online Tools
J. Hoffart et al.: EMNLP 2011, VLDB 2011
https://d5gate.ag5.mpi-sb.mpg.de/webaida/
P. Ferragina, U. Scaiella: CIKM 2010
http://tagme.di.unipi.it/
R. Isele, C. Bizer: VLDB 2012
http://spotlight.dbpedia.org/demo/index.html
Reuters Open Calais
http://viewer.opencalais.com/
S. Kulkarni, A. Singh, G. Ramakrishnan, S. Chakrabarti: KDD 2009
http://www.cse.iitb.ac.in/soumen/doc/CSAW/
D. Milne, I. Witten: CIKM 2008
http://wikipedia-miner.cms.waikato.ac.nz/demos/annotate/
perhaps more
some use Stanford NER tagger for detecting mentions
http://nlp.stanford.edu/software/CRF-NER.shtml
NED: Experimental Evaluation
Benchmark:
• Extended CoNLL 2003 dataset: 1400 newswire articles
• originally annotated with mention markup (NER),
now with NED mappings to Yago and Freebase
• difficult texts:
… Australia beats India … → Australian_Cricket_Team
… White House talks to Kremlin … → President_of_the_USA
… EDS made a contract with … → HP_Enterprise_Services
Results:
Best: AIDA method with prior+sim+coh + robustness test
82% precision @100% recall, 87% mean average precision
Comparison to other methods, see paper
J. Hoffart et al.: Robust Disambiguation of Named Entities in Text, EMNLP 2011
http://www.mpi-inf.mpg.de/yago-naga/aida/
Ongoing Research & Remaining Challenges
• More efficient graph algorithms (multicore, etc.)
• Allow mentions of unknown entities, mapped to null
• Leverage deep-parsing structures,
leverage semantic types
Example: Page played Kashmir on his Gibson
subj
obj
mod
• Short and difficult texts:
• tweets, headlines, etc.
• fictional texts: novels, song lyrics, etc.
• incoherent texts
• Structured Web data: tables and lists
• Disambiguation beyond entity names:
• coreferences: pronouns, paraphrases, etc.
• common nouns, verbal phrases (general WSD)
Variants of NED at Web Scale
Tools can map short text onto entities in a few seconds
• How to run this on a big batch of 1 million input texts?
→ partition inputs across distributed machines, organize dictionary appropriately, …
→ exploit cross-document contexts
• How to handle Web-scale inputs (100 million pages) restricted to a set of interesting entities?
(e.g. tracking politicians and companies)
Outline

Opportunities Now

Semantic Search Today

Entity Name Disambiguation
Question Answering
Disambiguation Reloaded
Wrap-Up
Deep Question Answering
William Wilkinson's "An Account of the
Principalities of Wallachia and Moldavia"
inspired this author's most famous novel
This town is known as "Sin City" & its
downtown is "Glitter Gulch"
As of 2010, this is the only
former Yugoslav republic in the EU
99 cents got me a 4-pack of Ytterlig
coasters from this Swedish chain
question
classification &
decomposition
knowledge
back-ends
YAGO
D. Ferrucci et al.: Building Watson. AI Magazine, Fall 2010.
IBM Journal of R&D 56(3/4), 2012: This is Watson.
Semantic Keyword Search
[Ilyas et al. Sigmod‘10]
Need to map (groups of) keywords onto entities & relationships
based on name-entity similarities/probabilities
q: composer Rome scores westerns
Candidate meanings per keyword:
composer: composer (creator of music), Media Composer (video editor)
Rome: Rome (Italy), Rome (NY), Lazio Roma, AS Roma
scores: film music, goal in football
westerns: western movies, western world, Western Digital, Western (airline), Western (NY)
Candidate relationships: … born in …, … plays for …, … used in …, … recorded at …
Natural Language Questions are Natural
translate question into Sparql query:
• dependency parsing to decompose question
• mapping of question units onto entities, classes, relations
Who composed scores for westerns and is from Rome?
map results
into tabular
or visual
presentation
or speech
From Questions to Queries
Dependency parsing exposes structure of question
→ „triploids“ (sub-cues)
NL question:
Who composed scores for westerns and is from Rome?
Triploids:
• Who composed scores
• scores for westerns
• is from Rome
From Triploids to Triples
Who composed scores for westerns and is from Rome?
Who composed scores:
?x composed ?s
?x type composer
?s type music
scores for westerns:
?s contributesTo ?y
?y type westernMovie
Who is from Rome:
?x bornIn Rome
Pattern Dictionary for Relations
[N. Nakashole et al.: EMNLP 2012]
Problem: cope with language diversity & ambiguity
Example: composed …, wrote …, created …, …
WordNet-style dictionary/taxonomy for relational phrases
based on SOL patterns (syntactic-lexical-ontological)
• Relational phrases can be synonymous
“graduated from” ⇔ “obtained degree in * from”
“and $PRP ADJ advisor” ⇔ “under the supervision of”
• One relational phrase can subsume another
“wife of” ⇒ “spouse of”
• Relational phrases are typed
<person> graduated from <university>
<singer> released <album>
<singer> covered <song>
<book> covered <event>
PATTY: Pattern Taxonomy for Relations
[N. Nakashole et al.: EMNLP 2012, demo at VLDB 2012]
350 000 SOL patterns with 4 million instances
Derived from large data (Wikipedia, NYT, ClueWeb)
by scalable sequence mining
accessible at: www.mpi-inf.mpg.de/yago-naga/patty
Disambiguation Mapping for Triploids
Who composed scores for westerns and is from Rome?
Phrases of triploids q1 to q4 with candidate classes (c), entities (e), and relations (r):
Who: c:person, c:musician, e:WHO
composed: r:created, r:wroteComposition, r:wroteSoftware
scores: c:soundtrack
scores for: r:soundtrackFor, r:shootsGoalFor
westerns: c:western movie, e:Western Digital
is from: r:actedIn, r:bornIn
Rome: e:Rome (Italy), e:Lazio Roma
weighted edges (coherence, similarity, etc.)
Combinatorial Optimization by ILP (with type constraints etc.)
Relaxing Overconstrained Queries
Select ?p Where {
?p composed ?s . ?s type music .
?s for ?m . ?m type movie .
?p bornIn Rome . }
Select ?p Where {
?p composed ?s . ?s type music .
?s for ?m . ?m type movie [western] .
?p bornIn Rome . }
Select ?p Where {
?p ?rel1 ?s [composed] . ?s type music .
?s ?rel2 ?m . ?m type movie [western] .
?p bornIn Rome . }
with extended SPARQL-FullText: SPOX quad patterns
(S. Elbassuoni et al.: CIKM‘10, ESWC’11, SIGIR‘12)
Preliminary Results
(M. Yahya et al.:
WWW‘12, EMNLP‘12)
http://www.mpi-inf.mpg.de/yago-naga/deanna/
Outline

Opportunities Now

Semantic Search Today


Entity Name Disambiguation
Question Answering
Disambiguation Reloaded
Wrap-Up
Disambiguation Mapping
[M. Yahya et al.: EMNLP‘12]
Who composed scores for westerns and is from Rome?
Selection: Xi (phrases of triploids q1 to q4)
Assignment: Yij (phrase to candidate)
Joint Mapping: Zkl (pairs of candidates)
Who: c:person, c:musician, e:WHO
composed: r:created, r:wroteComposition, r:wroteSoftware
scores: c:soundtrack
scores for: r:soundtrackFor, r:shootsGoalFor
westerns: c:western movie, e:Western Digital
is from: r:actedIn, r:bornIn
Rome: e:Rome (Italy), e:Lazio Roma
weighted edges (coherence, similarity, etc.)
Disambig. Mapping: Objective Function
Who composed scores for westerns and is from Rome?
[Figure: phrase-to-candidate graph for the question, with selection variables Xi, assignment variables Yij (weights wij), and joint-mapping variables Zkl (weights vkl); weighted edges (coherence, similarity, etc.)]
maximize
$$\sum_{i,j} w_{ij} Y_{ij} + \sum_{k,l} v_{kl} Z_{kl} + \dots$$
subject to:
1) $Y_{ij} \le X_i$ for all i,j
2) $\sum_j Y_{ij} \le 1$ for all i
3) $Z_{kl} \le \sum_i Y_{ik}$ and $Z_{kl} \le \sum_i Y_{il}$ for all k,l
4) $X_i, Y_{ij}, Z_{kl} \in \{0,1\}$
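For a handful of phrases, the effect of this objective can be reproduced by exhaustive enumeration instead of an ILP solver. The Python sketch below picks at most one candidate per phrase (the role of Yij) and adds a coherence weight whenever both endpoints of a candidate pair are selected (the role of Zkl). All weights are invented for illustration, the function name `best_mapping` is hypothetical, and the selection variables Xi and all further constraints are left out.

```python
from itertools import product
from typing import Dict, FrozenSet, Optional, Tuple

# Toy similarity weights w_ij per phrase and candidate (assumed values).
CANDIDATES: Dict[str, Dict[str, float]] = {
    "composed": {"r:created": 0.4, "r:wroteComposition": 0.5, "r:wroteSoftware": 0.5},
    "scores for": {"r:soundtrackFor": 0.5, "r:shootsGoalFor": 0.5},
    "westerns": {"c:western movie": 0.5, "e:Western Digital": 0.5},
}
# Toy coherence weights v_kl between candidates (assumed values).
COHERENCE: Dict[FrozenSet[str], float] = {
    frozenset({"r:wroteComposition", "r:soundtrackFor"}): 0.8,
    frozenset({"r:soundtrackFor", "c:western movie"}): 0.9,
    frozenset({"r:wroteSoftware", "e:Western Digital"}): 0.6,
}


def best_mapping(candidates: Dict[str, Dict[str, float]],
                 coherence: Dict[FrozenSet[str], float]) -> Tuple[Dict[str, str], float]:
    """Enumerate all assignments with at most one candidate per phrase; keep the best."""
    phrases = list(candidates)
    options = [list(candidates[p]) + [None] for p in phrases]   # None: phrase left unmapped
    best: Dict[str, str] = {}
    best_score = float("-inf")
    for choice in product(*options):
        chosen = {c for c in choice if c is not None}
        score = sum(candidates[p][c] for p, c in zip(phrases, choice) if c is not None)
        # A coherence weight counts only if both of its candidates are selected.
        score += sum(v for pair, v in coherence.items() if pair <= chosen)
        if score > best_score:
            best_score = score
            best = {p: c for p, c in zip(phrases, choice) if c is not None}
    return best, best_score


mapping, score = best_mapping(CANDIDATES, COHERENCE)
print(mapping, score)
```

Enumeration grows exponentially with the number of phrases, which is why an ILP solver is the practical choice beyond toy inputs.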
Disambig. Mapping: Constraints
Who composed scores for westerns and is from Rome?
[Figure: phrase-to-candidate graph for the question, with triploid selection variables Qhi, phrase selection variables Xi, assignment variables Yij (weights wij), and joint-mapping variables Zkl (weights vkl); weighted edges (coherence, similarity, etc.)]
maximize
$$\sum_{i,j} w_{ij} Y_{ij} + \sum_{k,l} v_{kl} Z_{kl} + \dots$$
subject to:
5) $Q_{hi} = 1 \Rightarrow \sum_g Q_{hg} = 3$ for all h,i
6) $X_i + X_g \le 1$ for all mutually exclusive i,g
7) $Q_{hi} = 1 \Rightarrow \sum_{g,j} Q_{hg} Y_{gj} = 1$ for relation nodes j
Disambig. Mapping: Type Constraints
Who composed scores for westerns and is from Rome?
[Figure: phrase-to-candidate graph for the question, with triploid selection variables Qhi, phrase selection variables Xi, assignment variables Yij (weights wij), and joint-mapping variables Zkl (weights vkl); weighted edges (coherence, similarity, etc.)]
maximize
$$\sum_{i,j} w_{ij} Y_{ij} + \sum_{k,l} v_{kl} Z_{kl} + \dots$$
subject to:
8) $Y_{ij} = 1$ and j is relation node and $Z_{kj} = 1$ and $Z_{jl} = 1$
$\Rightarrow \mathrm{domain}(j) \in \mathrm{types}(k)$ and $\mathrm{range}(j) \in \mathrm{types}(l)$
ILP optimizers like Gurobi solve this in 1 or 2 seconds
Outline

Opportunities Now

Semantic Search Today

Entity Name Disambiguation

Question Answering

Disambiguation Reloaded
Wrap-Up
Summary
• Web of Data & Knowledge & Text (RDF + Phrases)
Calls for Semantic Search by Entities, Classes & Relations
• Diversity & Ambiguity of Names and Phrases
Calls for Disambiguation Mapping
• Strong Story for Entity Name Disambiguation
• Ongoing Work on Relation Phrase Disambiguation
• Cornerstone of Question Answering with
Natural Language or Advanced Keywords
Great opportunity towards next-generation search
Challenging problems: robustness, scale, dynamics & transfer
Take-Home Message
Solve „Who composed the Ecstasy and other pieces for westerns?“
→ can solve semantic search with natural-language disambiguation