
Using Darwin Core as a Model:
An Ontologically Minimalist Approach to
Publishing Occurrence Data in RDF
Joel Sachs
Formal Models track of the Semantics for
Biodiversity Symposium
TDWG 2013
The first thing I want to communicate:
Semantics != Ontologies
Semantics = Ontologies ?
• Semantics
– Semiotics
– Linguistics
– Psychology
• Ontology
– Philosophy
– Computer Science
Ontologies as a vehicle for semantics
• Ontologies were the first choice for putting the “semantic” in
semantic web.
• But ontologies aren’t the only way to supply semantics.
• Furthermore, ontologies can be a barrier to shared semantics,
in a number of ways.
What’s green?
• Def 1:
What’s green?
• Def 2: Green is the portion of the electromagnetic spectrum
with a wavelength between 520 – 570 nm.
What’s electromagnetic?
What’s a spectrum?
What’s a wavelength?
What’s a nanometer?
Occurrence
• Occurrence_ID
• Latitude
• Longitude
• Scientific Name
• Vernacular Name

Occurrence
• Occurrence_ID
• Location_ID (URI)
• DateTime (DateTime)
• IndividualOrganism_ID (URI)
Location
• Location_ID (URI)
• Latitude (float)
• Longitude (float)
• Datum (URI)
Taxon
• Taxon_ID
• Scientific Name
• Vernacular Name
• Authorship
• Year
• etc.
Identification
• Identification_ID
• Individual_ID (URI)
• Taxon (URI)
• Identified_by (URI)
There are many ways to think about biodiversity data.
Thing #2 that I want to communicate
Darwin Core (as it is) can be used as a lightweight “ontology”.
Don’t try this at home
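As a rough illustration of what using Darwin Core terms directly might look like, here is a minimal sketch of the flat Occurrence record from the earlier slide, written as RDF/XML. The example.org URI, the coordinates, and the species are invented for illustration, and the mapping of the slide's field labels onto dwc:decimalLatitude, dwc:decimalLongitude, dwc:scientificName, and dwc:vernacularName is an assumption, not something the talk specifies.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dwc="http://rs.tdwg.org/dwc/terms/">

  <!-- One occurrence, typed with the Darwin Core Occurrence class.
       The subject URI and all literal values are made-up placeholders. -->
  <dwc:Occurrence rdf:about="http://example.org/occurrence/1">
    <dwc:occurrenceID>http://example.org/occurrence/1</dwc:occurrenceID>
    <dwc:decimalLatitude>39.25</dwc:decimalLatitude>
    <dwc:decimalLongitude>-76.71</dwc:decimalLongitude>
    <dwc:scientificName>Danaus plexippus</dwc:scientificName>
    <dwc:vernacularName>Monarch butterfly</dwc:vernacularName>
  </dwc:Occurrence>

</rdf:RDF>
```

The only namespaces involved are RDF and Darwin Core; no upper ontology is referenced.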
Thing #3
How to minimize the amount of ontology in the Core.
Example: Material Sample
dwctype:MaterialSample (roughly?) corresponds to OBI:Specimen.
<owl:Class rdf:about="http://purl.obolibrary.org/obo/OBI_0100051">
<owl:equivalentClass>
<owl:Class>
<owl:intersectionOf rdf:parseType="Collection">
<rdf:Description rdf:about="http://purl.obolibrary.org/obo/BFO_0000040"/>
<owl:Restriction>
<owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/BFO_0000087"/>
<owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/OBI_0000112"/>
</owl:Restriction>
</owl:intersectionOf>
</owl:Class>
</owl:equivalentClass>
<rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/BFO_0000040"/>
<owl:disjointWith rdf:resource="http://purl.obolibrary.org/obo/BFO_0000141"/>
<n0pred:IAO_0000602>(forall (x) (if (MaterialEntity x) (IndependentContinuant x))) // axiom
label in BFO2 CLIF: [019-002] </n0pred:IAO_0000602>
<n0pred:BFO_0000179>material</n0pred:BFO_0000179>
<n0pred:BFO_0000180>MaterialEntity</n0pred:BFO_0000180>
<n0pred:IAO_0000602>(forall (x) (if (and (Entity x) (exists (y t) (and (MaterialEntity y)
(continuantPartOfAt x y t)))) (MaterialEntity x))) // axiom label in BFO2 CLIF: [021-002]
</n0pred:IAO_0000602>
<n0pred:IAO_0000602>(forall (x) (if (and (Entity x) (exists (y t) (and (MaterialEntity y)
(continuantPartOfAt y x t)))) (MaterialEntity x))) // axiom label in BFO2 CLIF: [020-002]
</n0pred:IAO_0000602>
curl -L -H "Accept: application/rdf+xml" http://rs.tdwg.org/dwc/dwctype/MaterialSample | grep OBI
<rdf:Description rdf:about="http://rs.tdwg.org/dwc/dwctype/MaterialSample">
<rdfs:label xml:lang="en-US">MaterialSample</rdfs:label>
<rdfs:comment xml:lang="en-US">A resource describing the physical results of a sampling (or
subsampling) event. In biological collections, the material sample is typically collected, and either
preserved or destructively processed.</rdfs:comment>
<rdfs:isDefinedBy rdf:resource="http://rs.tdwg.org/dwc/dwctype/"/>
<dcterms:issued>2013-03-28</dcterms:issued>
<dcterms:modified>2013-09-26</dcterms:modified>
<rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
<dcterms:hasVersion rdf:resource="http://rs.tdwg.org/dwc/dwctype/history/#MaterialSample-2013-06-24"/>
<dcam:memberOf rdf:resource="http://rs.tdwg.org/dwc/terms/DwCType"/>
<rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/OBI_0100051"/>
<dwcattributes:status>recommended</dwcattributes:status>
<dwcattributes:decision rdf:resource="http://rs.tdwg.org/dwc/terms/history/decisions/Decision_2013-10-09_12"/>
<dwcattributes:abcdEquivalence>DataSets/DataSet/Units/Unit</dwcattributes:abcdEquivalence>
</rdf:Description>
curl -L -H "Accept: application/rdf+xml" http://rs.tdwg.org/dwc/dwctype/MaterialSample | grep OBI
<rdf:Description rdf:about="http://rs.tdwg.org/dwc/dwctype/MaterialSample">
<rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/OBI_0100051"/>
</rdf:Description>
On the one hand
• Nobody forces consuming applications to ingest the OBI and
BFO ontologies when they ingest Darwin Core.
• So what’s the big deal?
On the other hand
• Many semantic web clients automatically fetch and load
referenced documents.
– Especially if the documents are referenced with important
properties like rdfs:subClassOf
• It’s bad form (and slightly dangerous) to clutter a semantic
web document with terms from unnecessary namespaces.
My suggestion?
• Assertions that tie Core terms to upper ontologies should be asserted in a
separate document.
E.g.
<rdf:Description rdf:about="http://rs.tdwg.org/dwc/dwctype/MaterialSample">
<rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/OBI_0100051"/>
</rdf:Description>
should be asserted in obi.owl, or dwc_obi.owl
• That way, those doing integration that depends on OBI axioms can ingest
the appropriate descriptions.
• Those that don’t need the OBI axioms don’t have to worry about incorrect
inference.
– Keep in mind: There is no preferred upper ontology for science on the
semantic web.
• BFO, DOLCE, SUMO, UMBEL, NULO, etc.
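A minimal sketch of what such a separate bridging document could look like, using the dwc_obi.owl name suggested above. The ontology URI (example.org) and the comment text are placeholders, not published TDWG or OBI resources; the single subclass assertion is the one quoted from the MaterialSample description.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- dwc_obi.owl: hypothetical bridging document, kept apart from the
     Darwin Core term definitions -->
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">

  <!-- Ontology header; this URI is an assumed placeholder -->
  <owl:Ontology rdf:about="http://example.org/dwc_obi.owl">
    <rdfs:comment xml:lang="en">Assertions tying Darwin Core terms to OBI.</rdfs:comment>
  </owl:Ontology>

  <!-- The link to OBI lives here rather than in the Core term description -->
  <rdf:Description rdf:about="http://rs.tdwg.org/dwc/dwctype/MaterialSample">
    <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/OBI_0100051"/>
  </rdf:Description>

</rdf:RDF>
```

A client that wants the OBI axioms loads this file alongside the Core; a client that does not simply never fetches it. A parallel file could carry mappings to a different upper ontology without touching the Core.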
Thank you for paying attention!
Questions, comments, and criticism to
@xjsachs
[email protected]
