SALUS Initial Prototype Presentation

Report
A demo of an initial prototype of the project idea
Mustafa Yuksel & Gokce B. Laleci, SRDC
Motivation

Currently, the clinical research and clinical care domains are largely disconnected because each uses different standards and terminology systems.
    In contrast to the CDISC standards used in the clinical research domain, the most widely used content and messaging standards in the clinical care domain are those of HL7.
    The terminology systems differ as well: MedDRA, WHODD and CDISC Terminology are commonly used in the clinical research domain, while the prominent terminology systems in the clinical care domain include SNOMED CT, LOINC, ICD-9 and ICD-10.
    The available integration efforts are mostly proprietary, custom-developed for a specific use case, and depend on hard-coded n x n mappings among standards.
For example, Electronic Data Capture (EDC) systems are usually not connected to the EHR systems used by health care providers.
    Clinicians have to manually copy the results of therapeutic procedures and examinations from an EHR system into the Case Report Form (CRF), which causes errors and work disruption as well as delays in reporting data.
April 17-18, 2012
SALUS Technical Kickoff Meeting
Visionary Scenario

A new Calcium Channel Blocker is marketed after a successful clinical trial period.
The regulatory body receives Adverse Event reports indicating that this new drug causes swelling of the legs.
The regulatory body decides to conduct a more extended post-market safety study (or asks the pharmaceutical company to do so).
    It prepares the Study Protocol in CDISC SDM.
        Eligibility criteria: patients who have recently been treated with this new Calcium Channel Blocker.
        Collect all of the other symptoms, diagnoses, allergies and medications of each patient in the first visit.
This protocol definition is sent to the health care providers that are in the SALUS clinical research community.
    Patient history documents conforming to the protocol definition, in different schemas such as HL7 CDA and CEN EN 13606, are sent by the hospitals to the regulatory body.
    These patient histories in CDA and 13606 are translated to the BRIDG Model.
    The Form Manager processes the Study Design and identifies the items requested in the CRFs from their annotation in CDASH.
    The patient history in BRIDG is queried through the predefined queries defined for each CDASH variable (they can be used for semi-automatically filling in CRFs).
Visionary Scenario (continued)

After collecting significant data from some patients, the regulatory body prepares the statistical analysis data by semantically querying the collected data represented in the BRIDG Model.
    Number of patients who have experienced edema in the legs (represented through MedDRA term 10014239) and who also have:
        A condition of heart failure (represented through MedDRA term 10019279)
        A condition of primary pulmonary hypertension (represented through MedDRA term 10036727)
        Already been treated with a vasodilating agent (represented through SNOMED CT term 58944007)
    Participating health care providers code observations through ICD-9 and SNOMED CT terms, record adverse events through WHO-ART, and record the medications provided through RxNorm.
After the analysis, it becomes clear that the adverse event incidents are mostly related to the underlying condition or current treatment of the patients…
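The counting step of this scenario can be pictured with a small Python sketch. It is not the SALUS implementation, which queries the BRIDG-based knowledge base semantically; the record layout, the patient identifiers and the helper name `count_with` are assumptions made for illustration. Only the four codes come from the slide.

```python
# Codes quoted on the slide; everything else below is made-up sample data.
EDEMA = ("MedDRA", "10014239")
HEART_FAILURE = ("MedDRA", "10019279")
PULMONARY_HYPERTENSION = ("MedDRA", "10036727")
VASODILATOR = ("SNOMED CT", "58944007")

# Hypothetical, already harmonised records: patient id -> set of (system, code).
patients = {
    "p1": {EDEMA, HEART_FAILURE},
    "p2": {EDEMA, VASODILATOR},
    "p3": {EDEMA, HEART_FAILURE, VASODILATOR},
    "p4": {HEART_FAILURE},
}


def count_with(records, event, cofactor):
    """Number of patients whose record holds both the event and the cofactor."""
    return sum(
        1 for codes in records.values() if event in codes and cofactor in codes
    )


cofactors = {
    "heart failure": HEART_FAILURE,
    "primary pulmonary hypertension": PULMONARY_HYPERTENSION,
    "vasodilating agent": VASODILATOR,
}

# Patients with edema, and how many of them also show each cofactor.
edema_total = sum(1 for codes in patients.values() if EDEMA in codes)
summary = {
    name: count_with(patients, EDEMA, code) for name, code in cofactors.items()
}

print(edema_total, summary)
```

The sketch matches codes exactly. In the scenario the sites code their data in ICD-9, SNOMED CT, WHO-ART and RxNorm, so a real analysis would first need the terminology mappings that the framework provides.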
Exploiting the Initial SALUS
Semantic Framework

We have envisioned two use cases:
1. automatically fill in eCRFs
2. facilitate safety studies on EHR systems
The Components of the initial
Demo

i. the BRIDG DAM ontology expressed in RDF as the core ontology hosted in a knowledge base
    We have developed the RDF representation of the BRIDG DAM v3.0.3 to be used as the core ontology to make the common shared semantics available in a formal, machine-processable form.
ii. tools for semantic lifting of the content standards harmonized by the BRIDG initiative, including HL7 RIM based models and CDISC ODM based models, and for aligning these semantic models with the core ontology in the knowledge base
iii. tools for importing semantic representations of the terminology systems and biomedical ontologies as well as aligning these models with the core ontology
iv. tools to import clinical documents/messages to the SALUS knowledge base by automatically translating them to instances of the core ontology
v. a library of SPARQL queries to retrieve clinical data corresponding to the CDASH data sets from the knowledge base
vi. tools for semantically mediating the documents/messages represented in different clinical research and care standards to one another
(A) BRIDG DAM as the common
“model of meaning”

The BRIDG DAM is an implementation-independent UML model that represents the common shared semantics of regulated clinical research studies, which may have different implementations.
    In 2003, CDISC and HL7 signed a 2-year-old Memorandum of Understanding (MoU) to work collaboratively on the data exchange standards in domains that are of interest to both organizations and to create a Domain Analysis Model (DAM) as an implementation-independent model of the shared semantics.
    Later, NCI (through its caBIG project) and FDA joined the group.
    A reverse engineering effort to create the DAM was initiated:
        From the already existing HL7 RCRIM messages
        From the CDISC CDASH and SDTM data sets and ODM models
        Implementation-independent UML model
BRIDG DAM is composed of five sub-domains:
    Protocol Representation, Study Conduct, Adverse Event, Regulatory and Common
    CDISC SDM and HL7 Study Design RMIM are both implementations of the Protocol Representation sub-domain
Hence, it is the best alternative as the starting point for the core of our Semantic Framework.
Sample UML Model from BRIDG
Study Conduct Sub Domain
[Figure: BRIDG UML class diagram, "class View CM: Comm..."]
Legend: Adverse Event Sub-Domain, Common Sub-Domain, Protocol Representation Sub-Domain, Regulatory Sub-Domain, Study Conduct Sub-Domain
View Description: The Common sub-domain represents the semantics that are common to all (or most) of the other sub-domains. For example, it includes semantics for such things as people, organizations, places and materials.
Classes shown include BiologicEntity, Person, Animal, Subject, StudySubject, Organization, ResearchOrganization, ResearchStaff, HealthcareProvider, HealthcareFacility, StudySite, Place, Material, Product, Drug, Biologic, Device, Document, DocumentVersion and RegulatoryAuthority, together with their attributes (for example, BiologicEntity has name: DSET<EN>, administrativeGenderCode: CD and birthDate: TS.DATETIME) and their associations (for example, "is a function performed by" / {functions as}).
Sample UML Model from BRIDG
Study Conduct Sub Domain
Creating BRIDG Ontology

We have created a complete RDF representation of the latest BRIDG DAM (v3.0.3)
    UML -> XMI -> XSD -> RDF conversion
    Utilization of several tools (Enterprise Architect, Visual Paradigm, TopBraid Composer)
    Manual fine-tuning
    It was quite an effort…
In the end, the RDF representation of the BRIDG DAM is the core of the initial SALUS Semantic Framework, which we call the SALUS core ontology
    Note that the SALUS core ontology is living and expanding in nature
BRIDG Ontology
(B) Mapping Different Content Models to
BRIDG DAM Ontology (Common Ontology)

Medical summaries are available as XML files
    Their schemas are provided through XSD
First, we need to create semantic models of these content models
    XSD2RDF normalization tools can be used
    We created RDF models of HL7 CDA and CEN 13606
Then these semantic models of the content models need to be mapped to the Common Ontology
    So that the mapping definitions can be used to translate medical summary instances into individuals of the SALUS Common Ontology
(B) Mapping CCD “Past Medical History” section to
“PerformedMedicalConditionResult” class in
BRIDG
SPINMap Formalism

SPINMap
    SPARQL-based language to represent mappings between RDF/OWL ontologies
    Mappings can be used to transform instances of source classes into instances of target classes
Mainly uses SPARQL CONSTRUCT
    Particularly useful to define rules that map from one graph pattern (in the WHERE clause) to another graph pattern
Based on SPIN (SPARQL Inferencing Notation)
    W3C Submission
    Makes it easy to associate mapping rules with classes; SPIN templates and functions can be exploited to define reusable building blocks for typical modeling patterns
    Provides a vocabulary: a collection of properties and classes that can be used to link RDFS and OWL classes with SPARQL queries
        The class ex:Rectangle can define a property spin:rule that points to a SPARQL CONSTRUCT query that computes the value of ex:area based on the values of ex:width and ex:height
        The property spin:constraint may link the class ex:Square with a SPARQL ASK query that verifies that the width and height values are equal
SPINMap vocabulary (http://spinrdf.org/spinmap)
    A collection of reusable design patterns that reflects typical best practices in ontology mapping
    Can be executed in conjunction with other SPARQL rules with any SPIN engine
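The ex:Rectangle and ex:Square examples can be mimicked in plain Python to show what "attaching a rule or a constraint to a class" amounts to. This is only an analogy under stated assumptions: it uses no SPIN engine and no SPARQL, and the registries `RULES` and `CONSTRAINTS` as well as the helpers `infer` and `violated` are hypothetical names.

```python
# Rules and constraints are registered per class name, loosely mirroring how
# spin:rule and spin:constraint attach SPARQL queries to classes.
RULES = {
    # Derives "area" from "width" and "height", like the CONSTRUCT rule.
    "Rectangle": [lambda r: {"area": r["width"] * r["height"]}],
}
CONSTRAINTS = {
    # Checks that width and height are equal, like the ASK constraint.
    "Square": [lambda r: r["width"] == r["height"]],
}


def infer(resource):
    """Return a copy of the resource extended with the values derived by the
    rules attached to each of its classes."""
    derived = dict(resource)
    for cls in resource["types"]:
        for rule in RULES.get(cls, []):
            derived.update(rule(resource))
    return derived


def violated(resource):
    """Return the class names whose constraint checks fail for this resource."""
    return [
        cls
        for cls in resource["types"]
        for check in CONSTRAINTS.get(cls, [])
        if not check(resource)
    ]


rect = infer({"types": ["Rectangle"], "width": 3, "height": 4})
bad_square = {"types": ["Square"], "width": 2, "height": 3}

print(rect["area"], violated(bad_square))
```

The point of the design is that the logic lives with the class definition, so any instance of that class picks it up without a separate mapping program.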
SPINMap vocabulary

Context:
    Groups together multiple mappings so that they have a shared target resolution algorithm
        The source class of the mapping
        The target class of the mapping
        The expression that delivers the target of the mapping. This expression can reference the variable ?source for the source resource, and the variable ?targetClass for the type of the target
            Usually expressed through a TargetFunction
TargetFunction
    Class of SPIN functions used to get the target resource of a mapping
    Conditional Construct Statements…
SPIN Rules
    Bound to classes and contexts
        To map the datatype/object properties of the source-target classes
        Can make use of SPIN functions
    Can make use of the results of the mappings defined through other contexts
Sample Mapping
[Figure: sample mapping from the HL7 CDA RecordTarget (source) to the BRIDG StudySubject (target)]

Source classes:
    RecordTarget: hasPatientRole -> PatientRole
    PatientRole: hasPatient -> Patient
    Patient: hasRaceCode [CE], hasBirthTime [TS], hasAdministrativeGenderCode [CE]
    CE: hasCodeSystem [UID], hasCode [CS], hasCodeSystemName [string]
    CS, UID, TS: dtype:Value
Target classes:
    StudySubject: performingBiologicalEntity -> Person
    Person: raceCode, birthDate, administrativeGenderCode
    CD: codeSystem, code, codeSystemName
    Code, Uid: dtype:Value
    TS: value -> Class1
    Class1: dtype:value
Mapping contexts:
    RecordTarget-StudySubject
        performingBiologicalEntity = targetResource(RecordTarget, RecordTarget-Person)
    RecordTarget-Person
        raceCode = targetResource(RecordTarget, RecordTarget-CD-1)
        administrativeGenderCode = targetResource(RecordTarget, RecordTarget-CD-2)
        birthDate = targetResource(RecordTarget, RecordTarget-TS)
    RecordTarget-CD-1
        code = targetResource(RecordTarget, RecordTarget-Code)
        codeSystem = targetResource(RecordTarget, RecordTarget-Uid)
        codeSystemName = copy(RecordTarget.hasPatientRole.hasPatient.hasRaceCode.hasCodeSystemName)
    RecordTarget-Code
        dtype:Value = copy(RecordTarget.hasPatientRole.hasPatient.hasRaceCode.hasCode.dtype:value)
    RecordTarget-Uid
        dtype:Value = copy(RecordTarget.hasPatientRole.hasPatient.hasRaceCode.hasCodeSystem.dtype:value)
    RecordTarget-TS
        value = targetResource(RecordTarget, RecordTarget-Class1)
    RecordTarget-Class1
        dtype:Value = copy(RecordTarget.hasPatientRole.hasPatient.hasBirthTime.dtype:value)
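The sample mapping above can be traced with a minimal Python sketch that uses nested dictionaries in place of RDF resources. It only illustrates how the copy(...) paths and the nested target resources fit together; it is not SPINMap. The function names (`copy_path`, `map_cd`, `map_person`, `map_study_subject`), the helper `ce` and all sample values (codes, OIDs, birth time) are assumptions.

```python
def copy_path(source, path):
    """Follow a dotted property path through nested dicts (the copy(...) step)."""
    node = source
    for step in path.split("."):
        node = node[step]
    return node


PATIENT = "hasPatientRole.hasPatient."


def map_cd(record_target, code_property):
    """Build a target CD from a source CE (contexts RecordTarget-CD-*, -Code, -Uid)."""
    base = PATIENT + code_property
    return {
        "code": {"dtype:value": copy_path(record_target, base + ".hasCode.dtype:value")},
        "codeSystem": {"dtype:value": copy_path(record_target, base + ".hasCodeSystem.dtype:value")},
        "codeSystemName": copy_path(record_target, base + ".hasCodeSystemName"),
    }


def map_person(record_target):
    """Context RecordTarget-Person: race, gender and birth date of the patient."""
    birth = copy_path(record_target, PATIENT + "hasBirthTime.dtype:value")
    return {
        "raceCode": map_cd(record_target, "hasRaceCode"),
        "administrativeGenderCode": map_cd(record_target, "hasAdministrativeGenderCode"),
        "birthDate": {"value": {"dtype:value": birth}},
    }


def map_study_subject(record_target):
    """Context RecordTarget-StudySubject."""
    return {"performingBiologicalEntity": map_person(record_target)}


def ce(code, system, name):
    """Helper that builds a source CE value (sample data only)."""
    return {
        "hasCode": {"dtype:value": code},
        "hasCodeSystem": {"dtype:value": system},
        "hasCodeSystemName": name,
    }


# Made-up source instance; codes and OIDs are placeholders, not real values.
record_target = {"hasPatientRole": {"hasPatient": {
    "hasRaceCode": ce("R1", "1.2.3.1", "Example race codes"),
    "hasAdministrativeGenderCode": ce("G1", "1.2.3.2", "Example gender codes"),
    "hasBirthTime": {"dtype:value": "19700101"},
}}}

study_subject = map_study_subject(record_target)
print(study_subject["performingBiologicalEntity"]["raceCode"])
```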
(D) Clinical data instance translation
procedure
(D & F) Importing & Exporting
Clinical Documents
[Figure: two-step pipeline for importing and exporting clinical documents]

1. Defining the Mapping
    Ontology Mapping Definition between the HL7 Study Design RMIM as an ontology (source ontology) and the BRIDG DAM ontology (target ontology)
    Ontology Mapping Definition between the CDISC Study Design ODM as an ontology (source ontology) and the BRIDG DAM ontology (target ontology)
    The mapping definitions are expressed in SPIN Map (SPARQL queries attached to classes)
2. Instance Translation
    First row: HL7 Study Design XSD instance (native XML conformant to the HL7 Study Design RMIM), HL7 Study Design ontology instance (source ontology instance: Study Design in the HL7 Study Design ontology), Ontology Mapping Engine (SPIN engine), BRIDG Study Design DAM ontology instance
    Second row (boxes as labelled in the figure): CEN 13606 XSD instance, CDISC Study Design ontology instance (target ontology instance: Study Design in the CDISC SDM ontology), Ontology Mapping Engine (SPIN engine), BRIDG Study Design DAM ontology instance, Study Design instance (native XML conformant to the CDISC SDM ODM)
(E) Aligning the standards harmonized by BRIDG
(Data Sets) with the SALUS Core Ontology

Clinical Data Acquisition Standards Harmonization (CDASH)
    A link between the study data collected through eCRFs and the study data submitted to the regulatory bodies as SDTM datasets
    A limited set of structured data used for any clinical trial, regardless of research sponsor or therapy area
16 domains, including:
    Adverse Events (problems)
    Medications (prior and concomitant)
    Demographics and subject characteristics
    Medical History
    Vitals / Physical Exam
    ECG test results
    Lab results
Sites have always been asked to complete non-standard CRFs while patients are performing daily assessments, and CRFs are expected to be completed on time and accurately by the site
    The variety of CRF questions and layouts is almost unlimited
The current 16 CDASH CRFs are associated with standard SDTM mappings and standard CDISC controlled terminology
    eCRF design time is shortened, as CDASH eCRFs can be pulled out of the EDC library as and when they are needed
    Standard CDASH CRFs can be transformed to standard SDTM datasets using standard extract, transform, load (ETL) code
CDASH Data set example
How CDASH Variables can be used
within ODM messages
(E) Aligning the standards harmonized by
BRIDG with the SALUS Core Ontology

In the first case, the mappings between vocabularies termed “data sets” (as in the case of CDASH variables) and the BRIDG-based core ontology are addressed
    This is quite straightforward, since it is possible to write SPARQL queries on top of the BRIDG DAM to retrieve the requested CDASH variable
    We have developed a library of sample SPARQL queries to extract several CDASH variables
April 17-18, 2012
SALUS Technical Kickoff Meeting
24
An example SPARQL to collect fields in
Medical History Data set in CDASH
PREFIX sp: <http://spinrdf.org/sp#>
PREFIX fn: <http://www.w3.org/2005/xpath-functions#>
PREFIX bridg: <http://bridgmodel.org/dam/3.0.3#>
PREFIX bfn: <http://www.salus.eu/bridg-functions#>

SELECT ?MHONGO ?MHSTDAT ?MHENDAT ?MHTERM ?MHTERM_CD ?MHTERM_CS ?MHTERM_CS_NAME
WHERE {
    ?p a bridg:PerformedMedicalConditionResult .
    OPTIONAL {
        ?p bridg:medicalHistoryIndicator ?mhi .
        ?mhi bridg:value ?MHONGO .
    }
    OPTIONAL {
        ?p bridg:occurrenceDateRange ?odr .
        ?odr bridg:low ?odrlow .
        BIND (bfn:getTSValue(?odrlow) AS ?lowval) .
        ?odr bridg:high ?odrhigh .
        BIND (bfn:getTSValue(?odrhigh) AS ?MHENDAT) .
        ?odr bridg:value ?odrval .
        BIND (bfn:getTSValue(?odrval) AS ?midval) .
        # SPARQL 1.1 does not allow BIND to reassign a variable that is already
        # in use, so the low bound goes to ?lowval first; ?midval is the
        # fallback start date when neither bound of the range is available.
        BIND (IF(!bound(?lowval) && !bound(?MHENDAT), ?midval, ?lowval) AS ?MHSTDAT) .
    }
    OPTIONAL {
        ?p bridg:value ?val .
        BIND (bfn:getCDCode(?val) AS ?MHTERM_CD) .
        BIND (bfn:getCDDisplayName(?val) AS ?MHTERM) .
        BIND (bfn:getCDCodeSystem(?val) AS ?MHTERM_CS) .
        BIND (bfn:getCDCodeSystemName(?val) AS ?MHTERM_CS_NAME) .
    }
}
How and Where these SPARQLs
can be exploited

The Study Design Model is represented in CDISC ODM, where it is also annotated with CDASH variables to specify the data to be collected through CRFs
The Medical Summaries are collected through SALUS
    They are mapped to SALUS Common Ontology instances
The EDC system can automatically parse the Study Design Model annotated with CDASH variables
    It can then query the knowledge base, which already contains the medical history of the patient in the common ontology
    This is achieved using the predefined SPARQL queries for CDASH variables
This eliminates static XSLT-based mappings between Medical Histories and CDASH-annotated ODM messages representing CRFs (as proposed by IHE CRD)…
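The lookup described above, from a CDASH annotation to its predefined query, can be sketched as follows. This is a simplified stand-in under stated assumptions: plain Python functions play the role of the SPARQL queries, a list of dictionaries plays the role of the knowledge base, and the names `CDASH_QUERIES` and `prefill` are hypothetical. The variable names MHTERM, MHSTDAT, MHENDAT and MHONGO are taken from the Medical History query shown earlier in the deck.

```python
# Hypothetical in-memory stand-in for the knowledge base: one dict per
# medical-history entry of a patient (sample values only).
knowledge_base = [
    {"term": "Heart failure", "start": "2010-03-01", "ongoing": True},
]

# One predefined "query" per CDASH variable (plain functions here).
CDASH_QUERIES = {
    "MHTERM": lambda row: row.get("term"),
    "MHSTDAT": lambda row: row.get("start"),
    "MHENDAT": lambda row: row.get("end"),
    "MHONGO": lambda row: row.get("ongoing"),
}


def prefill(annotated_items, rows):
    """Return one pre-filled CRF record per knowledge-base row.

    annotated_items lists the CDASH variables found in the study design.
    Variables without a predefined query stay None for manual entry.
    """
    records = []
    for row in rows:
        record = {}
        for variable in annotated_items:
            query = CDASH_QUERIES.get(variable)
            record[variable] = query(row) if query else None
        records.append(record)
    return records


form = prefill(["MHTERM", "MHSTDAT", "MHENDAT", "UNMAPPED"], knowledge_base)
print(form)
```

Because the form items are resolved by variable name at run time, adding a new CDASH variable means adding one query to the library rather than editing a document-to-document transformation.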
(D) Exploiting terminology systems within
the SALUS Semantic Framework

Imported the following terminology systems from BioPortal into the SALUS Knowledge Base
    ICD-9: 21,669 terms
    ICD-10: 12,318 terms
    WHO-ART: 1,724 terms
    MedDRA: 69,389 terms
    National Drug File (NDFRT): 40,104 terms
    SNOMED CT: Clinical findings (97,139 terms) + Pharmaceuticals / biologic products (17,100 terms)
    RxNorm: 194,176 terms
    Human Disease Ontology (DOID): 8,574 terms
        It has references to other ontologies such as ICD and SNOMED CT through the DbXref property to indicate equivalences
        Those are processed to create additional Mapping Definitions
    And 133,825 unique code mappings
Not very straightforward
    Usually it is not possible to download the full ontology through a single REST service call due to timeouts
        The class names in an ontology are collected
        These classes are retrieved from BioPortal separately (100 classes each time)
        Then these sub-ontologies are merged
    Some of the class UIDs were incorrect (for ICD); they were corrected manually
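The batch-and-merge workaround above can be sketched in a few lines of Python. The sketch makes no claim about the BioPortal REST API: the callable `fetch_classes` is a hypothetical placeholder for whatever request retrieves one batch of classes, and `fake_fetch` only simulates it so that the example runs offline.

```python
def chunks(items, size=100):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def download_ontology(class_names, fetch_classes, size=100):
    """Fetch the classes batch by batch and merge the partial results.

    fetch_classes takes a list of class names and returns a dict that maps
    each name to its class description (a sub-ontology).
    """
    merged = {}
    for batch in chunks(class_names, size):
        merged.update(fetch_classes(batch))
    return merged


# Offline stand-in for the remote service; it records the batch sizes it saw.
calls = []


def fake_fetch(batch):
    calls.append(len(batch))
    return {name: {"label": name} for name in batch}


names = [f"C{i}" for i in range(250)]
ontology = download_ontology(names, fake_fetch)
print(calls, len(ontology))
```

Keeping the batch size a parameter makes it easy to lower it if a service still times out at 100 classes per request.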
(D) Aligning the Common Ontology
with Terminology Ontologies


To be able to automatically map the clinical data using different terminology systems
to one another, it is necessary to link the coded terms in SALUS core ontology
instances representing clinical data collected from participating sites with the
SALUS terminology ontology resources, and to utilize terminology reasoning
while querying the collected clinical data.
Two heuristics that we have adopted on top of the BioPortal ontologies:

We automatically create instances of the BioPortal ontology classes and copy all non-RDFS and non-OWL properties from the class definitions to the instances, to avoid OWL Full ontologies
Within a term present in a terminology ontology retrieved from BioPortal, the original terminology system name is given only implicitly, in the full URL of the term

However, we need to be able to get the encapsulating terminology system of any term immediately
Therefore, we automatically run a SPARQL rule to add a “skos:inScheme” property to each instance in the terminology ontologies that we retrieve from BioPortal.
We maintain an upper ontology (SALUS Terminology Upper Ontology), in which the major
terminology systems used in our system are represented as the individuals of
“skos:ConceptScheme” class.
This way, we are able to execute a SPARQL rule to automatically bind a “CD” instance (a coded value) in the BRIDG model to the corresponding BioPortal ontology instance via the “salus:terminologyRef” property
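
The two heuristics and the binding step can be illustrated with a small sketch. It assumes the URL layout `http://purl.bioontology.org/ontology/<SCHEME>/<code>` for classes and `<SCHEME>#Ins_<code>` for instances, as seen in the deck's example; the function names and the in-memory index are hypothetical and only mimic what the SPARQL rules do in the knowledge base.

```python
from typing import Dict, List, Optional, Tuple

BIOPORTAL_PREFIX = "http://purl.bioontology.org/ontology/"

# OID of each concept scheme (salus:oid); only the SNOMED CT value appears in the slides.
SCHEME_OIDS: Dict[str, str] = {"SNOMEDCT": "2.16.840.1.113883.6.96"}


def scheme_of(class_url: str) -> str:
    """Return the terminology system name embedded in a BioPortal class URL."""
    if not class_url.startswith(BIOPORTAL_PREFIX):
        raise ValueError(f"not a BioPortal term URL: {class_url}")
    return class_url[len(BIOPORTAL_PREFIX):].split("/", 1)[0]


def instance_url(class_url: str) -> str:
    """Build the URL of the instance created for a class (assumed '#Ins_' naming)."""
    code = class_url.rsplit("/", 1)[1]
    return f"{BIOPORTAL_PREFIX}{scheme_of(class_url)}#Ins_{code}"


def build_index(class_urls: List[str]) -> Dict[Tuple[str, str], str]:
    """Index instances by (scheme OID, notation), mirroring skos:inScheme and skos:notation."""
    index: Dict[Tuple[str, str], str] = {}
    for url in class_urls:
        oid = SCHEME_OIDS[scheme_of(url)]
        notation = url.rsplit("/", 1)[1]
        index[(oid, notation)] = instance_url(url)
    return index


def terminology_ref(index: Dict[Tuple[str, str], str],
                    code_system: str, code: str) -> Optional[str]:
    """Resolve a CD (codeSystem OID plus code) to its terminology instance, if any."""
    return index.get((code_system, code))


index = build_index(["http://purl.bioontology.org/ontology/SNOMEDCT/102574007"])
print(terminology_ref(index, "2.16.840.1.113883.6.96", "102574007"))
```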
Rule attached to the CD class:

CONSTRUCT {
    ?this salus:terminologyRef ?codeRef .
}
WHERE {
    ?this p3.0:code ?code .
    ?code dtype:value ?codeValue .
    ?this p3.0:codeSystem ?codeSystem .
    ?codeSystem dtype:value ?codeSystemRef .
    BIND (str(?codeSystemRef) AS ?csr) .
    ?codeOIDRef salus:oid ?csr .
    ?codeRef skos:inScheme ?codeOIDRef .
    BIND (str(?codeValue) AS ?cv) .
    ?codeRef skos:notation ?cv .
}

[Figure: the rule applied to an example coded value.
Part A: A part of the SALUS core ontology based on BRIDG DAM. A PerformedMedicalConditionResult has a value of type CD, whose code has dtype:value 102574007 and whose codeSystem has dtype:value 2.16.840.1.113883.6.96.
Part B: A part of the SNOMED CT ontology from BioPortal. salus:SNOMEDCT (salus:oid 2.16.840.1.113883.6.96) is an individual of skos:ConceptScheme, alongside salus:MedDRA, salus:ICD9 and salus:LOINC. The class <http://purl.bioontology.org/ontology/SNOMEDCT/102574007> is an rdfs:subClassOf <http://purl.bioontology.org/ontology/SNOMEDCT/102572006>. Its instance <http://purl.bioontology.org/ontology/SNOMEDCT#Ins_102574007> has skos:inScheme salus:SNOMEDCT and skos:notation 102574007, and the CD is bound to this instance via salus:terminologyRef.]
Exploiting the Initial SALUS
Semantic Framework

We have envisioned two use cases:
1. Automatically fill in eCRFs
2. Facilitate safety studies on EHR systems
The Knowledge Base



All the semantic artifacts are hosted in a knowledge base
The main consideration for the choice of the SALUS knowledge base is its
performance, which is related directly to the complexity of the reasoning
process
Our reasoning requirements:

Subsumption reasoning: Crucial to deduce matching coded terms that are aligned with different terminology ontology class instances, which in fact have the same ancestor in the terminology ontology
Example: “Acute heart failure” is a child of “heart failure” in SNOMED CT

Reasoning on equivalence of classes: In SALUS, the mappings of the terms in different terminology ontology classes to each other are represented through the “owl:equivalentClass” property. We should be able to classify individuals of a class also as the individuals of its equivalent classes.
Example: Both MedDRA:10019279 and SNOMEDCT:84114007 mean “heart failure”

Reasoning on transitivity of properties: The “owl:equivalentClass” property is inherently a transitive property. It should be possible for us to process transitive equivalences, in order to classify individuals of a class also as the individuals of its equivalent classes that are deduced to be equivalent through transitivity.
When we calculate the transitive closure of the 133,825 unique code mappings that we retrieved from BioPortal, the number of mappings increases to 186,712
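
To illustrate why the closure contains more mappings than the input, here is a minimal union-find sketch. It is not the procedure used in SALUS (the deck states that Protégé with FaCT++ computed the closure). The first sample pair is the one quoted above; `EX:HF` is a hypothetical third term added for illustration.

```python
from itertools import combinations
from typing import Dict, Iterable, List, Set, Tuple

Pair = Tuple[str, str]


def equivalence_classes(mappings: Iterable[Pair]) -> List[Set[str]]:
    """Group terms into sets of mutually equivalent terms using union-find."""
    parent: Dict[str, str] = {}

    def find(term: str) -> str:
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]  # path halving
            term = parent[term]
        return term

    for left, right in mappings:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_left] = root_right

    groups: Dict[str, Set[str]] = {}
    for term in list(parent):
        groups.setdefault(find(term), set()).add(term)
    return list(groups.values())


def transitive_closure(mappings: Iterable[Pair]) -> Set[Pair]:
    """Return every unordered pair of terms that is equivalent directly or transitively."""
    closure: Set[Pair] = set()
    for group in equivalence_classes(mappings):
        closure.update(combinations(sorted(group), 2))
    return closure


sample = [
    ("MedDRA:10019279", "SNOMEDCT:84114007"),  # pair quoted in the slides
    ("SNOMEDCT:84114007", "EX:HF"),            # hypothetical extra mapping
]
closure = transitive_closure(sample)
print(len(sample), "input mappings ->", len(closure), "mappings after closure")
```

Two input mappings that share a term yield three pairs after closure; the same effect, at scale, explains the growth reported for the BioPortal mappings.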
The Knowledge Base


RDF and OWL-DL reasoners support all of our reasoning requirements and much more.
However, due to the very large number of triples (around 4.7 million) to be reasoned on in the SALUS knowledge base, we have chosen Virtuoso.

Virtuoso supports a limited reasoning capability when compared to other RDF and OWL-DL reasoners; however, the limited set of constructs supported includes rdfs:subClassOf, rdfs:subPropertyOf, owl:sameAs, owl:TransitiveProperty and owl:equivalentClass, which fully address the SALUS Framework reasoning requirements.
In addition, we use Protégé with the FaCT++ reasoner, only for calculating the transitive closure via the “owl:equivalentClass” property
It was not possible to run DL reasoning with other reasoners (Jena, OWLIM, FaCT++, Pellet, HermiT) once the BioPortal ontologies were loaded
Q1: All patients with history of “Edema of Legs”

define input:inference "salus5"
prefix bridg: <http://bridgmodel.org/dam/3.0.3#>
prefix salus: <http://www.salus.eu/ontology/clinical#>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix dtype: <http://www.linkedmodel.org/schema/dtype#>
prefix skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?subject ?subjectBirthDate ?ProblemCodeValue ?ProblemcodeSystemName ?ProblemDisplayName ?StartingDate ?EndDate ?ProblemDate
WHERE {
    # The only condition: the coded term is of type MedDRA 10014239
    ?terminologyCode rdf:type <http://purl.bioontology.org/ontology/MDR/10014239> .
    ?CodedValue salus:terminologyRef ?terminologyCode .
    ?performedObservationResult bridg:value ?CodedValue .

    # The rest are for binding values to variables in the result set
    ?terminologyCode rdfs:label ?ProblemDisplayName .
    ?performedObservation bridg:resulted ?performedObservationResult .
    ?performedObservation bridg:involvedSubject ?subject .
    ?subject bridg:performingBiologicEntity ?performingBiologicalEntity .
    ?performingBiologicalEntity bridg:birthDate ?birthdate .
    ?birthdate bridg:value ?birthdatevalue .
    ?birthdatevalue dtype:value ?subjectBirthDate .
    ?CodedValue bridg:code ?ProblemCode .
    ?ProblemCode dtype:value ?ProblemCodeValue .
    ?CodedValue bridg:codeSystemName ?ProblemcodeSystemName .
    ?performedObservationResult bridg:occurrenceDateRange ?dateRange .

    # Optional date values, placed after the required patterns that bind ?dateRange
    OPTIONAL { ?dateRange bridg:value ?datevalue . }
    OPTIONAL { ?datevalue bridg:value ?ProblemDate . }
    OPTIONAL { ?dateRange bridg:high ?high . }
    OPTIONAL { ?high bridg:value ?EndDate . }
    OPTIONAL { ?dateRange bridg:low ?low . }
    OPTIONAL { ?low bridg:value ?StartingDate . }
}
Available Sample Patient
Documents in the Knowledge Base
[Table: example patient summaries 1 to 10 (the last annotated “(13606)”), each marked with an X against the codes it contains. Columns:
edema of ankle (SNOMED CT 26237000)
edema of foot (SNOMED CT 102576009)
edema of leg (SNOMED CT 102574007)
edema (WHO-ART 401)
heart failure (ICD 428)
heart failure unspecified (ICD 428.9)
heart failure (SNOMED CT 84114007)
acute heart failure (SNOMED CT 56675007)
chronic heart failure (SNOMED CT 48447003)
heart failure (WHO-ART 496)
primary pulmonary hypertension (SNOMED CT 26174007)
pph (ICD-9 416)
Dipyridamol 50MG TAB (RxNorm 197622)]
None of the medical histories are coded with MedDRA Term:10014239
1. Terminology system ontologies and mappings are downloaded from BioPortal
2. Instances are created to avoid OWL Full reasoning
3. After the Medical Histories are uploaded into the SALUS Common Ontology, the salus:terminologyRef references are added through the rule attached to the CD class
4. Equivalence, subsumption and transitivity reasoning, supported by Virtuoso, is applied
5. The query is evaluated:

SELECT ?ProblemDisplayName WHERE {
    ?terminologyCode <http://www.w3.org/2000/01/rdf-schema#label> ?ProblemDisplayName .
    ?performedObservationResult bridg:value ?CodedValue .
    ?CodedValue salus:terminologyRef ?terminologyCode .
    ?terminologyCode rdf:type <http://purl.bioontology.org/ontology/MDR/10014239> .
}

[Figure: how the medical histories are matched to MedDRA:10014239 (Edema of legs).
equivalentClass links connect MedDRA:10014239 (Edema of legs), MedDRA:10030105 (Oedema legs), WHOART:0401 (Edema) and SNOMEDCT:102574007 (Edema of leg).
SNOMEDCT:26237000 (Edema of ankle) and SNOMEDCT:102576009 (Edema of foot) are subclasses of SNOMEDCT:102574007 (Edema of leg).
Each class has an instance (SNOMEDCT:102574007 Instance, SNOMEDCT:26237000 Instance, SNOMEDCT:102576009 Instance, WHOART:0401 Instance), related to it by type.
Medical History 1,5,6,7,8,9, Medical History 2, Medical History 3 and Medical History 4 each point to one of these instances via salus:terminologyRef.]
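
The combined effect of equivalence and subsumption reasoning on Q1 can be imitated in a few lines. This is only a sketch of the idea; in SALUS the inference is done by Virtuoso. The class codes come from the slide, the exact equivalence edges are an assumption, and the `history-*` records are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Assumed equivalentClass edges between the classes named on the slide.
EQUIVALENT: List[Tuple[str, str]] = [
    ("MedDRA:10014239", "MedDRA:10030105"),
    ("MedDRA:10014239", "WHOART:0401"),
    ("MedDRA:10014239", "SNOMEDCT:102574007"),
]
# (child, parent) subclass edges from the slide.
SUBCLASS_OF: List[Tuple[str, str]] = [
    ("SNOMEDCT:26237000", "SNOMEDCT:102574007"),   # edema of ankle
    ("SNOMEDCT:102576009", "SNOMEDCT:102574007"),  # edema of foot
]


def expand(target: str) -> Set[str]:
    """Collect every class whose instances should also count as instances of `target`."""
    matched = {target}
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for left, right in EQUIVALENT:
            for known, other in ((left, right), (right, left)):
                if known in matched and other not in matched:
                    matched.add(other)
                    changed = True
        for child, parent in SUBCLASS_OF:
            if parent in matched and child not in matched:
                matched.add(child)
                changed = True
    return matched


# Hypothetical coded medical histories (not the sample documents from the deck).
histories: Dict[str, str] = {
    "history-a": "SNOMEDCT:26237000",
    "history-b": "WHOART:0401",
    "history-c": "SNOMEDCT:84114007",
}

wanted = expand("MedDRA:10014239")
hits = sorted(name for name, code in histories.items() if code in wanted)
print(hits)
```

None of the hypothetical histories is coded in MedDRA, yet two of them match the MedDRA query term, which is the behaviour the slide illustrates.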
Facilitating safety studies on EHR
systems




Q1: All patients with history of “Edema of Legs”
Q2: All patients with history of “Edema of Legs” AND “Heart Failure”
Q3: All patients with history of “Edema of Legs” AND history of “primary pulmonary hypertension”
Q4: All patients with history of “Edema of Legs” AND actively using a “vasodilating agent”

Vasodilating agent: SNOMEDCT 58944007
Instance 8: Patient is using DIPYRIDAMOLE 50MG TAB (RxNorm: 197622)

SNOMEDCT:58944007 <-- subClassOf -- SNOMEDCT:66859009 <-- equivalentClass --> NDF:C24056 -- ingredientOf --> NDF:C39726 <-- equivalentClass --> RxNorm:197622
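
The chain above can be walked programmatically to check whether a coded medication falls under a drug class. The sketch below is an assumption-laden illustration: the edges are exactly those in the chain, the function names are hypothetical, and the real system evaluates this through SPARQL with inference rather than in application code.

```python
from typing import List, Set, Tuple

EQUIVALENT: List[Tuple[str, str]] = [
    ("SNOMEDCT:66859009", "NDF:C24056"),
    ("NDF:C39726", "RxNorm:197622"),
]
INGREDIENT_OF: List[Tuple[str, str]] = [("NDF:C24056", "NDF:C39726")]  # (ingredient, product)
SUBCLASS_OF: List[Tuple[str, str]] = [("SNOMEDCT:66859009", "SNOMEDCT:58944007")]  # (child, parent)


def equivalents(term: str) -> Set[str]:
    """Return the term together with everything equivalent to it (transitively)."""
    seen, frontier = {term}, [term]
    while frontier:
        current = frontier.pop()
        for left, right in EQUIVALENT:
            for known, other in ((left, right), (right, left)):
                if known == current and other not in seen:
                    seen.add(other)
                    frontier.append(other)
    return seen


def ancestors(term: str) -> Set[str]:
    """Return all superclasses of the term."""
    found: Set[str] = set()
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for child, parent in SUBCLASS_OF:
            if child == current and parent not in found:
                found.add(parent)
                frontier.append(parent)
    return found


def drug_classes(product_code: str) -> Set[str]:
    """Classes reachable from a product code through its own code or its active ingredients."""
    classes: Set[str] = set()
    for product in equivalents(product_code):
        candidates = {product} | {ing for ing, prod in INGREDIENT_OF if prod == product}
        for candidate in candidates:
            for term in equivalents(candidate):
                classes |= {term} | ancestors(term)
    return classes


print("SNOMEDCT:58944007" in drug_classes("RxNorm:197622"))
```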
define input:inference "salus5"
prefix bridg: <http://bridgmodel.org/dam/3.0.3#>
prefix salus: <http://www.salus.eu/ontology/clinical#>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix dtype: <http://www.linkedmodel.org/schema/dtype#>
prefix owl: <http://www.w3.org/2002/07/owl#>
SELECT ?subject ?subjectBirthDate ?MedicationCodeValue ?MedicationDisplayName
WHERE {
    # Not only the medication's product code but also its active ingredients are checked,
    # through domain-specific rules
    ?termCode rdf:type <http://purl.bioontology.org/ontology/SNOMEDCT/58944007> .
    { ?termCode salus:ingredientOf ?drugClassA . ?drugClassA owl:equivalentClass ?drugClassB . }
    UNION
    { ?termCode salus:ingredientOf ?drugClassB . }
    ?drugClassA owl:equivalentClass ?drugClassB .

    # The medication's coded representation is retrieved as ?medTerminologyCode
    ?medTerminologyCode rdf:type ?drugClassB .
    ?medTerminologyCode rdfs:label ?MedicationDisplayName .
    ?CodedValue salus:terminologyRef ?medTerminologyCode .
    ?classCode bridg:item ?CodedValue .
    ?product bridg:classCode ?classCode .
    ?agenta bridg:performing ?product .
    ?performedSubstanceAdministration bridg:usedConcomitantAgent ?agenta .
    ?performedSubstanceAdministration bridg:involvedSubject ?subject .

    # Query parameters are mapped to related fields, like date of birth
    ?subject bridg:performingBiologicEntity ?performingBiologicalEntity .
    ?performingBiologicalEntity bridg:birthDate ?birthdate .
    ?birthdate bridg:value ?birthdatevalue .
    ?birthdatevalue dtype:value ?subjectBirthDate .
    ?CodedValue bridg:code ?MedicationCode .
    ?MedicationCode dtype:value ?MedicationCodeValue .

    # Patients with history of "Edema of Legs"
    ?performedObservation2 bridg:involvedSubject ?subject .
    ?performedObservation2 bridg:resulted ?performedObservationResult2 .
    ?performedObservationResult2 bridg:value ?CodedValue2 .
    ?CodedValue2 salus:terminologyRef ?terminologyCode2 .
    ?terminologyCode2 rdf:type <http://purl.bioontology.org/ontology/MDR/10014239> .
}
Performance Evaluation



On an average desktop computer (Intel Core 2 Duo 3 GHz CPU and 4 GB RAM), the semantic mediation of a medical history in CCD format to the SALUS core ontology takes approximately 110 seconds.
An example SPARQL query to check the underlying conditions of patients can be executed in under 7 seconds on the knowledge base, which hosts more than 4.7 million triples.
These results are quite encouraging for a real-life deployment of the initial Semantic Framework.
Thank you...
