Standards for Synthetic Biology

Report
Developing Standards:
Case Studies
www.sys-bio.org
www.sbml.org
www.sbolstandards.org
blog.analogmachine.org
Herbert M Sauro
Dept. of Bioengineering
University of Washington, Seattle, WA
[email protected]
1
Importance of Standards
Imagine a world where:
Each company made its own incompatible nuts, bolts and screws.
Every town had its own way to measure time.
Every internet provider used different protocols for the TCP/IP stack,
email, the web, and so on.
Standards are vital for the normal functioning of society
2
At least two ways to start a standard:
1. Top-down: institutionalized stick and carrot
2. Grass Roots
3
Two Examples
SBML: Systems Biology Markup Language
SBOL: Synthetic Biology Open Language
4
Simulation of Computational Models
Simulation
5
Why? Study Perturbations
Apoptosis
Change the activity of a protein, e.g. p53, by adding an inhibitor.
http://www.sapphirebioscience.com
What effect does this have on cell death and/or proliferation?
There may be multiple paths or multiple effects
6
How it started:
SCAMP and Gepasi: 1980s/1990s
7
Exchange of Computational Models
In 1999/2000 a project was started at Caltech
with initial funding from Japan to devise
an interchange language:
SBML: Systems Biology Markup Language
8
SBML
SBML: Systems Biology Markup Language
Used to represent homogeneous multi-compartmental biochemical systems
9
SBML in a Nutshell
“Systems Biology Markup Language”
• A machine-readable format for representing
computational models in systems biology
• Domain: systems of biochemical reactions
• Specified using XML
• Components in SBML reflect the natural
conceptual constructs of the domain
• Now over 200 tools use SBML
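To make "specified using XML" concrete, here is a minimal sketch that reads a tiny SBML-style document with Python's standard library. The model (`toy_model`, species `S` and `P`, reaction `J1`) is hypothetical and written only for this example. It omits attributes that a full SBML Level 3 file would need and has not been checked with an SBML validator.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy model written for this example; not from the talk and
# not checked against an SBML validator (some required attributes omitted).
SBML_TEXT = """<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy_model">
    <listOfCompartments>
      <compartment id="cell" size="1" constant="true"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="S" compartment="cell" initialConcentration="10"/>
      <species id="P" compartment="cell" initialConcentration="0"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="J1">
        <listOfReactants>
          <speciesReference species="S"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="P"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>"""

# Namespace of SBML Level 3 Version 1 Core, needed for element lookups.
NS = {"sbml": "http://www.sbml.org/sbml/level3/version1/core"}


def ids(parent, path, attr="id"):
    """Collect one attribute from every element matching the path."""
    return [el.get(attr) for el in parent.findall(path, NS)]


def summarize(sbml_text):
    """Return the compartments, species and reactions named in the model."""
    model = ET.fromstring(sbml_text).find("sbml:model", NS)
    reactions = {}
    for rxn in model.findall("sbml:listOfReactions/sbml:reaction", NS):
        reactions[rxn.get("id")] = (
            ids(rxn, "sbml:listOfReactants/sbml:speciesReference", "species"),
            ids(rxn, "sbml:listOfProducts/sbml:speciesReference", "species"),
        )
    return {
        "compartments": ids(model, "sbml:listOfCompartments/sbml:compartment"),
        "species": ids(model, "sbml:listOfSpecies/sbml:species"),
        "reactions": reactions,
    }


print(summarize(SBML_TEXT))
```

The element names (compartment, species, reaction) map directly onto the domain concepts listed above, which is the point of the bullet about natural conceptual constructs.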
10
SBML in a Nutshell
“Systems Biology Markup Language”
• Simple compartments (well-stirred reactor)
• Internal/external species
• Reaction schemes
• Global parameters
• Arbitrary rate laws
• DAEs (ODEs + algebraic functions, constraints)
• Physical units/model notes
• Annotation – extension capability
• Events
11
SBML – Systems Biology Markup Language
12
Model Exchange Standards: SBML, CellML
SBML is primarily a way to describe the biology of
cellular networks from which the mathematical models
can be automatically derived.
CellML is a math-based description from which the
underlying biology can be inferred.
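As a rough illustration of deriving the mathematics from a biological description, the sketch below builds the rate of change of each species from a list of reactions and their stoichiometries, then integrates with a simple forward-Euler step. The two-reaction network, the rate constants and the integrator are assumptions made for the example. They are not taken from the talk or from any SBML tool.

```python
# Hypothetical network: S -> P at rate 0.5*S, and P -> (degraded) at rate 0.1*P.
REACTIONS = [
    {"reactants": {"S": 1}, "products": {"P": 1}, "rate": lambda c: 0.5 * c["S"]},
    {"reactants": {"P": 1}, "products": {}, "rate": lambda c: 0.1 * c["P"]},
]


def derivatives(conc, reactions):
    """dX/dt for each species: sum over reactions of net stoichiometry * rate."""
    dxdt = {name: 0.0 for name in conc}
    for rxn in reactions:
        v = rxn["rate"](conc)
        for species, n in rxn["reactants"].items():
            dxdt[species] -= n * v  # reactants are consumed
        for species, n in rxn["products"].items():
            dxdt[species] += n * v  # products are produced
    return dxdt


def simulate(conc, reactions, dt=0.01, steps=1000):
    """Forward-Euler integration; chosen for brevity, not accuracy."""
    conc = dict(conc)
    for _ in range(steps):
        d = derivatives(conc, reactions)
        for species in conc:
            conc[species] += dt * d[species]
    return conc


print(derivatives({"S": 10.0, "P": 0.0}, REACTIONS))
print(simulate({"S": 10.0, "P": 0.0}, REACTIONS))
```

The modeler only states which species react and at what rate; the differential equations follow mechanically from the stoichiometry.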
13
There are many modeling
software tools that use SBML
www.sbml.org
14
SBML Ecosystem
Built around SBML:
• Unambiguous model exchange
• Diagrams
• Databases
• Journals
• Semantic annotations
• Simulator comparison and compliance
Related standards:
• SEDML: Simulation Experiment Description Markup Language
• SBGN: Systems Biology Graphical Notation
15
Model repositories
Nicolas Le Novere
BioModels.net
As of Sep 2011: 366 curated models and 398 uncurated models.
http://www.ebi.ac.uk/biomodels/
16
MIRIAM: Minimum Information Requested in the
Annotation of biochemical Models
MIRIAM is not a file format but a minimum specification on how a
model should be made available to the community:
• Reference correspondence: the model is encoded in a
recognized, public, standardized, machine-readable format.
• Attribution annotation: the model has to provide the
citation of the reference description, list its creators,
and be attached to some terms of distribution.
• External resource annotation: each component of the model
must be annotated to allow its unambiguous identification.
17
Semantic Annotations
1. SBO: Systems Biology Ontology (quantitative terms)
2. MIASE: The Minimum Information About a Simulation Experiment
3. TEDDY: The Terminology for the Description of Dynamics
4. KiSAO: Kinetic Simulation Algorithm Ontology
5. Missing: an audit trail of the modeling process.
18
SBO: Systems Biology Ontology
1. [Term]
   id: SBO:0000002
   name: quantitative parameter
   def: "A number representing a quantity that defines certain characteristics of systems or functions. A parameter may be part of a calculation, but its value is not determined by the form of the equation itself, and may be arbitrarily assigned." []
   relationship: part_of SBO:0000000 ! Systems Biology Ontology
2. [Term]
   id: SBO:0000012
   name: mass action kinetics
   def: "The Law of Mass Action, first expressed by Waage and Guldberg in 1864 (Waage, P., Guldberg, C. M. Forhandlinger: Videnskabs-Selskabet i Christiana 1864, 35) states that…..." []
   is_a: SBO:0000001 ! rate law
Terms can be queried programmatically via a web service
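The talk does not describe the web service interface, so it is not shown here. As an offline alternative, this minimal sketch parses a term stanza in the flat-file layout used above into a dictionary. The function name `parse_term` and the shortened stanza are illustrative assumptions.

```python
# Shortened copy of the mass action kinetics term, for illustration only.
OBO_STANZA = """[Term]
id: SBO:0000012
name: mass action kinetics
is_a: SBO:0000001 ! rate law
"""


def parse_term(stanza):
    """Parse one [Term] stanza into {tag: [values]}, dropping '!' comments."""
    term = {}
    for line in stanza.splitlines():
        line = line.strip()
        if not line or line.startswith("["):
            continue  # skip blank lines and the [Term] header
        tag, _, value = line.partition(": ")
        value = value.split(" ! ")[0]  # keep the identifier, drop the comment
        term.setdefault(tag, []).append(value)
    return term


print(parse_term(OBO_STANZA))
```

Values are kept in lists because a tag such as `is_a` may appear more than once in a stanza.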
19
Systems Biology Ontology in SBML
<reaction sboTerm="SBO:0000062"> <!-- continuous framework -->
  <listOfReactants>
    <speciesReference species="S" sboTerm="SBO:0000015" /> <!-- substrate -->
  </listOfReactants>
  <listOfProducts>
    <speciesReference species="P" sboTerm="SBO:0000011" /> <!-- product -->
  </listOfProducts>
  <listOfModifiers>
    <modifierSpeciesReference species="E" sboTerm="SBO:0000014" /> <!-- enzyme -->
  </listOfModifiers>
  <kineticLaw sboTerm="SBO:0000031"> <!-- Briggs-Haldane equation -->
    <listOfParameters>
      <parameter id="Km" sboTerm="SBO:0000027" /> <!-- Michaelis constant -->
      <parameter id="kp" sboTerm="SBO:0000025" /> <!-- catalytic rate constant -->
    </listOfParameters>
    <math xmlns="http://www.w3.org/1998/Math/MathML">
      <apply>
        <divide />
        <apply>
          <times />
          <ci>E</ci>
          <ci>kp</ci>
          <ci>S</ci>
        </apply>
        <apply>
          <plus />
          <ci>Km</ci>
          <ci>S</ci>
        </apply>
      </apply>
    </math>
  </kineticLaw>
</reaction>
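The MathML in this reaction encodes the rate expression E * kp * S / (Km + S). A minimal sketch that evaluates it directly is given below. The function name and the numeric values are assumptions chosen for illustration.

```python
def rate(E, S, kp, Km):
    """Evaluate v = E * kp * S / (Km + S), the expression in the kinetic law."""
    if Km + S == 0:
        raise ValueError("Km + S must be non-zero")
    return E * kp * S / (Km + S)


# Illustrative values only: when S equals Km the rate is half of E * kp.
print(rate(E=1.0, S=2.0, kp=3.0, Km=2.0))
# At high substrate the rate approaches E * kp.
print(rate(E=1.0, S=2000.0, kp=3.0, Km=2.0))
```

The SBO terms let a tool recognize which symbol is the substrate, the enzyme, the Michaelis constant and so on without relying on the identifiers chosen by the modeler.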
20
European Bioinformatics Institute
Application: Simulator Compliance
SBML Compliance
[Figure: bar chart of the number of simulation results returned for 150 models by each tool (VCell, SBToolBox2, SBML ode Solver, roadRunner, Oscill8, MathSBML, Jsim, Jarnac, COPASI, BioUML); axis from 0 to 150.]
21
The Results
[Figure: histogram of the % agreement of simulation results. x-axis: agreement bins (0% to 20%, 20% to 40%, 40% to 60%, 60% to 80%, 80% to 99%, 100%); y-axis: number of models (0 to 80).]
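The talk does not say how agreement between simulators was computed. One plausible approach, sketched here under that caveat, is to compare each tool's trajectory for a model with a reference trajectory within a tolerance and report the percentage of tools that match. The tolerances, the tool names and the data are all hypothetical.

```python
import math


def agrees(reference, candidate, rel_tol=1e-4, abs_tol=1e-6):
    """True if both trajectories have equal length and match point by point."""
    if len(reference) != len(candidate):
        return False
    return all(
        math.isclose(r, c, rel_tol=rel_tol, abs_tol=abs_tol)
        for r, c in zip(reference, candidate)
    )


def percent_agreement(reference, results_by_tool, **tolerances):
    """Percentage of tools whose trajectory matches the reference."""
    if not results_by_tool:
        raise ValueError("no simulator results supplied")
    matches = sum(
        agrees(reference, trajectory, **tolerances)
        for trajectory in results_by_tool.values()
    )
    return 100.0 * matches / len(results_by_tool)


# Hypothetical results for one model from four made-up tools.
reference = [1.0, 0.5, 0.25]
results = {
    "toolA": [1.0, 0.5, 0.25],
    "toolB": [1.0, 0.5, 0.2500001],
    "toolC": [1.0, 0.6, 0.3],
    "toolD": [1.0, 0.5],
}
print(percent_agreement(reference, results))
```

Running this per model and binning the percentages would give a histogram of the same shape as the one on this slide.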
22
Other Proposed Standards
Standardizing the diagrammatic notation
http://www.sbgn.org/Main_Page
23
What we all learned
24
Fact:
Developing a standard has both technical
and sociological challenges.
The sociological challenges may be greater :(
25
Rule #1:
There must be a problem (i.e. an actual
need) that a particular community wants
to solve.
• Clear scope
• Covers what is needed
• Doesn’t force you to deal with things
that are not needed
26
Rule #2:
Building a community from day one is
of the utmost importance.
• Build trust
• Build consensus
• Build enthusiasm
• Build ownership
27
Rule #3:
For a standard to succeed, the central players
must provide tools and documentation to help
the community use the standard.
• Easy to implement
• Low ‘buy in’ cost
28
Rule #4:
The process is long and drawn out, far beyond
the normal patience of review panels and
funding agencies.
29
Summary
Initial cost for the SBML development:
Initial version was funded by JST (roughly 250K direct per
year for three years). Could probably get by with 150K
direct. This funds a core team which is involved in:
1. Documentation
2. Organizing two workshops per year
3. Developing the initial source libraries
4. Developing a governance model
5. Following discussions on mailing lists/workshops to address
the needs of the community
6. Maintaining civility during discussions!
30
Centralized development of supporting
software libraries:
1) Prevented the standard from diverging
2) As extensions or modifications were agreed to
by the community it was relatively easy for
platform developers to incorporate the changes
into their software.
3) Software developed in C/C++ to make the
library cross-language (Java came later).
31
Current work of my group:
Model Reproducibility
[Figure: reproducibility workflow connecting biology, data, the simulation tool, SBML, and SEDML.]
SEDML: what you did with the model
32
Synthetic Biology
33
Synthetic biology
“The design and construction of new
biological entities such as enzymes,
genetic circuits, cells, and organs or the
redesign of existing biological systems.”
Drew Endy (Stanford)
34
The Immediate Need
Take any current publication on a synthetic
circuit and try to reproduce it. Let me know how
you get on.
35
The long-term vision: Design, Build, Test
[Figure: cycle of Specification, Design, Build, and Testing/Analysis, with a plot of GFP (RFU) against time.]
36
Synthetic Biology Open Language (SBOL) – SBOL Semantic
[Figure: Synthetic Biologist A describes a new device using SBOL semantic and SBOL visual and sends it to Synthetic Biologist B or to an engineer for fabrication.]
Example DNA component B0015 and its sequence annotations:
• 1-80: B0010 (Terminator)
• 81-88: BioBrick Scar
• 89-129: B0012 (Terminator)
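The B0015 example on this slide (B0010 at 1-80, a BioBrick scar at 81-88, B0012 at 89-129) can be sketched as a small data structure. The class and method names below are illustrative assumptions and are not the API of any SBOL library. The total length of 129 is inferred from the last annotation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SequenceAnnotation:
    start: int  # 1-based, inclusive
    end: int  # inclusive
    component: str  # name of the sub-component at this position
    kind: str  # feature type, e.g. "Terminator"


@dataclass
class DnaComponent:
    name: str
    length: int
    annotations: list = field(default_factory=list)

    def tiles_exactly(self):
        """True if the annotations cover 1..length with no gaps or overlaps."""
        expected_start = 1
        for ann in sorted(self.annotations, key=lambda a: a.start):
            if ann.start != expected_start or ann.end < ann.start:
                return False
            expected_start = ann.end + 1
        return expected_start == self.length + 1


# Coordinates from the slide; length 129 is an inference, not a stated value.
B0015 = DnaComponent(
    name="B0015",
    length=129,
    annotations=[
        SequenceAnnotation(1, 80, "B0010", "Terminator"),
        SequenceAnnotation(81, 88, "BioBrick Scar", "BioBrick Scar"),
        SequenceAnnotation(89, 129, "B0012", "Terminator"),
    ],
)
print(B0015.tiles_exactly())
```

A check like `tiles_exactly` is the kind of consistency test a receiving tool could run before sending a design to fabrication.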
37
Some History
The synthetic biology standardization effort was started with a
grant from Microsoft in 2008 (100K). The first meeting was held
in Seattle.
The first draft proposal was called PoBoL but has since been
renamed SBOL: Synthetic Biology Open Language.
We have (somehow) managed to organize two meetings a year
since 2008; the next one is in Jan 2012 in Seattle.
38
Overall Aim of the
Standardization Effort
To support the synthetic biology workflow:
1. Laboratory parts management
2. Simulation/Analysis
3. Design
4. Codon optimization
5. Assembly
6. Repositories - preferably distributed
39
Overall Aim of the
Standardization Effort
Specifically:
• To allow researchers to electronically exchange
designs with round-tripping.
• To send designs to bio-fabrication centers for
assembly.
• To allow storage of designs in repositories and
for publication purposes.
40
Synthetic Biology
Synthetic Biology is Engineering,
i.e. it is not biology*
Design
Build
Test
* Beware of sending synthetic biology grant proposals to a biology panel
41
Synthetic Biology
Synthetic Biology is Engineering,
i.e. it is not biology*
Verification
Design
Build
Test
Debugging
* Beware of sending synthetic biology grant proposals to a biology panel
42
A Real Network (E. coli)
[Figure: host context and design/construction of the network, with two plots. Simulation: p3 against p1 (0.001 to 10), annotated "Increased Repression". Experimental data: relative fluorescence against IPTG (mM, 0.001 to 1000), annotated "Increased Repression".]
Entus et al, Systems and Synthetic Biology, 2007.
44
http://www.agricorner.com/e-coli-outbreak-german-farm-in-uelzen-likely-source/
Synthetic Networks
Concentration Detector
Generic Design:
If we control the level of feed-forward
inhibition we can tune the circuit:
[Figure: p3 against p1 (0.001 to 10).]
45
Synthetic Networks
Concentration Detector
Generic Design:
Input: IPTG
Output: GFP
[Figure: relative fluorescence against IPTG (mM), 0.001 to 1000.]
46
CAD Software: Engineering Cycle
[Figure: cycle of Design, Simulation, Fabrication, and Testing. Simulation plot: p3 against p1 (0.001 to 10). Testing plots: relative fluorescence against IPTG (mM, 0.001 to 1000).]
47
Computational tools and information
resources support each step
[Figure: the Specification, Design, Build, and Analysis steps, supported by laboratory information and public data, with tools including TinkerCell CAD, iBioSim, GDice, Clotho, BIOFAB, ApE Sequence Editor, and GenoCAD.]
48
Registry of Standard Biological Parts (BioBricks)
http://parts.mit.edu
• Provides free access to an open commons of basic biological functions that can be
used to program synthetic biological systems
• Anybody may contribute, draw upon, or improve the parts maintained within the
Registry.
Endy D, 2005. Nature 438: 449-453
49
SBOL is extensible, allows us to form
community subgroups
[Figure: Core SBOL at the centre: DNA component B0015 with sequence annotations 1-80 (B0010, type Terminator), 81-88 (BioBrick Scar, type BioBrick Scar), and 89-129 (B0012, type Terminator), where Terminator and BioBrick Scar are subclasses of Sequence Feature. Surrounding extensions: Physical and Host Context (sample UW002, cell strain MG1655, plasmid DNA pUW4510), Experimental Measurements, Computational Models, Visualization, and Assembly Methods. The identifier SS002 also appears in the diagram.]
50
TinkerCell: Project to explore the potential of
computer aided design in synthetic biology
First prototype, called Athena, developed by Bergmann and Chandran
51
Layered Architecture:
Based on C++/Qt
Octave,
52
Each component in the TinkerCell diagram is
associated with one or more tables
53
A TinkerCell model can be composed
of sub-models
54
Availability
www.tinkercell.com (Windows, Mac and Linux, released under BSD)
Contact author for details ([email protected])
56
Challenges in building SBOL
• Gaining consensus in a growing community
– Identifying and engaging stakeholders
• Fast pace of the field
– Terminology evolution
• “BioBricks” → “Parts” → “DNA components”
– Stability of use cases
• “Standard” and “Research needs” seem contradictory
– Software for synthetic biology is new
• Scarcity of data sources
– Quality “knowledge” about elements
– Heterogeneity of existing annotations
• Funding
57
Who are "we"?
University of Washington
Deepak Chandran
John Gennari
Michal Galdzicki
Herbert Sauro
University of California, Berkeley
J. Christopher Anderson
Boston University
Douglas Densmore
Virginia Bioinformatics Institute
Laura Adam
Matthew Lux
Mandy Wilson
Jean Peccoud
University of Toronto
Raik Gruenberg
http://www.sbolstandard.org/
BIOFAB
Cesar Rodriguez
Akshay Maheshwari (now UCSD)
Drew Endy (Stanford)
Joint BioEnergy Institute
Timothy Ham
University of Utah
Barry Moore
Nicholas Roehner
Chris J. Myers
iBioSim
Imperial College London
Guy-Bart Stan
Newcastle University (UK)
Aniel
Recent Commercial Interest
BBN, DNA 2.0, Agilent
Life Technologies, AutoDesk
58
Acknowledgements:
The People and the Support
Hamid Bolouri
Andrew Finney
Mike Hucka
Herbert Sauro
Frank Bergmann
Deepak Chandran
Vijay Chickarmane
Michal Galdzicki
Lucian Smith
Funding in chronological order (2000 -> 2011):
……
59
Textbook
Enzyme Kinetics for Systems Biology
Available as e-book or paperback on www.analogmachine.org &
318 pages, 94 illustrations and 75 exercises
E-book - $9.95
Paperback - $39.95
Author: H M Sauro
60
