### Nonlinear belief models - Optimal Learning

```Tutorial:
Optimal Learning in the Laboratory Sciences
Working with nonlinear belief models
December 10, 2014
Warren B. Powell
Kris Reyes
Si Chen
Princeton University
http://www.castlelab.princeton.edu
Lecture outline
• Nonlinear belief models
The knowledge gradient can be hard to compute:
\nu_x^{KG,n} = \mathbb{E}\left[ \max_y F(y, K^{n+1}(x)) \right] - \max_y F(y, K^n)
The expectation can be hard to compute when the belief model is nonlinear.
The belief model is often nonlinear, such as the kinetic model for fluid dynamics.
This has motivated research into how to handle these problems.
Proposal: Assume a finite number of truths (discrete priors), e.g. L=3 possible candidate truths.
The utility curve depends on kinetic parameters, e.g. \kappa_1, \kappa_2, \kappa_3.
We maintain a weight for each candidate to represent how likely it is to be the truth, e.g. p1=p2=p3=1/3 means all three are equally likely.
The weights on the candidate truths are also weights on the choice of kinetic parameters.
Estimation: a weighted sum of all candidate truths
There are many possible candidate truths
For each candidate truth, the measurements are noisy
Suppose we make a measurement.
Weights are updated upon observation: candidates close to the observation become more likely, the others less likely.
The estimate is then updated using our observation.
Average marginal value of information
Best estimate: maximum utility value
Marginal value of information: best estimate after the experiment minus best estimate before the experiment
Average marginal value of information: average across all candidate truths and noise
KGDP makes decisions by maximizing the average marginal value of information
After several observations, the weights can tell us
Candidate Truths (2D)
Beliefs on parameters produce a family of surfaces
Before any measurements
[Figure: KG map and prior estimate. Axes: oil droplet diameter (nm) and inner water droplet diameter (nm).]
Do we explore? The KG map shows us where we learn the most.
… or do we exploit? The prior estimate shows the region where we think we will get the best results (but we might be wrong).
This is the classic exploration vs. exploitation problem.
After 1 measurement
[Figure: two panels, one labeled "Posterior Estimate". Axes: oil droplet diameter (nm) and inner water droplet diameter (nm).]
After 2 measurements
[Figure: two panels, one labeled "Posterior Estimate". Axes: oil droplet diameter (nm) and inner water droplet diameter (nm).]
After 5 measurements
[Figure: two panels, one labeled "Posterior Estimate". Axes: oil droplet diameter (nm) and inner water droplet diameter (nm).]
After 10 measurements
[Figure: two panels, one labeled "Posterior Estimate". Axes: oil droplet diameter (nm) and inner water droplet diameter (nm).]
After 20 measurements
[Figure: two panels, one labeled "Posterior Estimate". Axes: oil droplet diameter (nm) and inner water droplet diameter (nm).]
After 20 measurements
[Figure: posterior estimate compared with the truth. Axes: oil droplet diameter (nm) and inner water droplet diameter (nm).]
Kinetic parameter estimation
Besides learning where the optimal utility is, the KG policy can help learn kinetic parameters.
The distribution over candidate truths induces a distribution over their respective parameters.
Uniform prior distribution
[Figure: candidate probability vs. candidate truth, and parameter probability for individual parameters.]
Uniform distribution of possible parameter vectors…
… translates to a random sample of a uniform distribution for an individual parameter.
Kinetic parameter estimation
[Figure: probability distributions over individual kinetic parameters. Panels: prior distribution and after 20 measurements.]
Kinetic parameter estimation
[Figure: parameter probabilities after 20 measurements. Panels: low prefactor/low barrier and high prefactor/high barrier.]
• Most probable prefactor/energy barriers come in pairs.
• Yield similar rates at room temperature.
• KG is learning these rates.
Kinetic parameter estimation
After 50 measurements, distribution of belief about vectors…
… distribution of belief about one parameter: k_ripe
After 50 measurements, distribution of belief about vectors…
… distribution of belief about one parameter: k_coalesce
Collaboration with McAlpine Group
Opportunity Cost
Percentage opportunity cost: difference between the estimated and true optimum values, relative to the true optimum value
Rate Error
Rate error (log-scale): difference between the estimated rate and the true optimal rate
```
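
The discrete-prior procedure in the slides (maintain weights on candidate truths, update them after each observation, and choose the experiment that maximizes the average marginal value of information) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes Gaussian measurement noise with a known standard deviation `sigma`, a toy utility curve `x * exp(-k * x)` standing in for the real kinetic model, and a Monte Carlo average over noise. The names `update_weights` and `kgdp_values` are hypothetical.

```python
import numpy as np

def update_weights(p, f_x, y_obs, sigma):
    """Bayes update of candidate weights after one observation.

    p     : (L,) current weights on the L candidate truths
    f_x   : (L,) value each candidate predicts at the measured point
    y_obs : observed value
    sigma : assumed std. dev. of Gaussian measurement noise
    """
    log_lik = -0.5 * ((y_obs - f_x) / sigma) ** 2
    w = p * np.exp(log_lik - log_lik.max())  # subtract max for stability
    return w / w.sum()

def kgdp_values(p, F, sigma, n_noise=200, seed=0):
    """Monte Carlo estimate of the average marginal value of information.

    F : (L, M) utility of each candidate truth at M candidate experiments.
    Averages over candidate truths (weighted by p) and sampled noise.
    """
    rng = np.random.default_rng(seed)
    L, M = F.shape
    best_now = (p @ F).max()           # best estimate before the experiment
    noise = rng.normal(0.0, sigma, size=n_noise)
    kg = np.zeros(M)
    for x in range(M):
        total = 0.0
        for l in range(L):             # pretend candidate l is the truth
            for eps in noise:
                p_next = update_weights(p, F[:, x], F[l, x] + eps, sigma)
                total += p[l] * (p_next @ F).max()  # best estimate after
        kg[x] = total / n_noise - best_now
    return kg

# Toy setup: L = 3 candidate truths, equally likely a priori.
x_grid = np.linspace(0.0, 4.0, 21)
k_candidates = np.array([0.5, 1.0, 2.0])            # hypothetical parameters
F = x_grid[None, :] * np.exp(-k_candidates[:, None] * x_grid[None, :])
p0 = np.full(3, 1.0 / 3.0)

kg = kgdp_values(p0, F, sigma=0.1)
x_next = int(np.argmax(kg))                         # experiment to run next
p1 = update_weights(p0, np.array([0.1, 0.5, 0.9]), y_obs=0.5, sigma=0.2)
```

At a point where all candidates predict the same value (here `x = 0`), an observation cannot change the weights, so the estimated value of information there is zero. Because the expectation is approximated by sampling, individual entries of `kg` carry Monte Carlo error.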