boston14_canner

Report
bipolate: A Stata command for
bivariate interpolation with particular
application to 3D graphing
Joseph Canner, MHS
Xuan Hui, MD ScM
Eric Schneider, PhD
Johns Hopkins University
Stata Conference
Boston, MA
August 1, 2014
Background
• Educational outcome study
– Continuous outcome
– Two categorical (1-5) predictors
• Panel data
– 8 time periods
– 285 observations per time period
• Researcher desired 3D plot of outcome
versus two predictors
5
Solution #1: contour
160
150
140
4
130
EduNEW
COMPSS
120
110
100
90
3
80
70
60
50
2
40
1
2
3
CommunicationNEW
4
Solution #2: surface*
. surface CommunicationNEW
EduNEW COMPSS, xlabel(1/5)
ylabel(1/5)
* by Adrian Mander, available from SSC
Result
COMPSS
148.00
5
4
94.00
3
2
40.00
1
1
2
3
4
CommunicationNEW
5
EduNEW
collapse first
. collapse (mean)
mCOMPSS=COMPSS, by(EduNEW
CommunicationNEW)
. surface CommunicationNEW
EduNEW mCOMPSS, xlabel(1/5)
ylabel(1/5)
Result
(mean) COMPSS
112.82
5
4
78.41
3
2
44.00
1
1
2
3
4
CommunicationNEW
5
EduNEW
SAS Solution
proc g3grid data=a out=b ;
grid EduNEW*CommNEW=mCOMPSS /
axis1=1 to 5 by 0.1
axis2=1 to 5 by 0.1;
run;
SAS Solution (cont’d)
proc g3d data=b;
plot EduNEW*CommNEW=mCOMPSS /
xticknum=5 yticknum=5 grid;
run;
SAS Result
Stata Conference 2013
Wishes and grumbles session: no plans
to implement 3D graphing
SAS PROC G3GRID
• Interpolation options:
– <default>: biquintic polynomial
• PARTIAL: use splines for derivatives
• NEAR=n: number of nearest neighbors
(default=3)
– SPLINE: bivariate spline
• SMOOTH=numlist: smoothed spline
– JOIN: linear interpolation
Bivariate interpolation
• SAS G3GRID default
• Akima, Hiroshi (1978), "A Method of Bivariate
Interpolation and Smooth Surface Fitting for
Irregularly Distributed Data Points," ACM
Transaction on Mathematical Software, 4,
148-159.
• Fortran 77 originally in the NCAR library;
Fortran 77 and Fortran 90 versions freely
available on the web
History
“The original version of BIVAR was written by Hiroshi
Akima in August 1975 and rewritten by him in late 1976.
It was incorporated into NCAR's public software libraries
in January 1977. In August 1984 a new version of
BIVAR, incorporating changes described in the Rocky
Mountain Journal of Mathematics article cited below,
was obtained from Dr. Akima by Michael Pernice of
NCAR's Scientific Computing Division, who evaluated it
and made it available in February, 1985.”
Ref: Hiroshi Akima, On Estimating Partial Derivatives for Bivariate
Interpolation of Scattered Data, Rocky Mountain Journal of
Mathematics, Volume 14, Number 1, Winter 1984.
Algorithm summary
• XY plane divided into triangular cells
• Bivariate quintic polynomial in X and Y
fitted to each triangular cell
• Coefficients are determined by
continuity requirements and by
estimates of partial derivatives at the
vertices and along triangle edges
Algorithm features
• Invariant to certain transformations:
– Rotation of XY coordinate system
– Linear scale transformation of the Z axis
– Tilting of the XY plane
Algorithm features (cont’d)
• Interpolating function and first-order
partial derivatives are continuous
• Local method: change in data in one
area does not effect the interpolating
function in another area
• Gives exact results when all points lie in
a plane
bipolate command
Syntax: bipolate xvar yvar zvar
[if] [in] [using] [, options]
bipolate options
• method: interpolation or filling
• xgrid, ygrid: specify x-axis and y-axis
values to use for interpolation
• fillusing: specify data set to use for
filling
• collapse: how to handle multiple values
of z at a given x and y
• saving: save the resulting data to set to
disk
Use of bipolate
. bipolate CommunicationNEW
EduNEW COMPSS, xgrid(1(0.1)5)
ygrid(1(0.1)5) method(interp)
saving(test_bip)
. use test_bip, clear
. surface EduNEW
CommunicationNEW COMPSS_mod
Result
mean of COMPSS
104.64
5.00
68.45
3.00
CommunicationNEW
32.27
1.00
1.00
3.00
EduNEW
5.00
SAS Result
Remaining puzzles
• Why are there small differences
between interpolated values?
– SAS: “This default method is a modification
of that described by Akima (1978)”
• Re-orienting axis
surface …, … xscale(reverse)
5
4
68.45
3
CommunicationNEW
2
1
5
4
3
EduNEW
2
32.27
1
mean of COMPSS
104.64
5
bipolate+contour
160
150
130
110
100
90
80
70
2
3
mean of COMPSS
120
60
50
40
1
EduNEW
4
140
1
2
3
CommunicationNEW
4
5
Future plans
• Make available on SSC within a few
weeks
• Test other data sets
• Testing and debugging by Stata
community
Cobar Mine Data
26.00
z
7.00
18.50
-32.50
11.00
-16.00
-72.00
34.00
t1
84.00
t2
Cobar Mine Data
method(interp)
27.17
z_none
7.00
-11.22
-32.50
-49.61
-16.00
-72.00
34.00
t1
84.00
t2
Cobar Mine Data
method(fill)
(mean) z
27.17
7.00
13.92
-32.50
0.68
-16.00
-72.00
34.00
t1
84.00
t2
Cobar Mine Data
-80
-60
-40t2
-20
0
26
25
24
23
22
21
20
19
18
17
16
15
14
13
12
11
-20
0
20
40
t1
60
80
z
twoway contour
Cobar Mine Data
-80
-60
t2
-40
(mean) z
-20
0
bipolate+twoway contour
-20
0
20
40
t1
60
80
28
27
26
25
24
23
22
21
20
19
18
17
16
15
14
13
12
11
10
9
8
7
6
5
4
3
2
1
0
Possible Future Enhancements
• Implement partial and near options
• Implement scale/noscale option
• Implement spline and smooth spline
interpolation
• See if Mata has functions that can
reproduce the algorithm more
compactly

similar documents