R for Macroecology Lecture 5

Report
R for Macroecology
Spatial data continued
Projections

Cylindrical projections
Lambert CEA
Behrmann EA

Latitude of true scale = 30
Choosing a projection

What properties are important?





Angles (conformal)
Area (equal area)
Distance from a point (equidistant)
Directions should be strait lines (gnomonic)
Minimize distortion

Cylindrical, conic, azimuthal
http://www.geo.hunter.cuny.edu/~jochen/gtech201/lectures/lec6concepts/map%20coordinate%20systems/how%20to%20choose%20a%20projection.htm
Projecting points



project() function in the proj4 package is good
Can also use project() from rgdal
spTransform() (in rgdal) works for SpatialPoints,
SpatialLines, SpatialPolygons . . .

Can also handle transformations from one datum to another
Projecting points
>
>
>
>
>
>
lat = rep(seq(-90,90,by = 5),(72+1))
long = rep(seq(-180,180,by = 5),each = (36+1))
xy = project(cbind(long,lat),"+proj=cea +datum=WGS84 +lat_ts=30")
par(mfrow = c(1,2))
plot(long,lat)
plot(xy)
Projecting a grid
> mat = raster("MAT.tif")
> mat = aggregate(mat,10)
> bea = projectExtent(mat,"+proj=cea +datum=WGS84 +lat_ts=30")
> mat
class
: RasterLayer
dimensions : 289, 288, 83232 (nrow, ncol, ncell)
resolution : 0.04166667, 0.04166667 (x, y)
extent
: 0, 12, 47.96667, 60.00833 (xmin, xmax, ymin, ymax)
projection : +proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs
+towgs84=0,0,0
values
: in memory
min value
: -22.88
max value
: 113.56
> bea
class
: RasterLayer
dimensions : 289, 288, 83232 (nrow, ncol, ncell)
resolution : 4016.896, 3137.077 (x, y)
extent
: 0, 1156866, 5450663, 6357279 (xmin, xmax, ymin, ymax)
projection : +proj=cea +datum=WGS84 +lat_ts=30 +ellps=WGS84
+towgs84=0,0,0
values
: none
Projecting a grid
> bea = projectExtent(mat,"+proj=cea +datum=WGS84 +lat_ts=30")
> res(bea) = xres(bea)
> matBEA = projectRaster(mat,bea)
> mat
class
: RasterLayer
dimensions : 289, 288, 83232 (nrow, ncol, ncell)
resolution : 0.04166667, 0.04166667 (x, y)
extent
: 0, 12, 47.96667, 60.00833 (xmin, xmax, ymin, ymax)
projection : +proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs +towgs84=0,0,0
values
: in memory
min value
: -22.88
max value
: 113.56
> matBEA
class
dimensions
resolution
extent
projection
values
min value
max value
:
:
:
:
:
:
:
:
RasterLayer
169, 288, 48672 (nrow, ncol, ncell)
4638.312, 4638.312 (x, y)
0, 1335834, 4721690, 5505565 (xmin, xmax, ymin, ymax)
+proj=cea +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0 +lat_ts=30
in memory
-21.65266
113.3013
How does it look?
What happened?
x = xFromCell(bea,1:ncell(bea))
y = yFromCell(bea,1:ncell(bea))
plot(x,y,pch = ".")
xyLL = project(cbind(x,y),
"+proj=cea +datum=WGS84
+latts=30”,inverse = T)
plot(xyLL,pch = ".")
What happened

Grid of points in lat-long (where each point corresponds
with a BEA grid cell)
Sample original raster at those points (with interpolation)
Different spacing in
y direction

Identical spacing in x direction
What are the units?
> matBEA
class
: RasterLayer
dimensions : 169, 288, 48672 (nrow, ncol, ncell)
resolution : 4638.312, 4638.312 (x, y)
extent
: 0, 1335834, 4721690, 5505565 (xmin, xmax, ymin, ymax)
projection : +proj=cea +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
+lat_ts=30
values
: in memory
min value
: -21.65266
max value
: 113.3013
Meters, along the latitude of true scale (30N and 30S)
A break to try things out

Projections
Spatially structured data

Tobler’s first law of geography

“Everything is related to everything else, but near things are
more related than distant things.”

Waldo Tobler

Nearby data points often have similar conditions

Understanding these patterns can provide insights into
the data, and are critical for statistical tests
Visualizing spatial structure

The correlogram

Often based on Moran’s I, a measure of spatial correlation
Positive
autocorrelation
Negative
autocorrelation
Making Correlograms

First goal – get x, y and z vectors
x = xFromCell(mat,1:ncell(mat))
y = yFromCell(mat,1:ncell(mat))
z = getValues(mat)
assignColors = function(z)
{
z = (z-min(z,na.rm=T))/(max(z,na.rm=T)-min(z,na.rm=T))
color = rep(NA,length(z))
index = which(!is.na(z))
color[index] = rgb(z[index],0,(1-z[index]))
return(color)
}
plot(x,y,col = assignColors(z),pch = 15,
cex = 0.2)
Pairwise distance matrix

Making a correlogram starts with a pairwise distance
matrix!


(N data points requires ~ N2 calculations)
Big data sets need to be subsetted down
> co = correlog(x,y,z,increment = 10,resamp = 0, latlon = T,na.rm=T)
Error in outer(zscal, zscal) : allocMatrix: too many elements
specified
Pairwise distance matrix

Making a correlogram starts with a pairwise distance
matrix!


(N data points requires ~ N2 calculations)
Big data sets need to be subsetted down
> co = correlog(x,y,z,increment = 100,resamp = 0, latlon =
T,na.rm=T)
Error in outer(zscal, zscal) : allocMatrix: too many elements
specified
> length(x)
[1] 83232
> length(x)^2
[1] 6927565824

sample() can help us do this
Quick comments on sample()

sample() takes a vector and draws elements out of it
> v = c("a","c","f","g","r")
> sample(v,3)
[1] "r" "f" "c"
> sample(v,3,replace = T)
[1] "c" "a" "a"
> sample(v,6,replace = F)
Error in sample(v, 6, replace = F) :
cannot take a sample larger than the population when
'replace = FALSE‘
> sample(1:20,20)
[1] 12 14 2 8 6 5 10 4 7 9 18 16 1 17 19 15 20 3 13 11
Sampling

What’s wrong with this?

co = correlog(
x[sample(1:length(x),1000,replace = F)],
y[sample(1:length(y),1000,replace = F)],
z[sample(1:length(z),1000,replace = F)],
increment = 10, resamp = 0, latlon =
T,na.rm=T)
Sampling, the right way
> index = sample(1:length(x),1000,replace = F)
> co = correlog(x[index],y[index],z[index],increment = 100,resamp =
0, latlon = T,na.rm=T)
> plot(co)
Autocorrelation significance

Assessed through random permutation tests




Reassign z values randomly to x and y coordinates
Calculate correlogram
Repeat many times
Does the observed correlogram differ from random?
> index = sample(1:length(x),1000,replace = F)
> co = correlog(x[index],y[index],z[index],increment = 100,resamp =
100, latlon = T,na.rm=T)
> plot(co)
Downside – this is slow!
Significant deviation from random
Why is spatial dependence important?

Classic statistical tests assume that data points are
independent



Can bias parameter estimates
Leads to incorrect P value estimates
A couple of methods to deal with this (next week!)


Simultaneous autoregressive models
Spatial filters
Just for fun – 3d surfaces



R can render cool 3d surfaces
Demonstrate live
rgl.surface() (package rgl)

similar documents