Report

CrimeStat
GROUP 10
NORINE WILCZEK & BRAD JOHNSTON
CSCI 5980
DECEMBER 4, 2012

Organization
• About CrimeStat
• CrimeStat analysis tools
• Problem & importance
• Data
• Challenges
• Tools & methods used
• Processes & map outputs
• Limitations
• Contributions to computers and society

About CrimeStat
• Analysis tool used in
  • Law enforcement
  • Public health
  • The environment
• Free download from National Institute of Justice
  • http://www.icpsr.umich.edu/CrimeStat/

CrimeStat Continued
• Spatial statistics program
• Windows based
• Purpose: provide supplemental statistical tools to aid law enforcement agencies and criminal justice researchers in crime mapping
• Uses GIS shapefiles to perform object-based analysis
  • Primary file
    • Incident locations with X,Y coordinate system
  • Secondary file
    • For comparison
  • Reference file
    • Grid overlay for measurement, used to model interaction of 2 points

CrimeStat Analysis Options
• Spatial Description
  • Spatial Distribution
    • Mean center
    • Center of minimum distance
  • Spatial Autocorrelation
    • Stats for describing amount of spatial autocorrelation between incidents
  • Distance Analysis
    • Nearest neighbor, linear nearest neighbor, or Ripley’s K statistic distance between incidents
    • Calculates distance between incidents from 2 files and places on a grid
  • Hotspot Analysis
    • Mode, fuzzy mode, hierarchical nearest neighbor clustering
    • Risk-adjusted nearest neighbor hierarchical clustering -> ellipses or convex hull output
    • Spatial and Temporal Analysis of Crime (STAC), K-means cluster, Anselin’s local Moran, Getis-Ord local G statistics -> ellipses or convex hulls

More CrimeStat Analysis Options
• Spatial Modeling
  • Interpolation
    • Single-variable kernel density
    • Dual-variable kernel density (comparing to baseline)
  • Space-time Analysis
    • Clustering in time and space
  • Journey to Crime
    • Serial offender data – likely location based on distribution of incidents and travel behavior
• Regression modeling
  • Analyzes relationship between a dependent variable and one or more independent variables
• Crime Travel Demand Models
  • Trip Generation
    • Predict number of crimes in each zone (origins) and (destinations)
  • Trip Distribution
    • 2nd stage, distributes trips from each zone to every other zone using gravity model
  • Mode Split
    • Split predicted number of trips for zone to zone using function that approximates one mode relative to other modes
  • Network Assignment
    • Shortest path algorithm predicts trips from each zone to other zone (likely path). Requires travel network (transit & one-way streets, roads, etc.)

Problem & Importance
• Problem
  • Crime occurs globally
  • Statistical analysis is necessary
  • Patterns, trends, high crime areas, potential re-offending predictions
• Importance
  • Response
  • Prevention
    • Crime
    • Injuries
    • Death
  • Utilize resources
  • Mitigation of economic losses
    • Lost/Recovered property

Data
• University of Minnesota Police Department
  • 9/2011 – 9/2012
  • September 2011 (all crimes)
  • Theft from building (9/11 – 9/12)
  • Bicycle thefts

Challenges
• Data
  • Process through a GIS
  • View results with a GIS
    • .shp, .dbf (uses and produces shapefiles, not feature classes)
• Clean up received data
  • Time/Date field
  • City, state, zip field → Google
• Proper geo-coding in ArcMap

Tools & Methods Used
• Spatial Distribution Tool
• Distance Analysis Tool
• Hotspot Analysis

Spatial Distribution Tool
• Mean & median center, center of minimum distance
• Standard deviation
  • Half of crimes in a cluster will be within one standard deviation ellipse of the mean center; around 90% will be found within two standard deviation ellipses
• Forecasting: identifying where crime is likely to occur

Result: Map

Distance Analysis Tool
• Distance Analysis 1
• Point pattern of clustering and dispersion
  • Distances between the points and reference locations as indicator (distance based tests)
  • Number of points in a given area for basis of test statistics
  • If
distance is smaller than what it would be under complete spatial randomness, it suggests clustering
  • If distance tends to be larger, then it suggests dispersion

Result: Chart

Nearest neighbor analysis:
--------------------------
Sample size........: 26
Measurement type...: Direct
Start time.........: 03:49:10 PM, 11/05/2012

Mean Nearest Neighbor Distance ..: 109.91 m, 360.58 ft, 0.06829 mi
Standard Dev of Nearest
Neighbor Distance ...............: 183.07 m, 600.61 ft, 0.11375 mi
Minimum Distance ................: 0.00 m, 0.00 ft, 0.00000 mi
Maximum Distance ................: 2611.89 m, 8569.19 ft, 1.62295 mi

Based on Bounding Rectangle:
Area ............................: 3808465.65 sq m, 40993983.06 sq ft, 1.47046 sq mi
Mean Random Distance ............: 191.36 m, 627.83 ft, 0.11891 mi
Mean Dispersed Distance .........: 411.25 m, 1349.25 ft, 0.25554 mi
Nearest Neighbor Index ..........: 0.5743
Standard Error ..................: 19.62 m, 64.36 ft, 0.01219 mi
Test Statistic (Z) ..............: -4.1523
p-value (one tail) ..............: 0.0001
p-value (two tail) ..............: 0.0001

Based on User Input Area:
Area ............................: 10.00 sq m, 107.64 sq ft, 0.00000 sq mi
Mean Random Distance ............: 0.31 m, 1.02 ft, 0.00019 mi
Mean Dispersed Distance .........: 0.67 m, 2.19 ft, 0.00041 mi
Nearest Neighbor Index ..........: 354.4364
Standard Error ..................: 0.03 m, 0.10 ft, 0.00002 mi
Test Statistic (Z) ..............: 3447.6946
p-value (one tail) ..............: 0.0001
p-value (two tail) ..............: 0.0001

        Mean Nearest            Expected Nearest        Nearest Neighbor
Order   Neighbor Distance (m)   Neighbor Distance (m)   Index
*****   *********************   *********************   ****************
1       109.9061                0.3101                  354.43636

• Nearest Neighbor Index of 0.5743 (bounding rectangle): Cluster

Result Map: Bike Thefts 2011-2012

Hotspot Analysis
• Hotspot: dense area of incidents, in this case a spatial concentration of crime
• "Geographic area representing a small percentage of the study area which contains a high percentage
of the studied phenomena"
• Spatial Description
  • Hotspot Analysis I
    • Fuzzy Mode
      • Identifies the geographic coordinates, plus a user-specified surrounding radius, with the highest number of incidents
    • Nearest Neighbor Hierarchical Spatial Clustering (Nnh)
      • Interpolation method
      • Minimum points per cluster
• Results:
  • 7 NNH clusters
  • 10 or more Building Thefts within 1500 sq. meter area
  • Calculates mean X,Y of ellipses in the output table

Mode vs. Fuzzy Mode
[Figure panels: Mode; Fuzzy Mode]

Hotspot Analysis Result Map
• Theft from Buildings
• 2011-2012

Kernel Density Estimation
• Most popular type of map in crime analysis
• Generalized over larger areas (compared to Hotspot)
• Interpolation method
• Creates “risk areas”
• Kernel size and weight determined by user, smoothed (linear relationship) throughout kernel
• Multiple points at one location, kernels aggregate to total in grid cell

Kernel Density Estimation Analysis Result Map
Theft from Buildings 2011-2012

Nearest Neighbor Hierarchical & Kernel Density Estimation
*Nearest neighbor clusters and kernel density estimation analysis overlay
*Mondale Hall & Carlson
*Coffman Mem Union
*Walter, Appleby, & Johnston Hall area
*Rec Center

Limitations
• CrimeStat uses data with latitude and longitude
  • Does not pick up “on the fly”
  • Spatial references need to match
  • Our data missing X,Y column
    • Add XY Data tool in ArcMap
    • Experimented with and added in XY data for use in CrimeStat
• Size of geographic region
  • CrimeStat useful for larger areas (than U of M campuses)
  • Clusters would show up at a city or regional level where areas have crime that is less likely to occur (including stats of bike ridership, socioeconomic conditions)

Contributions to Computer & Society
• Analysis tool for Law Enforcement, Public Health, and the Environment
• Visual analysis vs.
statistical analysis
• Benefits of CrimeStat:
  • Calculates spatial statistics, which can measure correlations between geographic variables and detect subtle changes in the geography of a pattern over time that the eye does not see
  • Law enforcement resource allocation
  • Faster response time
  • Citizen awareness

End of Presentation
Questions?
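
Supplementary note on the Distance Analysis chart: the nearest neighbor figures can be checked by hand. The sketch below assumes CrimeStat follows the standard Clark-Evans formulas (expected distance under complete spatial randomness of 0.5·sqrt(A/N), with the usual standard error); the slides do not state the formulas, so this is an assumption, although the reported values agree with it. The function name `nearest_neighbor_index` is hypothetical; the inputs are the sample size, mean nearest neighbor distance, and the two areas from the chart.

```python
import math

def nearest_neighbor_index(mean_nn_dist, n, area):
    """Clark-Evans nearest neighbor statistics from summary values.

    Assumed formulas (not stated in the slides):
      expected distance under complete spatial randomness = 0.5 * sqrt(A / N)
      standard error = sqrt((4 - pi) * A / (4 * pi * N^2))
    """
    expected = 0.5 * math.sqrt(area / n)
    std_err = math.sqrt((4 - math.pi) * area / (4 * math.pi * n ** 2))
    nni = mean_nn_dist / expected            # < 1 suggests clustering, > 1 dispersion
    z = (mean_nn_dist - expected) / std_err  # test statistic
    return expected, nni, std_err, z

# Values taken from the chart: 26 bike thefts, mean nearest neighbor distance in meters.
N = 26
MEAN_NN = 109.9061

rect = nearest_neighbor_index(MEAN_NN, N, 3808465.65)  # bounding rectangle area (sq m)
user = nearest_neighbor_index(MEAN_NN, N, 10.0)        # user input area (sq m)

for label, (expected, nni, std_err, z) in (("bounding rectangle", rect),
                                           ("user input area", user)):
    print(f"{label}: expected={expected:.2f} m, NNI={nni:.4f}, "
          f"SE={std_err:.2f} m, Z={z:.4f}")
```

Under these assumptions the index depends heavily on the area supplied: the same 26 points read as clustered with the bounding rectangle and as extremely dispersed with the 10 sq m user input area, so the area should reflect the actual study region.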