Pattern Analysis | GIS 520 Portfolio

Analyzing Alarm Clustering Patterns: Spatial Pattern Analysis

Problem and Objective

Tobler’s first law of geography states that near things are more alike than things that are far apart. So, what is the chance that an observed distribution of features is the result of random chance? Spatial pattern analysis and spatial statistics describe and model the spatial distribution, patterns, processes, and relationships between features. Spatial statistics use area, length, proximity, orientation, and spatial relationships to quantify the data on the map. Although the global statistics used here do not identify where clustering is occurring, they do answer the following questions: Is there spatial clustering? How intense is the clustering? Using spatial statistics and a statistical confidence level, the null hypothesis can be rejected or retained. For pattern-analysis tools, the null hypothesis states that there is no pattern; that is, the features or their values are randomly distributed.

The objective of this exercise was to use global calculations to assess spatial patterns and to determine and characterize clustering. For example, the Fort Worth Fire Department is investigating whether false alarms in Battalion 2 exhibit clustering. If so, the department will target its safety campaign to those areas, explaining how to recognize a real emergency and avoid false alarms. The Fire Department also wants to know whether calls ranked as high priority are clustering and whether that clustering is statistically significant. Lastly, the Oleander Library is interested in the density of patrons per block and the distance at which they cluster. In this exercise, I learned to use global calculation tools and gained an understanding of spatial pattern analysis methods.

Analysis Procedure

The Spatial Statistics toolbox can be used to assess the spatial patterns and clustering of features. For this exercise, I used ArcGIS Pro 2.8 and the Analyzing Patterns toolset to analyze the patterns of false alarms, high priority calls, and library patrons. Data were provided by “GIS Tutorial 2: Spatial Analysis Workbook”. To determine whether the data showed statistically significant clustering, I used the Average Nearest Neighbor tool, the Calculate Distance Band from Neighbor Count tool, the High/Low Clustering (Getis-Ord General G) tool, the Multi-Distance Spatial Cluster Analysis tool, and the Spatial Autocorrelation (Global Moran’s I) tool, and I created a line chart of the differences in clustering.

First, I applied a query to select the false alarm incidents and then used the Average Nearest Neighbor tool to analyze the fire department data for clustering. The tool calculates a nearest neighbor index based on the average distance from each feature to its nearest neighboring feature and produces an observed mean distance, expected mean distance, nearest neighbor ratio, z-score, and p-value. I used the z-score and p-value to interpret the results.
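The calculation behind the nearest neighbor ratio can be sketched in a few lines of Python. This is a minimal illustration, not the ArcGIS implementation: it assumes planar coordinates and a user-supplied study area size, and the function name `average_nearest_neighbor` and the sample coordinates are hypothetical. The expected distance and standard error follow the commonly documented formulas for a random pattern of the same density.

```python
import math

def average_nearest_neighbor(points, area):
    """Return (observed, expected, ratio, z, p) for a list of (x, y) points.

    `area` is the size of the study area in squared coordinate units.
    """
    n = len(points)
    # Observed mean distance: average distance to each point's nearest neighbor.
    nearest = []
    for i, (xi, yi) in enumerate(points):
        nearest.append(min(math.hypot(xi - xj, yi - yj)
                           for j, (xj, yj) in enumerate(points) if j != i))
    observed = sum(nearest) / n
    # Expected mean distance for a random pattern with the same density.
    expected = 0.5 / math.sqrt(n / area)
    ratio = observed / expected  # < 1 suggests clustering, > 1 dispersion
    # Standard error and z-score of the difference.
    std_error = 0.26136 / math.sqrt(n * n / area)
    z = (observed - expected) / std_error
    # Two-tailed p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return observed, expected, ratio, z, p

# Made-up coordinates: four incidents packed into one corner of a 10 x 10 area.
incidents = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
observed, expected, ratio, z, p = average_nearest_neighbor(incidents, area=100.0)
print(f"ratio={ratio:.3f} z={z:.2f} p={p:.4f}")
```

A ratio below 1 with a strongly negative z-score and a small p-value is the combination that points toward clustering rather than a random pattern.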

Next, to look at the effect of feature values (the priority ranking of calls) on clustering, I used the Calculate Distance Band from Neighbor Count tool, which reports the minimum, average, and maximum distances at which each feature can find at least seven neighbors. I used these distances to choose the distance values for the High/Low Clustering (Getis-Ord General G) tool. I ran the General G tool several times with a range of distances until I found the highest z-score. This distance corresponded to the strongest clustering of high values (high priority calls), and I used it to interpret my results.
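The two steps described here, finding a distance band from a neighbor count and then measuring high/low clustering within that band, can be illustrated with a short Python sketch. It is a simplified stand-in for the ArcGIS tools, assuming binary fixed-distance weights and planar coordinates; the function names and sample data are hypothetical, and the sketch compares observed and expected General G rather than computing the full z-score.

```python
import math

def distance_band(points, k):
    """Min, average, and max distance from each point to its k-th nearest neighbor."""
    kth = []
    for i, (xi, yi) in enumerate(points):
        dists = sorted(math.hypot(xi - xj, yi - yj)
                       for j, (xj, yj) in enumerate(points) if j != i)
        kth.append(dists[k - 1])
    return min(kth), sum(kth) / len(kth), max(kth)

def general_g(points, values, band):
    """Observed and expected General G using binary weights within `band`."""
    n = len(points)
    numerator = denominator = 0.0
    weight_sum = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            product = values[i] * values[j]
            denominator += product  # all pairs
            d = math.hypot(points[i][0] - points[j][0], points[i][1] - points[j][1])
            if d <= band:
                numerator += product  # neighboring pairs only
                weight_sum += 1
    observed = numerator / denominator
    expected = weight_sum / (n * (n - 1))
    return observed, expected

# Made-up calls along a street with a priority score for each call.
calls = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0)]
priority = [5, 5, 5, 1, 1]
print(distance_band(calls, k=2))
for band in (1.5, 3.0, 9.5):
    print(band, general_g(calls, priority, band))
```

An observed G above the expected G suggests that high values sit near one another at that distance. Repeating the calculation over several bands mirrors the trial-and-error search for the distance with the strongest clustering.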

Then, I performed a multi-distance spatial cluster analysis to compare the count of neighboring features at several distances to a random distribution, using permutations, in order to determine whether high priority calls were randomly distributed across Battalion 2. I used the Multi-Distance Spatial Cluster Analysis (Ripley’s K Function) tool and created a confidence envelope of 10 bands. The confidence envelope provides a baseline for comparing the level of clustering and shows at what distance the most significant clustering occurs. I then computed the difference between the highest and lowest values to show the greatest margin and identified the values associated with significant clustering.
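The idea of comparing neighbor counts at several distances against random permutations can be sketched as follows. This is a simplified illustration under stated assumptions: it uses the L(d) transformation of Ripley's K, applies no edge correction, simulates random points inside a rectangular extent, and uses hypothetical function names and made-up coordinates. Under a random pattern, L(d) is expected to be close to d.

```python
import math
import random

def l_function(points, area, distances):
    """L(d) transformation of Ripley's K for each distance (no edge correction)."""
    n = len(points)
    result = []
    for d in distances:
        # Count ordered pairs of distinct points that lie within distance d.
        pairs = sum(
            1
            for i in range(n)
            for j in range(n)
            if i != j
            and math.hypot(points[i][0] - points[j][0],
                           points[i][1] - points[j][1]) <= d
        )
        result.append(math.sqrt(area * pairs / (math.pi * n * (n - 1))))
    return result

def confidence_envelope(n, extent, distances, permutations=99, seed=0):
    """Lowest and highest L(d) over random patterns of n points in `extent`."""
    xmin, ymin, xmax, ymax = extent
    area = (xmax - xmin) * (ymax - ymin)
    rng = random.Random(seed)
    low = [math.inf] * len(distances)
    high = [-math.inf] * len(distances)
    for _ in range(permutations):
        sim = [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in range(n)]
        for idx, value in enumerate(l_function(sim, area, distances)):
            low[idx] = min(low[idx], value)
            high[idx] = max(high[idx], value)
    return low, high

# Made-up calls clustered in one corner of a 10 x 10 study area.
calls = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
bands = [1.0, 2.0, 3.0]
observed = l_function(calls, area=100.0, distances=bands)
low, high = confidence_envelope(len(calls), (0, 0, 10, 10), bands, permutations=9)
for d, obs, lo_env, hi_env in zip(bands, observed, low, high):
    # Difference between observed and expected (d); larger means stronger clustering.
    print(d, round(obs - d, 3), obs > hi_env)
```

Observed values above the upper envelope indicate clustering that is unlikely under the random simulations, and charting the difference between observed and expected values by distance shows where that clustering is most pronounced.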

Lastly, I looked at the density of patrons per block to determine whether library patrons are randomly distributed. I created a density grid by overlaying a grid on the patron locations and counting the features within each cell with a spatial join. Then, I used the Spatial Autocorrelation (Global Moran’s I) tool to look for clustering of the density values. This tool measures how similar the values of neighboring features are relative to the mean of all features in the study area (the grid) and computes a z-score and p-value that indicate whether the pattern is clustered, dispersed, or random. I ran the Spatial Autocorrelation tool on the new grid layer using a range of test distances to determine which distance showed the most significant results.
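The density grid and the Moran's I calculation can be illustrated with a small Python sketch. It is a simplified stand-in for the spatial join and the Spatial Autocorrelation tool, assuming square cells, binary fixed-distance weights, and made-up data; the function names `count_per_cell` and `morans_i` are hypothetical, and the sketch reports the observed and expected index rather than the full z-score.

```python
import math

def count_per_cell(points, cell_size, columns, rows):
    """Count points falling in each cell of a grid anchored at (0, 0)."""
    counts = [[0] * columns for _ in range(rows)]
    for x, y in points:
        col, row = int(x // cell_size), int(y // cell_size)
        if 0 <= col < columns and 0 <= row < rows:
            counts[row][col] += 1
    return counts

def morans_i(centers, values, band):
    """Observed and expected Global Moran's I with binary weights within `band`."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = 0.0
    weight_sum = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = math.hypot(centers[i][0] - centers[j][0], centers[i][1] - centers[j][1])
            if d <= band:
                cross += dev[i] * dev[j]  # similar neighbors give positive products
                weight_sum += 1
    observed = (n / weight_sum) * cross / sum(d * d for d in dev)
    expected = -1.0 / (n - 1)
    return observed, expected

# Made-up patron counts for four blocks in a row, using block centers as locations.
centers = [(0, 0), (1, 0), (2, 0), (3, 0)]
print(morans_i(centers, [1, 1, 5, 5], band=1))  # similar counts side by side
print(morans_i(centers, [1, 5, 1, 5], band=1))  # alternating counts
```

An observed index well above the expected value suggests that blocks with similar patron densities sit next to each other, while a value well below it suggests an alternating, dispersed pattern.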

Results

Nearest Neighbor Index Analysis of False Alarm Calls for Battalion 2 in February 2015

General G Analysis for Priority Calls for Battalion 2 in January 2015

Multi-Distance Spatial Cluster Analysis of Calls of Service for Battalion 2 in January 2015

Density of Library Patrons per Block

Application and Reflection

The Spatial Statistics toolbox and the Analyzing Patterns toolset give GIS users the ability to study patterns in a dataset in order to assess the probability that the distribution of features occurred by random chance. With spatial statistics, GIS users can quantify how likely it is that an observed pattern is the result of a random process. For instance, the null hypothesis states that there is no pattern, and the z-score and associated confidence level can be used to determine whether clustering is statistically significant. A very high or a very low z-score indicates that it is very unlikely that the observed pattern is the product of a random distribution. In this assignment, the toolbox was used to spatially analyze clustering of false alarm and high priority calls for the Fire Department. I was able to answer questions such as: What is the probability that the distribution of these features is occurring due to random chance? And, are the features in the dataset, or the values associated with the features in the dataset, spatially clustered? This toolset can be used beyond call clustering, for example in the epidemiology of disease outbreaks.

Problem Description: During a suspected foodborne outbreak, epidemiologists could ask: What is the probability that the cases are occurring due to random chance? They could examine whether cases of foodborne illness, such as Salmonella or E. coli, are significantly clustered or display a random pattern. Significant clustering could indicate that an outbreak is occurring within a particular geographic region.
Data Needed: In North Carolina and other U.S. states, communicable diseases like Salmonella and E. coli are reportable conditions. Physicians are required to report these cases to state health departments and/or the Centers for Disease Control and Prevention (CDC) for case surveillance. For public health officials to assess whether the reported cases are clustering, the addresses of the ill individuals would need to be geocoded and displayed on the map.
Analysis Procedures: Using the geocoded case information and the Average Nearest Neighbor tool, epidemiologists could determine whether cases are significantly clustered. Using the Calculate Distance Band from Neighbor Count tool and the High/Low Clustering tool, they could determine whether cases with similar attribute values, such as occupation, are associated. To assess whether the cases are concentrated in an area, such as around the location of a corporate office, they could use the Multi-Distance Spatial Cluster Analysis tool, which can take a specific study area into account. Lastly, they could create a density grid and use the Spatial Autocorrelation (Global Moran’s I) tool to assess whether the density of cases per grid cell is clustered.