Using mixed-effect random forest models to capture spatial patterns: a case study on urban crime. GeoComputation 2019
Related papers
A crime prediction model based on spatial and temporal data
Periodicals of Engineering and Natural Sciences (PEN), 2018
Data has become valuable for what it enables, such as forecasting, and the fight against crime can benefit from this trend. In this work, we propose a crime prediction model based on historical data, which we prepare and transform into spatiotemporal data by crime type for use in machine learning algorithms. The model then predicts, as accurately as possible, the risk of crime at a given spatiotemporal point in the city. To keep the model general rather than tied to a specific type of crime, we describe the risk as a vector of n values, one risk per crime type.
2020
Crime prediction in urban areas can improve the allocation of resources (e.g., police patrols) towards a safer society. Recently, researchers have been using deep learning frameworks for urban crime forecasting, with better accuracy than previous work. However, these studies typically partition a metropolitan area into synthetic regions (e.g., grids), which neither respects the geographical semantics of a region nor captures the spatial correlation across regions such as precincts, neighborhoods, blocks, and postal divisions. In this paper, we design and implement an end-to-end spatiotemporal deep learning framework, dubbed CrimeForecaster, which captures both the temporal recurrence and the spatial dependency simultaneously within and across regions. We model temporal dependencies using a Gated Recurrent Network with Diffusion Convolution modules that capture the cross-region dependencies at the same time. Empirical experiments on two real-world datasets showcase the effect...
Predicting Spatial Crime Occurrences through an Efficient Ensemble-Learning Model
Preprints, 2020
While the use of crime data has been widely advocated in the literature, its availability is often limited to large cities and isolated databases that tend not to allow for spatial comparisons. This paper presents an efficient machine learning framework capable of predicting spatial crime occurrences, without using past crime as a predictor, and at a relatively high resolution: the U.S. Census Block Group level. The proposed framework is based on an in-depth multidisciplinary literature review that allowed the selection of 188 best-fit crime predictors from socioeconomic, demographic, spatial, and environmental data. Such data are published periodically for the entire United States. The predictive model was selected through a comparative study of different families of machine learning algorithms, including generalized linear models, deep learning, and ensemble learning. The gradient boosting model was found to yield the most accurate predictions for violent crimes, property crimes, motor vehicle thefts, vandalism, and the total count of crimes. Extensive experiments on real-world datasets of crimes reported in 11 U.S. cities demonstrated that the proposed framework achieves an accuracy of 73% and 77% when predicting property crimes and violent crimes, respectively.
A systematic review on spatial crime forecasting
Crime Science, 2020
Background: Predictive policing and crime analytics with a spatiotemporal focus are receiving increasing attention among a variety of scientific communities and are already being implemented as effective policing tools. The goal of this paper is to provide an overview and evaluation of the state of the art in spatial crime forecasting, focusing on study design and technical aspects. Methods: We follow the PRISMA guidelines for reporting this systematic literature review and we analyse 32 papers from 2000 to 2018 that were selected from 786 papers that entered the screening phase and a total of 193 papers that went through the eligibility phase. The eligibility phase included several criteria that were grouped into: (a) the publication type, (b) relevance to research scope, and (c) study characteristics. Results: The predominant type of forecasting inference is the hotspots (i.e. binary classification) method. Traditional machine learning methods were mostly used, followed by kernel density estimation based approaches and, less frequently, point process and deep learning approaches. The top measures of evaluation performance are the Prediction Accuracy, followed by the Prediction Accuracy Index, and the F1-Score. Finally, the most common validation approach was the train-test split, while other approaches include cross-validation, leave-one-out, and the rolling horizon. Limitations: Current studies often lack clear reporting of study experiments and feature engineering procedures, and use inconsistent terminology to address similar problems. Conclusions: There is a remarkable growth in spatial crime forecasting studies as a result of interdisciplinary technical work done by scholars of various backgrounds. These studies address the societal need to understand and combat crime as well as the law enforcement interest in almost real-time prediction.
Implications: Although we identified several opportunities and strengths, there are also some weaknesses and threats, for which we provide suggestions. Future studies should not neglect the juxtaposition of (existing) algorithms, of which the number is constantly increasing (we listed 66). To allow comparison and reproducibility of studies, we outline the need for a protocol or standardization of spatial forecasting approaches and suggest the reporting of a study's key data items.
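The Prediction Accuracy Index named in the abstract above is commonly defined as the hit rate (share of crimes falling inside predicted hotspots) divided by the share of the study area flagged as hotspot. A minimal Python sketch of that definition follows; the function and parameter names are illustrative assumptions and are not taken from the review.

```python
def prediction_accuracy_index(crimes_in_hotspots, total_crimes,
                              hotspot_area, total_area):
    """Return PAI = (n / N) / (a / A).

    n: crimes inside predicted hotspots, N: all crimes,
    a: hotspot area, A: study area (same area units for a and A).
    All names here are hypothetical, chosen for illustration.
    """
    if total_crimes <= 0 or hotspot_area <= 0 or total_area <= 0:
        raise ValueError("totals and areas must be positive")
    hit_rate = crimes_in_hotspots / total_crimes
    area_fraction = hotspot_area / total_area
    return hit_rate / area_fraction


# Example: 40 of 100 crimes fall in hotspots covering 5% of the area.
pai = prediction_accuracy_index(40, 100, 5.0, 100.0)
```

A value above 1 indicates that the hotspots capture more crime than their share of the area would suggest under a uniform distribution.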
Predicting Crime Using Spatial Features
Advances in Artificial Intelligence, 2018
Our study aims to build a machine learning model for crime prediction using geospatial features for different categories of crime. The reverse geocoding technique is applied to retrieve OpenStreetMap (OSM) spatial data. This study also proposes finding hotpoints extracted from crime hotspot areas found by Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN). A spatial distance feature is then computed based on the position of different hotpoints for various types of crime, and this value is used as a feature for classifiers. We test the engineered features on crime data from the Royal Canadian Mounted Police of Halifax, NS. We observed a significant performance improvement in crime prediction using the newly generated spatial features.
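The spatial distance feature described in the abstract above might be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name `nearest_hotpoint_distance` is hypothetical, the hotpoints are assumed to have been extracted already (for example from HDBSCAN clusters), and coordinates are assumed to be planar (projected), so Euclidean distance applies.

```python
import math


def nearest_hotpoint_distance(location, hotpoints):
    """Distance from a location to the closest hotpoint.

    location: (x, y) tuple in projected coordinates.
    hotpoints: non-empty list of (x, y) tuples for one crime type.
    """
    if not hotpoints:
        raise ValueError("at least one hotpoint is required")
    x, y = location
    return min(math.hypot(x - hx, y - hy) for hx, hy in hotpoints)


# Hypothetical hotpoints for one crime type, in projected units.
theft_hotpoints = [(3.0, 4.0), (6.0, 8.0)]
feature = nearest_hotpoint_distance((0.0, 0.0), theft_hotpoints)
```

Computing one such value per crime type gives a small feature vector that could be appended to each record before training a classifier.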
Crime Prediction Using Spatio-Temporal Data
2020
A crime is a punishable offence that is harmful to an individual and to society. Understanding the patterns of criminal activity is essential to preventing it. Research can help society prevent and solve criminal activities. Studies show that only 10 percent of offenders commit 50 percent of all offences. Enforcement teams can respond faster if they have early information and prior knowledge about criminal activity at different points in a city. In this paper, supervised learning techniques are used to predict crimes with better accuracy. The proposed system predicts crimes by analyzing a dataset that contains records of previously committed crimes and their patterns. The system rests on two main algorithms: i) decision tree, and ii) k-nearest neighbor. The Random Forest algorithm and AdaBoost are used to increase the accuracy of the prediction. Finally, oversampling is used for better accuracy. The proposed system is fed with a criminal-activity dataset of twelve years of San ...
Machine Learning Algorithms for Visualization and Prediction Modeling of Boston Crime Data
2020
Machine learning plays a key role in present-day crime detection, analysis, and prediction. The goal of this work is to propose methods for predicting crimes classified into different categories of severity. We implemented visualization and analysis of crime data statistics from recent years in the city of Boston. We then carried out a comparative study of two supervised learning algorithms, decision tree and random forest, comparing the accuracy and processing time of the models when predicting from geographical and temporal information, with the data split into training and test sets. The results show that random forest, as expected, achieves 1.54% higher accuracy than decision tree, although this comes at a cost of at least 4.37 times the processing time. The study opens doors to the application of similar supervised methods in crime data analytics and other fields of data science.
ISPRS International Journal of Geo-Information
The aim of this paper is to present developments of an advanced geospatial analytics algorithm that improves the prediction power of a random forest regression model while addressing the issue of spatial dependence commonly found in geographical data. We applied the methodology to a simple model of mean household income in the European Union regions to allow easy understanding and reproducibility of the analysis. The results are encouraging and suggest an improvement in the prediction power compared to previous techniques. The algorithm has been implemented in R and is available in the updated version of the SpatialML package in the CRAN repository.
To be or not to be? A spatial predictive crime model for Rochester
2020
This project uses a spatial model (Geographically Weighted Regression) to relate various physical and social features to crime rates. Besides yielding interesting predictions from basic data statistics, the trained model can be used to predict on the test data. The high accuracy of this prediction on test data then allows us to predict crime probabilities in different areas based on the location, the population, the property rate, the time of the day/year, and so on. This in turn suggests that an application could be built to help people traveling around Rochester know when they enter a crime-prone area.
A multi‐dimensional crime spatial pattern analysis and prediction model based on classification
ETRI Journal
This article presents a multi-dimensional spatial pattern analysis of crime events in San Francisco. Our analysis includes the impact of spatial resolution on hotspot identification, temporal effects in crime spatial patterns, and relationships between various crime categories. In this work, crime prediction is viewed as a classification problem. When predictions for a particular category are made, a binary classification-based model is framed, and when all categories are considered for analysis, a multiclass model is formulated. The proposed crime-prediction model (HotBlock) utilizes spatiotemporal analysis for predicting crime in a fixed spatial region over a period of time. It is robust under variation of model parameters. HotBlock's results are compared with baseline real-world crime datasets. It is found that the proposed model outperforms the standard DeepCrime model in most cases.