Spatial Heterogeneity Modeling Using Machine Learning Based on a Hybrid of Random Forest and Convolutional Neural Network (CNN)

Barry, Amadou Kindy and Gichuhi, Anthony Waititu and Nderu, Lawrence (2024) Spatial Heterogeneity Modeling Using Machine Learning Based on a Hybrid of Random Forest and Convolutional Neural Network (CNN). Journal of Data Analysis and Information Processing, 12 (03). pp. 319-347. ISSN 2327-7211

Abstract

Spatial heterogeneity refers to variation in characteristics or features across different locations or areas in space. Spatial data, also known as geo-spatial data or geographic information, is information that explicitly or indirectly belongs to a particular geographic region or location. Focusing on spatial heterogeneity, we present a hybrid machine learning model combining two competitive algorithms: the Random Forest (RF) Regressor and the Convolutional Neural Network (CNN). The model is fine-tuned using cross-validation for hyper-parameter adjustment and performance evaluation, ensuring robustness and generalization. Our approach integrates global Moran’s I for examining global spatial autocorrelation and local Moran’s I for assessing local spatial autocorrelation in the residuals. To validate our approach, we implemented the hybrid model on a real-world dataset and compared its performance with that of traditional machine learning models. The hybrid achieved an R-squared of 0.90, outperforming RF (0.84) and CNN (0.74). This study contributes to a detailed understanding of spatial variations in data by taking into account the geographical information (longitude and latitude) present in the dataset. Our results, also assessed using the Root Mean Squared Error (RMSE), indicated that the hybrid yielded lower errors, with its RMSE deviating by 53.65% from that of the RF model and by 63.24% from that of the CNN model. Additionally, the global Moran’s I index was observed to be 0.10. This study underscores that the hybrid was able to correctly predict house prices both in clusters and in dispersed areas.
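As an illustration of the global Moran’s I statistic mentioned in the abstract, the following is a minimal sketch in Python, not the authors' implementation. It computes I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where S0 is the sum of all weights. The function names, the binary distance-band weighting scheme, and the use of plain Euclidean distance on coordinate pairs are assumptions made for the example; the abstract does not state which spatial weights the study used.

```python
import math


def distance_band_weights(coords, threshold):
    """Hypothetical helper: binary spatial weights.

    w[i][j] is 1.0 when points i and j are distinct and lie within
    `threshold` of each other (plain Euclidean distance, an assumption
    made to keep the sketch simple), otherwise 0.0.
    """
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= threshold:
                w[i][j] = 1.0
    return w


def global_morans_i(values, w):
    """Global Moran's I for `values` (e.g. model residuals) under weights `w`."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    s0 = sum(sum(row) for row in w)   # total of all spatial weights
    den = sum(d * d for d in dev)     # sum of squared deviations
    if s0 == 0 or den == 0:
        raise ValueError("Moran's I is undefined: no neighbours or constant values")

    # Weighted cross-products of deviations between neighbouring locations
    num = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return n * num / (s0 * den)


# Four points on a line, each a neighbour of the adjacent point(s)
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
w = distance_band_weights(coords, threshold=1.0)

clustered = global_morans_i([1.0, 1.0, -1.0, -1.0], w)    # similar values adjacent
alternating = global_morans_i([1.0, -1.0, 1.0, -1.0], w)  # dissimilar values adjacent
```

In this toy example the clustered pattern gives a positive index and the alternating pattern a negative one. Applied to model residuals, a value near zero suggests that little spatial structure remains unexplained.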

Item Type: Article
Subjects: Eprint Open STM Press > Computer Science
Depositing User: Unnamed user with email admin@eprint.openstmpress.com
Date Deposited: 18 Jun 2024 12:42
Last Modified: 18 Jun 2024 12:42
URI: http://library.go4manusub.com/id/eprint/2211
