Fit random forest spatial residual models for point-referenced data (i.e., geostatistical models), using a random forest to fit the mean and a spatial linear model to fit the residuals. The spatial linear model fit to the residuals can incorporate a variety of estimation methods, allowing for random effects, anisotropy, partition factors, and big data methods.
Arguments
- formula
A two-sided linear formula describing the fixed effect structure of the model, with the response to the left of the ~ operator and the terms on the right, separated by + operators.
- data
A data frame or sf object that contains the variables in fixed, random, and partition_factor, as well as geographical information. If an sf object is provided with POINT geometries, the x-coordinates and y-coordinates are used directly. If an sf object is provided with POLYGON geometries, the x-coordinates and y-coordinates are taken as the centroids of each polygon.
- ...
Additional named arguments to ranger::ranger() or splm().
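As a minimal sketch of how the ... argument can be used, the call below supplies one argument intended for ranger::ranger() (num.trees) and one intended for splm() (spcov_type). The noise predictor var and the object name mod_args are illustrative only, and the value 250 is an arbitrary choice.

```r
library(spmodel)

set.seed(1)
sulfate$var <- rnorm(NROW(sulfate)) # illustrative noise predictor, not real data

# num.trees is a ranger::ranger() argument; spcov_type is an splm() argument.
# Both are supplied through `...`.
mod_args <- splmRF(
  sulfate ~ var,
  data = sulfate,
  spcov_type = "exponential",
  num.trees = 250
)

# The random forest fit is stored in the element named ranger
mod_args$ranger$num.trees
```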
Value
A list with several elements to be used with predict(). These elements include the function call (named call), the random forest object fit to the mean (named ranger), the spatial linear model object fit to the residuals (named splm or splm_list), and an object that can contain data for locations at which to predict (named newdata). The newdata object contains the set of observations in data whose response variable is NA.
If spcov_type or spcov_initial (which are passed to splm()) are length one, the list has class splmRF and the spatial linear model object fit to the residuals is called splm, which has class splm. If spcov_type or spcov_initial are length greater than one, the list has class splmRF_list and the spatial linear model object fit to the residuals is called splm_list, which has class splm_list and contains several objects, each with class splm.
An splmRF object to be used with predict(). There are three elements: ranger, the output from fitting the mean model with ranger::ranger(); splm, the output from fitting the spatial linear model to the ranger residuals; and newdata, the newdata object, if relevant.
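The class behavior described above can be checked with a short sketch. It fits one model with a single spcov_type and one with two, then inspects the classes. The predictor var and the object names mod_one and mod_many are illustrative; the choice of "spherical" as the second covariance type is an assumption made only for the example.

```r
library(spmodel)

set.seed(1)
sulfate$var <- rnorm(NROW(sulfate)) # illustrative noise predictor

# One covariance type: the text above describes class splmRF with an element splm
mod_one <- splmRF(sulfate ~ var, data = sulfate, spcov_type = "exponential")
class(mod_one)
names(mod_one)
class(mod_one$splm)

# Two covariance types: the text above describes class splmRF_list
# with an element splm_list
mod_many <- splmRF(
  sulfate ~ var,
  data = sulfate,
  spcov_type = c("exponential", "spherical")
)
class(mod_many)
class(mod_many$splm_list)
```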
Details
The random forest residual spatial linear model is described by Fox et al. (2020). A random forest model is fit to the mean portion of the model specified by formula using ranger::ranger(). Residuals are computed and used as the response variable in an intercept-only spatial linear model fit using splm(). This model object is intended for use with predict() to perform prediction, also called random forest regression Kriging.
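The steps above can be sketched by hand. This is an illustration of the idea and not the internal code of splmRF(): in particular, computing residuals from ranger's out-of-bag predictions (rf$predictions) is an assumption, and the names train_df, rf, rf_resid, resid_mod, and manual_pred are hypothetical.

```r
library(spmodel)

set.seed(1)
sulfate$var <- rnorm(NROW(sulfate))             # illustrative noise predictor
sulfate_preds$var <- rnorm(NROW(sulfate_preds)) # same predictor at new locations

# Step 1: random forest for the mean (geometry dropped so ranger sees a plain data frame)
train_df <- sf::st_drop_geometry(sulfate)
rf <- ranger::ranger(sulfate ~ var, data = train_df)

# Step 2: residuals, here from out-of-bag predictions (an assumption of this sketch)
sulfate$rf_resid <- train_df$sulfate - rf$predictions

# Step 3: intercept-only spatial linear model fit to the residuals
resid_mod <- splm(rf_resid ~ 1, data = sulfate, spcov_type = "exponential")

# Step 4: prediction = random forest prediction + predicted spatial residual
new_df <- sf::st_drop_geometry(sulfate_preds)
rf_pred <- predict(rf, data = new_df)$predictions
resid_pred <- predict(resid_mod, newdata = sulfate_preds)
manual_pred <- rf_pred + resid_pred
head(manual_pred)
```

Because random forests are randomized, these values are not expected to match a separate splmRF() fit exactly.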
Note
This function does not perform any internal scaling. If optimization is not stable due to extremely large variances, scale relevant variables so they have variance 1 before optimization.
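A minimal sketch of scaling a variable to variance 1 in base R follows. The data frame dat and its column y are hypothetical stand-ins for a response measured on a very large scale.

```r
# Hypothetical response with a very large variance
set.seed(1)
dat <- data.frame(y = rnorm(50, mean = 5e6, sd = 2e5))

# Dividing by the sample standard deviation gives a sample variance of 1
y_sd <- sd(dat$y)
dat$y_scaled <- dat$y / y_sd
var(dat$y_scaled)
```

If a model is then fit to y_scaled, predictions are on the scaled axis and can be multiplied by y_sd to return to the original units.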
References
Fox, E. W., Ver Hoef, J. M., & Olsen, A. R. (2020). Comparing spatial regression to random forests for large environmental data sets. PLoS ONE, 15(3), e0229509.
Examples
# \donttest{
sulfate$var <- rnorm(NROW(sulfate)) # add noise variable
sulfate_preds$var <- rnorm(NROW(sulfate_preds)) # add noise variable
sprfmod <- splmRF(sulfate ~ var, data = sulfate, spcov_type = "exponential")
predict(sprfmod, sulfate_preds)
#>          1          2          3          4          5          6          7
#> -6.1377597 29.1450999 17.3945298 12.1213007  3.1330548 28.3866757 11.9909603
#>          8          9         10         11         12         13         14
#> 16.8613352  8.0772730 13.1348366  4.9551673  7.4368405  3.7412313 24.4331860
#>         15         16         17         18         19         20         21
#> 22.8587823 18.6347690  6.8537657 22.6341678  2.6567049 17.9787242  7.0921645
#>         22         23         24         25         26         27         28
#> 13.2062559  3.1248220 11.0909425 -0.7156536  0.7323568 18.8569259  9.7246767
#>         29         30         31         32         33         34         35
#> 10.9095830  8.5869304  4.9370130 15.5692797 11.8491397  8.4556058  1.4987440
#>         36         37         38         39         40         41         42
#> -1.8432463 17.1604659 -2.2961210 22.5938649  3.8583227  8.2193919 23.8198465
#>         43         44         45         46         47         48         49
#> 19.6039883 19.4442878  5.8496071 12.3723358 16.1082116 16.6702312 31.5937004
#>         50         51         52         53         54         55         56
#>  5.3874857 24.0327442 17.9523875  5.3172113  2.2020089  4.1187855 13.5328991
#>         57         58         59         60         61         62         63
#> 17.7205620  3.6965307 20.9015211  5.3679850 13.7801110 27.6225063 20.8816156
#>         64         65         66         67         68         69         70
#> 14.9466737 22.9618088 -5.6887319  7.3171630  9.0156371 -5.4236599 21.7028487
#>         71         72         73         74         75         76         77
#> 15.0909412 30.2295131  9.0684168  3.4210411  4.0409450 -0.8868351 14.8712024
#>         78         79         80         81         82         83         84
#> 10.2057287  5.1052243 13.0483564 20.4023510  0.7128193  3.6507060 11.1887415
#>         85         86         87         88         89         90         91
#>  0.1725768  1.4584439 10.3566991  4.2012282 13.8843085 22.6408813 15.7617740
#>         92         93         94         95         96         97         98
#>  3.7889121 -0.8879800 30.9332639 19.4692391 22.3287174 16.5818020 21.3861348
#>         99        100
#> 27.1407344 14.0203087
# }