--- title: "Examples of basic uses of mgwrsar package" author: "Ghislain Geniaux" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Examples of basic uses of mgwrsar package} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` Introduction === mgwrsar package estimates linear and local linear models with spatial autocorrelation of the following forms: $$y=\beta_v(u_i,v_i) X_v+\,\epsilon_i\;\;\;\;\;\;\;\; (GWR)$$ $$y=\beta_c X_c+\beta_v(u_i,v_i) X_v+\,\epsilon_i\;\;\;\;\;\;\;\; (MGWR)$$ $$y=\lambda Wy+\beta_c X_c+\,\epsilon_i\;\;\;\;\;\;\;\; (MGWR-SAR(0,k,0))$$ $$y=\lambda Wy+\beta_v(u_i,v_i)X_v+\,\epsilon_i\;\;\;\;\;\;\;\; (MGWR-SAR(0,0,k))$$ $$y=\lambda Wy+\beta_c X_c+\beta_v(u_i,v_i)X_v+\,\epsilon_i\;\;\;\;\;\;\;\; (MGWR-SAR(0,k_c,k_v))$$ $$y=\lambda(u_i,v_i) Wy+\beta_c X_c+\,\epsilon_i\;\;\;\;\;\;\;\; (MGWR-SAR(1,k,0))$$ $$y=\lambda(u_i,v_i)Wy+\beta_v(u_i,v_i)X_v+\,\epsilon_i\;\;\;\;\;\;\;\; (MGWR-SAR(1,0,k))$$ $$y=\lambda(u_i,v_i)Wy+\beta_cX_c+\beta_v(u_i,v_i)X_v+\,\epsilon_i\;\;\;\;\;\;\;\; (MGWR-SAR(1,k_c,k_v))$$ For more details on the estimation method, see Geniaux, G. and Martinetti, D. (2017). A new method for dealing simultaneously with spatial autocorrelation and spatial heterogeneity in regression models. Regional Science and Urban Economics. (https://doi.org/10.1016/j.regsciurbeco.2017.04.001) The estimation of the previous model can be done using the MGWRSAR wrapper function which requires a dataframe and a matrix of coordinates. Note that: * When the model implies spatial autocorrelation, a row normalized spatial weight matrix must be provided. * 2SLS and Best 2SLS method can be used. When model imply local regressions, a bandwidth and a kernel type must be provided. * When the model implies mixed local regression, the name of stationary covariates must be provided. * Optimal bandwidth can be estimated using bandwidths_mgwrsar function. In addition to the ability of considering spatial autocorrelation in GWR/MGWR like models, MGWRSAR function introduces several useful technics for estimating local regression with space coordinates: * it uses RCCP and RCCPeigen code that speed up computations and allows parallel computing (doParallel package), * it allows to drop out variables with not enough local variance in local regression, which allows to consider dummies in GWR/MGWR framework without trouble. * it allows to drop out local outliers in local regression. * it allows to consider additional variable for kernel, including time (asymmetric kernel) and categorical variables (see Li and Racine 2010). Experimental version. Using mgwrsar package === The MGWRSAR function requires a dataframe and a matrix of coordinates, and eventually a spatial weight matrix if model include spatiale dependence. The package includes a simulated data example which is used for this vignette. ```{r load_data} library(mgwrsar) library(dplyr) ## loading data example data(mydata) coords=as.matrix(mydata[,c("x","y")]) ## Creating a spatial weight matrix (sparce dgCMatrix) of 8 nearest neighbors with 0 in diagonal W=kernel_matW(H=4,kernels='rectangle',coord_i=coords,NN=4,adaptive=TRUE,diagnull=TRUE,rowNorm=T) ``` Estimation ---- Estimation of a GWR with a gauss kernel without parallel computation: ```{r GWR1} ### without parallel computing with distance computation for all points ptm1<-proc.time() model_GWR0<-MGWRSAR(formula = 'Y_gwr~X1+X2+X3', data = mydata,coords=coords, fixed_vars=NULL,kernels=c('gauss'),H=0.13, Model = 'GWR',control=list(SE=T,get_ts=TRUE)) (proc.time()-ptm1)[3] summary_mgwrsar(model_GWR0) ### with parallel computing with distance computation for all points ptm1<-proc.time() model_GWR0_parallel<-MGWRSAR(formula = 'Y_gwr~X1+X2+X3', data = mydata,coords=coords, fixed_vars=NULL,kernels=c('gauss'),H=0.03, Model = 'GWR',control=list(SE=T,ncore=2,doMC=TRUE)) (proc.time()-ptm1)[3] # W<-spdep::mat2listw(W) # # gp2mM <- lagsarlm(Y_gwr ~ X1+X2+X3,data=mydata, lW, method="Matrix") # # mdo<-s2sls(Y_gwr ~ X1+X2+X3,data=mydata,Produc,W) # # mm<-lagmess(Y_gwr ~ X1+X2+X3,data=mydata, lW) ## how to speed up computation using rough kernels (0 weights for nearest neighbors > 300th position) ptm1<-proc.time() model_GWR<-MGWRSAR(formula = 'Y_gwr~X1+X2+X3', data = mydata,coords=coords, fixed_vars=NULL,kernels=c('gauss'),H=0.03, Model = 'GWR',control=list(SE=TRUE,NN=300)) (proc.time()-ptm1)[3] summary_mgwrsar(model_GWR) ## how to speed up computation using rough kernels ( 0 weights for nearest neighbors > 300th position) TP=find_TP(formula = 'Y_gwr~X1+X2+X3', data =mydata,coords=coords,K=10,type='residuals') # only 90 targets points are used length(TP) ## how to speed up computation using Target points ptm1<-proc.time() model_GWR<-MGWRSAR(formula = 'Y_gwr~X1+X2+X3', data = mydata,coords=coords, fixed_vars=NULL,kernels=c('gauss'),H=0.03, Model = 'GWR',control=list(SE=TRUE,TP=TP)) (proc.time()-ptm1)[3] ## how to speed up computation using rough kernels ( 0 weights for nearest neighbors > 300th position) and Target points ptm1<-proc.time() model_GWR<-MGWRSAR(formula = 'Y_gwr~X1+X2+X3', data = mydata,coords=coords, fixed_vars=NULL,kernels=c('gauss'),H=0.1, Model = 'GWR',control=list(SE=TRUE,NN=300,TP=TP)) (proc.time()-ptm1)[3] ## how to speed up computation using adapative kernels ptm1<-proc.time() model_GWR<-MGWRSAR(formula = 'Y_gwr~X1+X2+X3', data = mydata,coords=coords, fixed_vars=NULL,kernels=c('bisq'),H=20, Model = 'GWR',control=list(SE=TRUE,adaptive=TRUE)) (proc.time()-ptm1)[3] ## how to speed up computation using adapative kernels and Target points ptm1<-proc.time() model_GWR<-MGWRSAR(formula = 'Y_gwr~X1+X2+X3', data = mydata,coords=coords, fixed_vars=NULL,kernels=c('bisq'),H=20, Model = 'GWR',control=list(SE=TRUE,adaptive=TRUE,TP=TP)) (proc.time()-ptm1)[3] library(microbenchmark) res=microbenchmark(MGWRSAR(formula = 'Y_gwr~X1+X2+X3', data = mydata,coords=coords, fixed_vars=NULL,kernels=c('gauss'),H=0.1, Model = 'GWR',control=list(SE=TRUE)),MGWRSAR(formula = 'Y_gwr~X1+X2+X3', data = mydata,coords=coords, fixed_vars=NULL,kernels=c('gauss'),H=0.1, Model = 'GWR',control=list(SE=TRUE,NN=300,TP=TP)),times=2) res res=microbenchmark(MGWRSAR(formula = 'Y_gwr~X1+X2+X3', data = mydata,coords=coords, fixed_vars=NULL,kernels=c('gauss'),H=0.1, Model = 'GWR',control=list(SE=TRUE)),MGWRSAR(formula = 'Y_gwr~X1+X2+X3', data = mydata,coords=coords, fixed_vars=NULL,kernels=c('bisq'),H=20, Model = 'GWR',control=list(SE=TRUE,adaptive=T,TP=TP)),times=2) res ``` Summary and plot of mgwrsar object using leaflet: ```{r plot1} summary_mgwrsar(model_GWR0) plot_mgwrsar(model_GWR0,type='B_coef',var='X2') plot_mgwrsar(model_GWR0,type='t_coef',var='X2') ``` Plot of the effects of spatially varying coefficients: ```{r plot2} plot_effect(model_GWR0,title='Effects') ``` Estimation of a GWR with spgwr package: ```{r spgwr, eval=FALSE} library(spgwr) mydataSP=mydata coordinates(mydataSP)=c('x','y') ptm1<-proc.time() model_spgwr <- gwr(Y_gwr~X1+X2+X3, data=mydataSP, bandwidth=0.13,hatmatrix=TRUE) (proc.time()-ptm1)[3] head(model_spgwr$SDF@data$gwr.e) model_spgwr all(abs(model_GWR0$residuals-model_spgwr$SDF@data$gwr.e)<0.00000000001) ``` [1] -0.1585824 -0.1861144 -0.4450174 -0.1767796 -0.2014330 [6] 0.2670349 Call: gwr(formula = Y_gwr ~ X1 + X2 + X3, data = mydataSP, bandwidth = 0.13, hatmatrix = TRUE) Kernel function: gwr.Gauss Fixed bandwidth: 0.13 Summary of GWR coefficient estimates at data points: Min. 1st Qu. Median 3rd Qu. Max. Global X.Intercept. -2.63740 -1.46100 -1.08812 -0.62307 1.06100 -1.4845 X1 0.16880 0.38590 0.51409 0.59454 0.79570 0.5000 X2 -0.37306 1.52729 1.94806 2.58871 3.37755 2.5481 X3 -1.75876 -1.30704 -1.01941 -0.74386 -0.30828 -1.0762 Number of data points: 1000 Effective number of parameters (residual: 2traceS - traceS'S): 62.85371 Effective degrees of freedom (residual: 2traceS - traceS'S): 937.1463 Sigma (residual: 2traceS - traceS'S): 0.2614678 Effective number of parameters (model: traceS): 44.93259 Effective degrees of freedom (model: traceS): 955.0674 Sigma (model: traceS): 0.2590031 Sigma (ML): 0.2531174 AICc (GWR p. 61, eq 2.33; p. 96, eq. 4.21): 186.462 AIC (GWR p. 96, eq. 4.22): 135.0056 Residual sum of squares: 64.0684 Quasi-global R2: 0.9813492 [1] TRUE Estimation of a MGWR with stationary intercept using an adapative gaussian kernel (based on 20 nearest neighbors): ```{r MGWR1} model_MGWR<-MGWRSAR(formula = 'Y_mgwr~X1+X2+X3', data = mydata,coords=coords, fixed_vars='X3',kernels=c('gauss'),H=20, Model = 'MGWR',control=list(SE=T,adaptive=T)) summary_mgwrsar(model_MGWR) ``` Estimation of a mgwrsar(0,kc,kv) model with stationary intercept and stationnary spatial multiplier using an adapative gaussian kernel (based on 20 nearest neighbors): ```{r mgwrsar_0_kc_kv} mgwrsar_0_kc_kv<-MGWRSAR(formula = 'Y_mgwrsar_0_kc_kv~X1+X2+X3', data = mydata,coords=coords, fixed_vars='X2',kernels=c('gauss'),H=20, Model = 'MGWRSAR_0_kc_kv',control=list(SE=F,adaptive=T,W=W)) summary_mgwrsar(mgwrsar_0_kc_kv) ``` Estimation of a mgwrsar(0,0,kv) model with stationnary spatial multiplier using an adapative gaussian kernel (based on 20 nearest neighbors): ```{r mgwrsar_0_0_kv} mgwrsar_0_0_kv<-MGWRSAR(formula = 'Y_mgwrsar_0_kc_kv~X1+X2+X3', data = mydata,coords=coords, fixed_vars=NULL,kernels=c('gauss'),H=20, Model = 'MGWRSAR_0_0_kv',control=list(SE=F,adaptive=T,W=W)) summary_mgwrsar(mgwrsar_0_0_kv) ``` Estimation of a mgwrsar(1,0,kv) model with all parameter spatially varying using an adapative gaussian kernel (based on 20 nearest neighbors): ```{r mgwrsar_1_0_kv} mgwrsar_1_0_kv<-MGWRSAR(formula = 'Y_mgwrsar_1_0_kv~X1+X2+X3', data = mydata,coords=coords, fixed_vars=NULL,kernels=c('gauss'),H=20, Model = 'MGWRSAR_1_0_kv',control=list(SE=F,adaptive=T,W=W)) summary_mgwrsar(mgwrsar_1_0_kv) ``` Estimation of a mgwrsar(1,kc,kv) model with stationary X1 using an adapative gaussian kernel (based on 20 nearest neighbors): ```{r mgwrsar_1_kc_kv} mgwrsar_1_kc_kv<-MGWRSAR(formula = 'Y_mgwrsar_1_kc_kv~X1+X2+X3', data = mydata,coords=coords, fixed_vars='X2',kernels=c('gauss'),H=20, Model = 'MGWRSAR_1_kc_kv',control=list(SE=F,adaptive=T,W=W)) summary_mgwrsar(mgwrsar_1_kc_kv) ``` Estimation of a mgwrsar(1,kc,0) model with stationary X1 using an adapative gaussian kernel (based on 20 nearest neighbors): ```{r mgwrsar_1_kc_0} mgwrsar_1_kc_0<-MGWRSAR(formula = 'Y_mgwrsar_1_kc_kv~X1+X2+X3', data = mydata,coords=coords, fixed_vars=c('Intercept','X1','X2','X3'),kernels=c('gauss'),H=20, Model = 'MGWRSAR_1_kc_0',control=list(SE=F,adaptive=T,W=W)) summary_mgwrsar(mgwrsar_1_kc_0) ``` Prediction ---- In this example, we use an in-sample of size 800 for model estimation and an out-sample of size 200 for prediction. Note that for model with spatial autocorrelation you must provide a global weight matrix of size 1000, ordered by in-sample then out-sample locations. For GWR and MGWR_1_0_kv, where all coefficients vary in space, the predictions are carried out using the corresponding model estimate with the out-sample location as target points, so the estimated coefficients are not used: only the call and the data of the estimated model are used. For mixed model that have some constant coefficients (MGWR, MGXWR_0_0_kv, MGXWR_1_kc_kv, MGXWR_1_kc_0), the estimated coefficients are extrapolated using a weighting matrix. By default, the weighting matrix for prediction is based on the expected weights of outsample data if they were had been added to insample data to estimate the corresponding MGWRSAR (see Geniaux 2022 for further detail). The user can also choose to extrapolate the model coefficients using a shepperd kernel with custom number of neighbours or using the same kernel and bandwidth as the estimated model. Prediction of GWR model using sheppard kernel with 8 neighbors: ```{r Prediction1} length_out=800 index_in=sample(1:1000,length_out) index_out=(1:1000)[-index_in] model_GWR_insample<-MGWRSAR(formula = 'Y_gwr~X1+X2+X3', data = mydata[index_in,],coords=coords[index_in,], fixed_vars=NULL,kernels=c('gauss'),H=8, Model = 'GWR',control=list(adaptive=T)) summary_mgwrsar(model_GWR_insample) newdata=mydata[index_out,] newdata_coords=coords[index_out,] newdata$Y_mgwrsar_1_0_kv=0 Y_pred=predict_mgwrsar(model_GWR_insample, newdata=newdata, newdata_coords=newdata_coords) head(Y_pred) head(mydata$Y_gwr[index_out]) sqrt(mean((mydata$Y_gwr[index_out]-Y_pred)^2)) # RMSE model_MGWR_insample<-MGWRSAR(formula = 'Y_gwr~X1+X2+X3', data = mydata[index_in,],coords=coords[index_in,], fixed_vars='X3',kernels=c('gauss'),H=8, Model = 'MGWR',control=list(adaptive=T)) summary_mgwrsar(model_MGWR_insample) newdata=mydata[index_out,] newdata_coords=coords[index_out,] newdata$Y_mgwrsar_1_0_kv=0 ## Prediction with method_pred='tWtp' Y_pred=predict_mgwrsar(model_MGWR_insample, newdata=newdata, newdata_coords=newdata_coords) head(Y_pred) head(mydata$Y_gwr[index_out]) sqrt(mean((mydata$Y_gwr[index_out]-Y_pred)^2)) # RMSE ``` Prediction of *MGWRSAR_1_0_kv* model using sheppard kernel with 4 neighbors and Best Linear Unbiased Predictor (BLUP) : ```{r Prediction2} length_out=800 index_in=sample(1:1000,length_out) index_out=(1:1000)[-index_in] ### Global Spatial Weight matrix W should be ordered by in_sample (S) then out_sample W_in_out=kernel_matW(H=4,kernels='rectangle',coord_i=rbind(coords[index_in,],coords[index_out,]),NN=4,adaptive=TRUE,diagnull=TRUE,rowNorm=T) W_in=W_in_out[1:length(index_in),1:length(index_in)] W_in=mgwrsar::normW(W_in) newdata=mydata[index_out,] newdata_coords=coords[index_out,] ### SAR Model model_SAR_insample<-MGWRSAR(formula = 'Y_sar~X1+X2+X3', data = mydata[index_in,],coords=coords[index_in,], fixed_vars=NULL,kernels=NULL,H=11, Model = 'SAR',control=list(W=W_in)) model_SAR_insample$RMSE summary_mgwrsar(model_SAR_insample) ## without Best Linear Unbiased Predictor # prediction YTC Y_pred=predict_mgwrsar(model_SAR_insample, newdata=newdata, newdata_coords=newdata_coords,W=W_in_out,type='YTC') head(Y_pred) RMSE_YTC=sqrt(mean((mydata$Y_sar[index_out]-Y_pred)^2)) RMSE_YTC ## Using Best Linear Unbiased Predictor Y_pred=predict_mgwrsar(model_SAR_insample, newdata=newdata, newdata_coords=newdata_coords,W=W_in_out,type='BPN') head(Y_pred) RMSE_BPN=sqrt(mean((mydata$Y_sar[index_out]-Y_pred)^2)) RMSE_BPN #### MGWRSAR_1_0_kv model_MGWRSAR_1_0_kv_insample<-MGWRSAR(formula = 'Y_mgwrsar_1_0_kv~X1+X2+X3', data = mydata[index_in,],coords=coords[index_in,], fixed_vars=NULL,kernels=c('gauss'),H=11, Model = 'MGWRSAR_1_0_kv',control=list(W=W_in,adaptive=TRUE,isgcv=F)) model_MGWRSAR_1_0_kv_insample$RMSE summary_mgwrsar(model_MGWRSAR_1_0_kv_insample) # prediction with method_pred='tWtp' Y_pred=predict_mgwrsar(model_MGWRSAR_1_0_kv_insample, newdata=newdata, newdata_coords=newdata_coords,W=W_in_out,type='BPN',method_pred='tWtp') head(Y_pred) RMSE_tWtp_BPN=sqrt(mean((mydata$Y_mgwrsar_1_0_kv[index_out]-Y_pred)^2)) RMSE_tWtp_BPN ## Using Best Linear Unbiased Predictor with method_pred='model' Y_pred=predict_mgwrsar(model_MGWRSAR_1_0_kv_insample, newdata=newdata, newdata_coords=newdata_coords,W=W,type='BPN',method_pred='model' ) head(Y_pred) RMSE_model_BPN=sqrt(mean((mydata$Y_mgwrsar_1_0_kv[index_out]-Y_pred)^2)) RMSE_model_BPN ## Using Best Linear Unbiased Predictor with method_pred='sheppard' W_out=kernel_matW(H=4,kernels='rectangle',coord_i=coords[index_out,],NN=4,adaptive=TRUE,diagnull=TRUE,rowNorm=T) Y_pred=predict_mgwrsar(model_MGWRSAR_1_0_kv_insample, newdata=newdata, newdata_coords=newdata_coords,W=W,type='BPN',method_pred='sheppard',k_extra=8) head(Y_pred) RMSE_sheppard_BPN=sqrt(mean((mydata$Y_mgwrsar_1_0_kv[index_out]-Y_pred)^2)) RMSE_sheppard_BPN ``` Prediction of *MGWRSAR_1_kc_kv* model using sheppard kernel with 4 neighbors and Best Linear Unbiased Predictor (BLUP) : ```{r Prediction3} length_out=800 index_in=sample(1:1000,length_out) index_out=(1:1000)[-index_in] ### Global Spatial Weight matrix W should be ordered by in_ sample (S) then out_sample W=kernel_matW(H=4,kernels='rectangle',coord_i=rbind(coords[index_in,],coords[index_out,]),NN=4,adaptive=TRUE,diagnull=TRUE,rowNorm=T) W_in=W[index_in,index_in] W_in=mgwrsar::normW(W_in) model_MGWRSAR_1_kc_kv_insample<-MGWRSAR(formula = 'Y_mgwrsar_1_kc_kv~X1+X2+X3', data = mydata[index_in,],coords=coords[index_in,], fixed_vars='X3',kernels=c('gauss'),H=11, Model = 'MGWRSAR_1_kc_kv',control=list(W=W_in,adaptive=TRUE,isgcv=F)) model_MGWRSAR_1_kc_kv_insample$RMSE summary_mgwrsar(model_MGWRSAR_1_kc_kv_insample) ## without Best Linear Unbiased Predictor newdata=mydata[index_out,] newdata_coords=coords[index_out,] newdata$Y_mgwrsar_1_kc_kv=0 Y_pred=predict_mgwrsar(model_MGWRSAR_1_kc_kv_insample, newdata=newdata, newdata_coords=newdata_coords,W=W,type='YTC') head(Y_pred) RMSE_YTC=sqrt(mean((mydata$Y_mgwrsar_1_kc_kv[index_out]-Y_pred)^2)) RMSE_YTC ## Using Best Linear Unbiased Predictor Y_pred=predict_mgwrsar(model_MGWRSAR_1_kc_kv_insample, newdata=newdata, newdata_coords=newdata_coords,W=W,type='BPN')#,method_pred='sheppard') head(Y_pred) RMSE_BPN=sqrt(mean((mydata$Y_mgwrsar_1_kc_kv[index_out]-Y_pred)^2)) RMSE_BPN ``` Optimal bandwidths search ---- In the following example, we show how to find optimal bandwidth by hand. ```{r bandwidths_search,eval=FALSE} ###################### #### Finding bandwidth by hand ##################### model_GWR<-MGWRSAR(formula = 'Y_gwr~X1+X2+X3', data = mydata,coords=coords, fixed_vars=NULL,kernels=c('gauss'),H=0.03, Model = 'GWR',control=list(isgcv=TRUE,adaptive=FALSE)) summary_mgwrsar(model_GWR) myCV<-function(H){ model_GWR<-MGWRSAR(formula = 'Y_gwr~X1+X2+X3', data = mydata,coords=coords, fixed_vars=NULL,kernels=c('gauss'),H, Model = 'GWR',control=list(isgcv=TRUE,adaptive=FALSE)) cat('... Try h=',H,' ') CV=model_GWR$RMSE CV } resCV=optimize(myCV,lower=0.01,upper=0.4) resCV ## model with optimal bandwith with gaussian kernel model_GWR<-MGWRSAR(formula = 'Y_gwr~X1+X2+X3', data = mydata,coords=coords, fixed_vars=NULL,kernels=c('gauss'),H=resCV$minimum, Model = 'GWR',control=list(isgcv=F,adaptive=F)) summary_mgwrsar(model_GWR) ``` Call: MGWRSAR(formula = "Y_gwr~X1+X2+X3", data = mydata, coord = coord, fixed_vars = NULL, kernels = c("gauss"), H = 0.03, Model = "GWR", control = list(isgcv = TRUE, adaptive = FALSE)) Model: GWR Kernels function: gauss Kernels adaptive: NO Kernels type: GD Bandwidth: 0.03 Computation time: 0.319 Use of parallel computing: FALSE ncore = 1 Use of rough kernel: NO Use of Target Points: NO Number of data points: 1000 Varying parameters: Intercept X1 X2 X3 Intercept X1 X2 X3 Min. -4.396980 0.016334 -0.745746 -2.0461 1st Qu. -1.483364 0.350953 1.087729 -1.2907 Median -0.827288 0.543547 1.637012 -1.0072 Mean -0.827471 0.499977 1.673425 -1.0023 3rd Qu. -0.211782 0.636800 2.446022 -0.7158 Max. 2.077957 0.950236 4.189594 0.0364 Residual sum of squares: 8.096586 RMSE: 0.08998103 ... Try h= 0.1589667 ... Try h= 0.2510333 ... Try h= 0.1020665 ... Try h= 0.06690023 ... Try h= 0.04516628 ... Try h= 0.03173396 ... Try h= 0.01560834 ... Try h= 0.02557452 ... Try h= 0.03239781 ... Try h= 0.03102658 ... Try h= 0.02894408 ... Try h= 0.03069048 ... Try h= 0.0307406 ... Try h= 0.03078129 ... Try h= 0.0307406 $minimum [1] 0.0307406 $objective [1] 0.0899334 Call: MGWRSAR(formula = "Y_gwr~X1+X2+X3", data = mydata, coord = coord, fixed_vars = NULL, kernels = c("gauss"), H = resCV$minimum, Model = "GWR", control = list(isgcv = F, adaptive = F)) Model: GWR Kernels function: gauss Kernels adaptive: NO Kernels type: GD Bandwidth: 0.0307406 Computation time: 0.316 Use of parallel computing: FALSE ncore = 1 Use of rough kernel: NO Use of Target Points: NO Number of data points: 1000 Varying parameters: Intercept X1 X2 X3 Intercept X1 X2 X3 Min. -3.775736 0.017713 -0.751288 -2.0367 1st Qu. -1.490655 0.351921 1.104662 -1.2919 Median -0.830822 0.541941 1.629693 -1.0085 Mean -0.831229 0.500069 1.675354 -1.0013 3rd Qu. -0.235924 0.637217 2.450903 -0.7251 Max. 2.079699 0.941899 3.938823 0.0426 Residual sum of squares: 1.900295 RMSE: 0.04359237 mgwrsar package provides a wrapper that allows to find optimal bandwidth for a selection of model and kernel types. It is designed only for spatial kernel (type='GD'). It stores results for all models and kernels using an optimal bandwidth (leave-one out CV criteria) and provides both results with and without leave-one-out sampling. It also allows to find an optimal spatial weight matrix for a given kernel when the model includes spatial dependence. Example of automatic search of optimal bandwidth for GWR and MGWR (stationnary intercept) models using either an adaptive gaussian kernel or adapative bisquare kernel or gaussian kernel, with at least 8 neigbhors for adaptive kernel and at least a bandwidth of 0.03 for gaussian kernel: ```{r bandwidths_mgwrsar1,eval=FALSE} mytab<-bandwidths_mgwrsar(formula = 'Y_gwr~X1+X2+X3', data = mydata,coords=coords, fixed_vars=c('Intercept','X1'),Models=c('GWR','MGWR'),candidates_Kernels=c('bisq','gauss'),control=list(NN=300,adaptive=TRUE),control_search=list()) names(mytab) names(mytab[['GWR_bisq_adaptive']]) mytab[['GWR_bisq_adaptive']]$config_model mytab[['GWR_bisq_adaptive']]$CV summary(mytab[['GWR_bisq_adaptive']]$model$Betav) mybestmodel=mytab[['GWR_gauss_adaptive']]$model ``` ##### GWR ##### ##### bisq adaptive= TRUE ##### ######## Search bandwidth stage ######## ....kernel = bisq adaptive objective = 0.098356 minimum = 21 ##### gauss adaptive= TRUE ##### ######## Search bandwidth stage ######## ....kernel = gauss adaptive objective = 0.09718914 minimum = 5 ##### MGWR ##### ##### bisq adaptive= TRUE ##### ######## Search bandwidth stage ######## ..... ...try larger suppport ........ Border solution !!! Try to increase NNkernel = bisq adaptive objective = 0.4496839 minimum = 94 ##### gauss adaptive= TRUE ##### ######## Search bandwidth stage ######## ....kernel = gauss adaptive objective = 0.4523508 minimum = 16 [1] "GWR_bisq_adaptive" "GWR_gauss_adaptive" [3] "MGWR_bisq_adaptive" "MGWR_gauss_adaptive" [1] "config_model" "CV" "SSR" "model" $Model [1] "GWR" $kernels [1] "bisq" $adaptive [1] TRUE $H [1] 21 $kernel_w [1] "rectangle" $h_w NULL [1] 0.098356 Intercept X1 X2 Min. :-3.6889 Min. :0.0165 Min. :-0.8843 1st Qu.:-1.4614 1st Qu.:0.3526 1st Qu.: 1.0836 Median :-0.8098 Median :0.5425 Median : 1.6525 Mean :-0.8239 Mean :0.4992 Mean : 1.6717 3rd Qu.:-0.2008 3rd Qu.:0.6358 3rd Qu.: 2.4624 Max. : 2.1706 Max. :0.9191 Max. : 3.7134 X3 Min. :-2.12121 1st Qu.:-1.29786 Median :-1.00958 Mean :-1.00068 3rd Qu.:-0.71484 Max. :-0.03991 Example of automatic search of optimal bandwidth for MGWRSAR_0_kc_kv (stationnary intercept) model using an adaptive gaussian kernel + Automatic search of optimal spatial weight matrix for SAR part of the model using either a bisquare or adpative bisquare kernel, not run. ```{r bandwidths_mgwrsar2, eval=FALSE} mytab2<-bandwidths_mgwrsar(formula='Y_gwr~X1+X2+X3', data = mydata,coords=coords, fixed_vars=NULL,Models=c('MGWRSAR_0_0_kv'),candidates_Kernels=c('bisq'),control=list(adaptive=TRUE,NN=500),control_search=list(search_W=TRUE,kernel_w='rectangle',adaptive_W=TRUE)) names(mytab2) mytab2[['MGWRSAR_0_0_kv_bisq_adaptive']]$config_model mytab2[['MGWRSAR_0_0_kv_bisq_adaptive']]$CV mytab2[['MGWRSAR_0_0_kv_bisq_adaptive']]$model$Betac summary(mytab2[['MGWRSAR_0_0_kv_bisq_adaptive']]$model$Betav) ``` ##### MGWRSAR_0_0_kv ##### ##### bisq adaptive= TRUE ##### ######## Search W stage ######## ......... ..... W : kernel = rectangle adaptive : objective = 0.226692 minimum = 16 ######## Search bandwidth stage ######## .....kernel = bisq adaptive objective = 0.1016582 minimum = 22 [1] "MGWRSAR_0_0_kv_bisq_adaptive" $Model [1] "MGWRSAR_0_0_kv" $kernels [1] "bisq" $adaptive [1] TRUE $H [1] 22 $kernel_w [1] "rectangle" $h_w NULL [1] 0.1016582 lambda 0.0419462 Intercept X1 X2 Min. :-3.7090 Min. :0.02378 Min. :-0.9346 1st Qu.:-1.5217 1st Qu.:0.35387 1st Qu.: 1.0550 Median :-0.8644 Median :0.54467 Median : 1.5860 Mean :-0.8622 Mean :0.50028 Mean : 1.6214 3rd Qu.:-0.2208 3rd Qu.:0.63670 3rd Qu.: 2.3904 Max. : 2.1615 Max. :0.91634 Max. : 3.6352 X3 Min. :-2.12126 1st Qu.:-1.29790 Median :-1.01142 Mean :-1.00342 3rd Qu.:-0.71371 Max. :-0.05576 General kernel Product functions ---- The package provides additional functions that allow to estimate locally linear model with other dimensions than space. Using the control variable 'type', it is possible to add time in the kernel and a limited set of other variables (continuous or categorical). If several dimensions are involved in the kernel, a general product kernel is used and you need to provide a list of bandwidths and a list of kernel types. For time kernel, it uses asymetric kernel, eventually with a decay. For categorical variable, it uses the propositions of Li and Racine (2010); see also np package. Optimization of bandwidths has to be done by yourself. Note that when time or other additional variables are used for kernels, then two small bandwidths could lead to empty local subsamples. We recommend to use gauss and gauss_adapt kernels that suffering less this issue. ```{r EXPERIMENTAL, eval=TRUE} ## space + time kernel mytime_index=sort(rep(1:500,2)) mytime_index[1:150] W=kernel_matW(H=8,kernels='rectangle',coord_i=coords,NN=10,adaptive=TRUE,diagnull=TRUE,rowNorm=T) model_MGWRSART_0_kc_kv<-MGWRSAR(formula = 'Y_mgwrsar_0_kc_kv~X1+X2+X3', data = mydata,coords=coords, fixed_vars=c('Intercept'),kernels=c('gauss','gauss'),H=c(50,50), Model = 'MGWRSAR_0_kc_kv',control=list(Z=mytime_index,W=W,adaptive=c(TRUE,TRUE),Type='GDT')) summary_mgwrsar(model_MGWRSART_0_kc_kv) ### space + continuous variable kernel model_GWRX<-MGWRSAR(formula = 'Y_gwr~X1+X2+X3', data = mydata,coords=coords, fixed_vars=NULL,kernels=c('gauss','gauss'),H=c(120,120), Model = 'GWR',control=list(Z=mydata$X2,Type='GDX',adaptive=c(T,T))) summary_mgwrsar(model_GWRX) ### space + categorical kernel (Li and Racine 2010) Z=1+as.numeric(mydata$X2>quantile(mydata$X2,0.9))*2+as.numeric(mydata$X2<=quantile(mydata$X2,0.1)) table(Z) model_MGWRSARC_0_kc_kv<-MGWRSAR(formula = 'Y_mgwrsar_0_kc_kv~X1+X3', data = mydata,coords=coords, fixed_vars=c('Intercept'),kernels=c('gauss','li','li','li'),H=c(120,0.1,0.9,0.9), Model = 'MGWRSAR_0_kc_kv',control=list(Z=Z,W=W,Type='GDC',adaptive=c(T,F,F,F))) summary_mgwrsar(model_MGWRSARC_0_kc_kv) ```