Package 'GWlasso' reference manual

Title:	Geographically Weighted Lasso
Description:	Performs geographically weighted Lasso regressions. Find optimal bandwidth, fit a geographically weighted lasso or ridge regression, and make predictions. These methods are specially well suited for ecological inferences. Bandwidth selection algorithm is from A. Comber and P. Harris (2018) <doi:10.1007/s10109-018-0280-7>.
Authors:	Matthieu Mulot [aut, cre, cph] , Sophie Erb [aut]
Maintainer:	Matthieu Mulot <[email protected]>
License:	MIT + file LICENSE
Version:	1.0.1
Built:	2025-02-21 07:04:11 UTC
Source:	CRAN

Amesbury Testate Amoebae dataset

Description

Dataset from Amesbury (2016) in Development of a new pan-European testate amoeba transfer function for reconstructing peatland palaeohydrology

Usage

Amesbury
Amesbury

Format

`Amesbury`

This dataset contains the data from Amesbury (2016). In essence, it's a Testate amoebae community table (45 broad TA taxa and 1103 samples)

spe.df: A species x sites dataframe with stites as rows and species in column
WTD: A vector od Water table depth associated with each samples
coords: a dataframe containing the coordinates of each sample

Source

doi:10.1016/j.quascirev.2016.09.024

References

Matthew J. Amesbury, Graeme T. Swindles, Anatoly Bobrov, Dan J. Charman, Joseph Holden, Mariusz Lamentowicz, Gunnar Mallon, Yuri Mazei, Edward A.D. Mitchell, Richard J. Payne, Thomas P. Roland, T. Edward Turner, Barry G. Warner, Development of a new pan-European testate amoeba transfer function for reconstructing peatland palaeohydrology. Quaternary Science Reviews, vol. 152, 2016, pages 132-151. doi:10.1016/j.quascirev.2016.09.024.

Compute distance matrix

Description

compute_distance_matrix() is a small helper function to help you compute a distance matrix. For the geographically method to work, is is important that distances between points are not zero. This function allows to add a small random noise to avoid zero distances.

Usage

compute_distance_matrix(data, method = "euclidean", add.noise = FALSE)
compute_distance_matrix(data, method = "euclidean", add.noise = FALSE)

Arguments

`data`	A dataframe or matrix containing at least two numerical columns.
`method`	method to compute the distance matrix. Ultimately passed to `stats::dist()`. Can be `euclidean`, `maximum`, `manhattan`, `canberra`, `binary` or `minkowski`.
`add.noise`	TRUE/FALSE set to TRUE to add a small noise to the distance matrix. Noise $U$ is generated as $U \sim (1\times 10^{-6}, 5\times 10^{-6})$ . Noise is added only for pairs for which distance is zero.

Value

a distance matrix, usable in gwl_bw_estimation()

Examples

coords <- data.frame("Lat" = rnorm(200), "Long" = rnorm(200))
distance_matrix <- compute_distance_matrix(coords)

coords <- data.frame("Lat" = rnorm(200), "Long" = rnorm(200))
distance_matrix <- compute_distance_matrix(coords)

Bandwidth estimation for Geographically Weighted Lasso

Description

This function performs a bruteforce selection of the optimal bandwidth for the selected kernel to perform a geographically weighted lasso. The user should be aware that this function could be really long to run depending of the settings. We recommend starting with nbw = 5 and nfolds = 5 at first to ensure that the function is running properly and producing the desired output.

Usage

gwl_bw_estimation(
  x.var,
  y.var,
  dist.mat,
  adaptive = TRUE,
  adptbwd.thresh = 0.1,
  kernel = "bisquare",
  alpha = 1,
  progress = TRUE,
  nbw = 100,
  nfolds = 5
)
gwl_bw_estimation(
  x.var,
  y.var,
  dist.mat,
  adaptive = TRUE,
  adptbwd.thresh = 0.1,
  kernel = "bisquare",
  alpha = 1,
  progress = TRUE,
  nbw = 100,
  nfolds = 5
)

Arguments

`x.var`	input matrix, of dimension nobs x nvars; each row is an observation vector. `x` should have 2 or more columns.
`y.var`	response variable for the lasso
`dist.mat`	a distance matrix. can be generated by `compute_distance_matrix()`
`adaptive`	TRUE or FALSE Whether to perform an adaptive bandwidth search or not. A fixed bandwidth means that than samples are selected if they fit a determined fixed radius around a location. in a aptative bandwidth , the radius around a location varies to gather a fixed number of samples around the investigated location
`adptbwd.thresh`	the lowest fraction of samples to take into account for local regression. Must be 0 < `adptbwd.thresh` < 1
`kernel`	the geographical kernel shape to compute the weight. passed to `GWmodel::gw.weight()` Can be `gaussian`, `exponential`, `bisquare`, `tricube`, `boxcar`
`alpha`	the elasticnet mixing parameter. set 1 for lasso, 0 for ridge. see `glmnet::glmnet()`
`progress`	if TRUE, print a progress bar
`nbw`	the number of bandwidth to test
`nfolds`	the number f folds for the glmnet cross validation

Value

a gwlest object. It is a list with rmspe (the RMSPE of the model with the associated badwidth), NA (the number of NA in the dataset), bw (the optimal bandwidth), bwd.vec (the vector of tested bandwidth)

References

A. Comber and P. Harris. Geographically weighted elastic net logistic regression (2018). Journal of Geographical Systems, vol. 20, no. 4, pages 317–341. doi:10.1007/s10109-018-0280-7.

Examples


predictors <- matrix(data = rnorm(2500), 50,50)
y_value <- sample(1:1000, 50)
coords <- data.frame("Lat" = rnorm(50), "Long" = rnorm(50))
distance_matrix <- compute_distance_matrix(coords)


  myst.est <- gwl_bw_estimation(x.var = predictors, 
                                y.var = y_value,
                                dist.mat = distance_matrix,
                                adaptive = TRUE,
                                adptbwd.thresh = 0.5,
                                kernel = "bisquare",
                                alpha = 1,
                                progress = TRUE,
                                n=10,
                                nfolds = 5)
  
  
  myst.est
  

predictors <- matrix(data = rnorm(2500), 50,50)
y_value <- sample(1:1000, 50)
coords <- data.frame("Lat" = rnorm(50), "Long" = rnorm(50))
distance_matrix <- compute_distance_matrix(coords)


  myst.est <- gwl_bw_estimation(x.var = predictors, 
                                y.var = y_value,
                                dist.mat = distance_matrix,
                                adaptive = TRUE,
                                adptbwd.thresh = 0.5,
                                kernel = "bisquare",
                                alpha = 1,
                                progress = TRUE,
                                n=10,
                                nfolds = 5)
  
  
  myst.est

Fit a geographically weighted lasso with the selected bandwidth

Description

Fit a geographically weighted lasso with the selected bandwidth

Usage

gwl_fit(
  bw,
  x.var,
  y.var,
  kernel,
  dist.mat,
  alpha,
  adaptive,
  progress = TRUE,
  nfolds = 5
)
gwl_fit(
  bw,
  x.var,
  y.var,
  kernel,
  dist.mat,
  alpha,
  adaptive,
  progress = TRUE,
  nfolds = 5
)

Arguments

`bw`	Bandwidth
`x.var`	input matrix, of dimension nobs x nvars; each row is an observation vector. x should have 2 or more columns.
`y.var`	response variable for the lasso
`kernel`	the geographical kernel shape to compute the weight. passed to `GWmodel::gw.weight()` Can be `gaussian`, `exponential`, `bisquare`, `tricube`, `boxcar`
`dist.mat`	a distance matrix. can be generated by `compute_distance_matrix()`
`alpha`	the elasticnet mixing parameter. set 1 for lasso, 0 for ridge. see `glmnet::glmnet()`
`adaptive`	TRUE or FALSE Whether to perform an adaptive bandwidth search or not. A fixed bandwidth means that samples are selected if they fit a determined fixed radius around a location. In a adaptive bandwidth, the radius around a location varies to gather a fixed number of samples around the investigated location
`progress`	TRUE/FALSE whether to display a progress bar or not
`nfolds`	the number f folds for the glmnet cross validation

Value

a gwlfit object containing a fitted Geographically weighted Lasso.

Examples


predictors <- matrix(data = rnorm(2500), 50,50)
y_value <- sample(1:1000, 50)
coords <- data.frame("Lat" = rnorm(50), "Long" = rnorm(50))
distance_matrix <- compute_distance_matrix(coords)

my.gwl.fit <- gwl_fit(bw = 20,
                      x.var = predictors, 
                      y.var = y_value,
                      kernel = "bisquare",
                      dist.mat = distance_matrix, 
                      alpha = 1, 
                      adaptive = TRUE, 
                      progress = TRUE,
                      nfolds = 5)

my.gwl.fit


predictors <- matrix(data = rnorm(2500), 50,50)
y_value <- sample(1:1000, 50)
coords <- data.frame("Lat" = rnorm(50), "Long" = rnorm(50))
distance_matrix <- compute_distance_matrix(coords)

my.gwl.fit <- gwl_fit(bw = 20,
                      x.var = predictors, 
                      y.var = y_value,
                      kernel = "bisquare",
                      dist.mat = distance_matrix, 
                      alpha = 1, 
                      adaptive = TRUE, 
                      progress = TRUE,
                      nfolds = 5)

my.gwl.fit

Plot a map of beta coefficient for gwlfit object

Description

this function plots a map of the beta coefficients for a selected column (aka species). For this function to work, the coordinates supplied to gwl_fit() must be named "Lat" and "Long". The function is not bulletproof yet but is added here to reproduce the maps from the original publication.

Usage

plot_gwl_map(x, column, crs = 4326)
plot_gwl_map(x, column, crs = 4326)

Arguments

`x`	a `gwlfit` object returned by `gwl_fit()`.
`column`	the name of a variable to be plotted on the map. Must be quoted. for instance "NEB.MIN"
`crs`	the crs projection for the map (default is mercator WGS84). See `sf::st_crs()`

Value

a ggplot object

Examples


data(Amesbury)
  
distance_matrix <- compute_distance_matrix(Amesbury$coords[1:30,], add.noise = TRUE)
  
my.gwl.fit <- gwl_fit(bw= 20,
                      x.var = Amesbury$spe.df[1:30,],
                      y.var = Amesbury$WTD[1:30],
                      dist.mat = distance_matrix,
                      adaptive = TRUE,
                      kernel = "bisquare",
                      alpha = 1,
                      progress = TRUE)
                        
if(requireNamespace("maps")){
    plot_gwl_map(my.gwl.fit, column = "NEB.MIN")
  }
  
data(Amesbury)
  
distance_matrix <- compute_distance_matrix(Amesbury$coords[1:30,], add.noise = TRUE)
  
my.gwl.fit <- gwl_fit(bw= 20,
                      x.var = Amesbury$spe.df[1:30,],
                      y.var = Amesbury$WTD[1:30],
                      dist.mat = distance_matrix,
                      adaptive = TRUE,
                      kernel = "bisquare",
                      alpha = 1,
                      progress = TRUE)
                        
if(requireNamespace("maps")){
    plot_gwl_map(my.gwl.fit, column = "NEB.MIN")
  }

Plot method for gwlfit object

Description

Plot method for gwlfit object

Usage

## S3 method for class 'gwlfit'
plot(x, ...)
## S3 method for class 'gwlfit'
plot(x, ...)

Arguments

`x`	a `gwlfit` object returned by `gwl_fit()`
`...`	ellipsis for S3 method compatibility

Value

a ggplot

Examples


predictors <- matrix(data = rnorm(2500), 50,50)
y_value <- sample(1:1000, 50)
coords <- data.frame("Lat" = rnorm(50), "Long" = rnorm(50))
distance_matrix <- compute_distance_matrix(coords)

my.gwl.fit <- gwl_fit(bw = 20,
                      x.var = predictors, 
                      y.var = y_value,
                      kernel = "bisquare",
                      dist.mat = distance_matrix, 
                      alpha = 1, 
                      adaptive = TRUE, 
                      progress = TRUE,
                      nfolds = 5)

plot(my.gwl.fit)

predictors <- matrix(data = rnorm(2500), 50,50)
y_value <- sample(1:1000, 50)
coords <- data.frame("Lat" = rnorm(50), "Long" = rnorm(50))
distance_matrix <- compute_distance_matrix(coords)

my.gwl.fit <- gwl_fit(bw = 20,
                      x.var = predictors, 
                      y.var = y_value,
                      kernel = "bisquare",
                      dist.mat = distance_matrix, 
                      alpha = 1, 
                      adaptive = TRUE, 
                      progress = TRUE,
                      nfolds = 5)

plot(my.gwl.fit)

Predict method for gwlfit objects

Description

Predict method for gwlfit objects

Usage

## S3 method for class 'gwlfit'
predict(object, newdata, newcoords, type = "response", verbose = FALSE, ...)
## S3 method for class 'gwlfit'
predict(object, newdata, newcoords, type = "response", verbose = FALSE, ...)

Arguments

`object`	Object of class inheriting from "gwlfit"
`newdata`	a data.frame or matrix with the same columns as the training dataset
`newcoords`	a dataframe or matrix of coordinates of the new data
`type`	the type of response. see `glmnet::predict.glmnet()`
`verbose`	`TRUE` to print info about the execution of the function (useful for very large predictions)
`...`	ellipsis for S3 compatibility. Not used in this function.

Value

a vector of predicted values

Examples


predictors <- matrix(data = rnorm(2500), 50,50)
y_value <- sample(1:1000, 50)
coords <- data.frame("Lat" = rnorm(50), "Long" = rnorm(50))
distance_matrix <- compute_distance_matrix(coords)

my.gwl.fit <- gwl_fit(bw = 20,
                      x.var = predictors, 
                      y.var = y_value,
                      kernel = "bisquare",
                      dist.mat = distance_matrix, 
                      alpha = 1, 
                      adaptive = TRUE, 
                      progress = TRUE,
                      nfolds = 5)
                      
my.gwl.fit

new_predictors <- matrix(data = rnorm(500), 10,50)
new_coords <- data.frame("Lat" = rnorm(10), "Long" = rnorm(10))

predicted_values <- predict(my.gwl.fit,
                             newdata = new_predictors, 
                             newcoords = new_coords)

predictors <- matrix(data = rnorm(2500), 50,50)
y_value <- sample(1:1000, 50)
coords <- data.frame("Lat" = rnorm(50), "Long" = rnorm(50))
distance_matrix <- compute_distance_matrix(coords)

my.gwl.fit <- gwl_fit(bw = 20,
                      x.var = predictors, 
                      y.var = y_value,
                      kernel = "bisquare",
                      dist.mat = distance_matrix, 
                      alpha = 1, 
                      adaptive = TRUE, 
                      progress = TRUE,
                      nfolds = 5)
                      
my.gwl.fit

new_predictors <- matrix(data = rnorm(500), 10,50)
new_coords <- data.frame("Lat" = rnorm(10), "Long" = rnorm(10))

predicted_values <- predict(my.gwl.fit,
                             newdata = new_predictors, 
                             newcoords = new_coords)

Printing gwlest objects

Description

Printing gwlest objects

Usage

## S3 method for class 'gwlest'
print(x, ...)
## S3 method for class 'gwlest'
print(x, ...)

Arguments

`x`	an object of class `gwlest`
`...`	ellipsis for S3 method compatibility

Value

this function print key elements of a gwlest object

Examples


predictors <- matrix(data = rnorm(2500), 50,50)
y_value <- sample(1:1000, 50)
coords <- data.frame("Lat" = rnorm(50), "Long" = rnorm(50))
distance_matrix <- compute_distance_matrix(coords)


myst.est <- gwl_bw_estimation(x.var = predictors, 
                              y.var = y_value,
                              dist.mat = distance_matrix,
                              adaptive = TRUE,
                              adptbwd.thresh = 0.5,
                              kernel = "bisquare",
                              alpha = 1,
                              progress = TRUE,
                              n=10,
                              nfolds = 5)
  
myst.est
  

predictors <- matrix(data = rnorm(2500), 50,50)
y_value <- sample(1:1000, 50)
coords <- data.frame("Lat" = rnorm(50), "Long" = rnorm(50))
distance_matrix <- compute_distance_matrix(coords)


myst.est <- gwl_bw_estimation(x.var = predictors, 
                              y.var = y_value,
                              dist.mat = distance_matrix,
                              adaptive = TRUE,
                              adptbwd.thresh = 0.5,
                              kernel = "bisquare",
                              alpha = 1,
                              progress = TRUE,
                              n=10,
                              nfolds = 5)
  
myst.est

Printing gwlfit objects

Description

Printing gwlfit objects

Usage

## S3 method for class 'gwlfit'
print(x, ...)
## S3 method for class 'gwlfit'
print(x, ...)

Arguments

`x`	a `gwlfit` object
`...`	ellipsis for S3 method compatibility

Value

this function print key elements of a gwlfit object

Examples

predictors <- matrix(data = rnorm(2500), 50,50)
y_value <- sample(1:1000, 50)
coords <- data.frame("Lat" = rnorm(50), "Long" = rnorm(50))
distance_matrix <- compute_distance_matrix(coords)

my.gwl.fit <- gwl_fit(bw = 20,
                      x.var = predictors, 
                      y.var = y_value,
                      kernel = "bisquare",
                      dist.mat = distance_matrix, 
                      alpha = 1, 
                      adaptive = TRUE, 
                      progress = TRUE,
                      nfolds = 5)

my.gwl.fit


predictors <- matrix(data = rnorm(2500), 50,50)
y_value <- sample(1:1000, 50)
coords <- data.frame("Lat" = rnorm(50), "Long" = rnorm(50))
distance_matrix <- compute_distance_matrix(coords)

my.gwl.fit <- gwl_fit(bw = 20,
                      x.var = predictors, 
                      y.var = y_value,
                      kernel = "bisquare",
                      dist.mat = distance_matrix, 
                      alpha = 1, 
                      adaptive = TRUE, 
                      progress = TRUE,
                      nfolds = 5)

my.gwl.fit

Package 'GWlasso'

Help Index

Amesbury Testate Amoebae dataset

Description

Usage

Format

Amesbury

Source

References

Compute distance matrix

Description

Usage

Arguments

Value

Examples

Bandwidth estimation for Geographically Weighted Lasso

Description

Usage

Arguments

Value

References

Examples

Fit a geographically weighted lasso with the selected bandwidth

Description

Usage

Arguments

Value

Examples

Plot a map of beta coefficient for gwlfit object

Description

Usage

Arguments

Value

Examples

Plot method for gwlfit object

Description

Usage

Arguments

Value

Examples

Predict method for gwlfit objects

Description

Usage

Arguments

Value

Examples

Printing gwlest objects

Description

Usage

Arguments

Value

Examples

Printing gwlfit objects

Description

Usage

Arguments

Value

Examples

`Amesbury`