Title: | A Computationally Efficient Mean Shift Implementation |
---|---|
Description: | Performs mean shift classification using linear and k-d tree based nearest neighbor implementations for the Gaussian, Epanechnikov, and biweight product kernels. |
Authors: | Jonathan Lisic [aut, cre] |
Maintainer: | Jonathan Lisic <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.56 |
Built: | 2024-11-18 06:45:22 UTC |
Source: | CRAN |
knn_meanShift
performs a search for the k nearest neighbors of a single
point, where nearest is determined by the Mahalanobis distance. This search
is performed through a k-d tree.
knn_meanShift(points, trainData, k = min(5, NROW(trainData)), weight, leafSize = 40, maxDist = Inf)
knn_meanShift(points, trainData, k = min(5, NROW(trainData)), weight, leafSize = 40, maxDist = Inf)
points |
n vectors stored in an n by p matrix. k nearest neighbors are found for each vector. |
trainData |
A matrix or vector of potential nearest neighbors. |
k |
A scalar indicating the number neighbors to find. |
weight |
A scalar or vector of length equal to the number of columns of
|
leafSize |
A scalar used to specify the number of points to store in the leaf nodes. |
maxDist |
A vector specifying the maximum value of the Mahalanobis that will be considered. |
A list is returned containing two items: neighbors
, an n by k
matrix of k indexes for each of the n vectors in points
, corresponding to
the nearest neighbors in trainData
. value
, a matrix of scalars
containing the k distances between the neighbors found in trainData
and points
.
x <- matrix(runif(20),10,2) neighbors <- knn_meanShift(c(0,0),x)
x <- matrix(runif(20),10,2) neighbors <- knn_meanShift(c(0,0),x)
meanShift
performs classification of a set of query points using
steepest ascent to local maxima in a kernel density estimate.
meanShift(queryData, trainData = queryData, nNeighbors = NROW(trainData), algorithm = "LINEAR", kernelType = "NORMAL", bandwidth = rep(1, NCOL(trainData)), alpha = 0, iterations = 10, epsilon = 1e-08, epsilonCluster = 1e-04, parameters = NULL)
meanShift(queryData, trainData = queryData, nNeighbors = NROW(trainData), algorithm = "LINEAR", kernelType = "NORMAL", bandwidth = rep(1, NCOL(trainData)), alpha = 0, iterations = 10, epsilon = 1e-08, epsilonCluster = 1e-04, parameters = NULL)
queryData |
A matrix or vector of points to be classified by the mean shift algorithm. Values must be finite and non-missing. |
trainData |
A matrix or vector of points used to form a kernel density
estimate. The local maxima from this kernel density estimate will be used
for steepest ascent classification. If missing, |
nNeighbors |
A scalar indicating the number neighbors to consider for the kernel density estimate. This is useful to speed up approximation by approximating the kernel density estimate. The default is all data. |
algorithm |
A string indicating the algorithm to use for nearest neighbor searches. Currently, only "LINEAR" and "KDTREE" methods are supported. |
kernelType |
A string indicating the kernel associated with the kernel density estimate that the mean shift is optimizing over. The possible kernels are NORMAL, EPANECHNIKOV, and BIWEIGHT; the default is NORMAL. |
bandwidth |
A vector of length equal to the number of columns in the queryData matrix, or length one when queryData is a vector. This value will be used in the kernel density estimate for steepest ascent classification. The default is one for each dimension. |
alpha |
A scalar tuning parameter for normal kernels. When this parameter is set to zero, the mean shift algorithm will operate as usual. When this parameter is set to one, the mean shift algorithm will be approximated through Newton's Method. When set to a value between zero and one, a generalization of Newton's Method and mean shift will be used instead providing a means to balance convergence speed with stability. The default is zero, mean shift. |
iterations |
The number of iterations to perform mean shift. |
epsilon |
A scalar used to determine when to terminate the iteration of a
individual query point. If the distance between the query point at iteration |
epsilonCluster |
A scalar used to determine the minimum distance between distinct
clusters. This distance is applied after all iterations have finished and in
order of the rows of |
parameters |
A scalar or vector of paramters used by the specific algorithm. There are no optional parameters for the "LINEAR" method, "KDTREE" supports optional parameters for the maximum number of points to store in a leaf node and the maximum value for the quadratic form in the normal kernel, ignoring the constant value -0.5. |
A list is returned containing two items: assignment
, a vector of
classifications. value
, a vector or matrix containing the location of
the classified local maxima in the support, each row is associated with the
classified index in assignment
.
Cheng, Y. (1995). Mean shift, mode seeking, and clustering. IEEE transactions on pattern analysis and machine intelligence, 17(8), 790-799.
Fukunaga, K., & Hostetler, L. (1975). The estimation of the gradient of a density function, with applications in pattern recognition. IEEE transactions on information theory, 21(1), 32-40.
Lisic, J. (2015). Parcel Level Agricultural Land Cover Prediction (Doctoral dissertation, George Mason University).
x <- matrix(runif(20),10,2) classification <- meanShift(x,x) x <- matrix(runif(20),10,2) classification <- meanShift(x, algorithm="KDTREE", nNeighbor=8, parameters=c(5,7.1) )
x <- matrix(runif(20),10,2) classification <- meanShift(x,x) x <- matrix(runif(20),10,2) classification <- meanShift(x, algorithm="KDTREE", nNeighbor=8, parameters=c(5,7.1) )