Title: | Bayesian Multidimensional Scaling and Choice of Dimension |
---|---|
Description: | Bayesian approach to multidimensional scaling. The package consists of implementations of the methods of Oh and Raftery (2001) <doi:10.1198/016214501753208690>. |
Authors: | Man-Suk Oh [aut, cre], Eun-Kyung Lee [aut] |
Maintainer: | Man-Suk Oh <[email protected]> |
License: | GPL (>= 2) |
Version: | 2.0 |
Built: | 2024-11-02 06:34:21 UTC |
Source: | CRAN |
Call Shiny to show the results of the bmds function, a Bayesian analysis of multidimensional scaling, in a web-based application.
bayMDSApp(out)
out |
an object of class |
open Shiny app
data(cityDIST)
out <- bmds(cityDIST, min_p = 1, max_p = 6)
if (interactive()) {
  bayMDSApp(out)
}
Provides object configurations and parameter estimates for each number of dimensions from min_p to max_p.
bmds(DIST,min_p=1, max_p=6,nwarm = 1000,niter = 5000,...)
DIST |
symmetric data matrix of dissimilarity measures for pairs of objects |
min_p |
minimum number of dimensions for object configuration (default=1) |
max_p |
maximum number of dimensions for object configuration (default=6) |
nwarm |
number of iterations for burn-in period in MCMC (default=1000) |
niter |
number of MCMC iterations after burn-in period (default=5000) |
... |
arguments to be passed to methods. |
Model
The basic model for Bayesian multidimensional scaling given in Oh and Raftery (2001) is as follows.
Given the number of dimensions $p$, we assume that an observed dissimilarity measure $d_{ij}$ follows a truncated normal distribution with mean equal to the Euclidean distance $\delta_{ij}$, i.e.,
$$d_{ij} \mid \delta_{ij}, \sigma^2 \sim N(\delta_{ij}, \sigma^2)\, I(d_{ij} > 0),$$
independently for $i > j$, $i, j = 1, \ldots, n$, where
$n$ is the number of objects, i.e., the number of rows in DIST;
$d_{ij}$ is the observed dissimilarity measure between objects i and j;
$\delta_{ij}$ is the distance between objects i and j in a p-dimensional Euclidean space, i.e., $\delta_{ij} = \sqrt{\sum_{k=1}^{p} (x_{ik} - x_{jk})^2}$;
$x_i = (x_{i1}, \ldots, x_{ip})$ denotes the values of the attributes possessed by object i, i.e., the coordinates of object i in a p-dimensional Euclidean space.
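As a rough illustration of this model (not the package's own code), the base-R sketch below simulates dissimilarities from a known configuration: each dissimilarity is drawn from a normal distribution centred at the Euclidean distance between two objects and redrawn until it is positive. The helper name `simulate_dissim` and the values chosen for the number of objects, the dimension, and `sigma` are assumptions made only for this example.

```r
# Hypothetical helper (not part of bayMDS): simulate dissimilarities
# d_ij ~ N(delta_ij, sigma^2), truncated to d_ij > 0.
simulate_dissim <- function(X, sigma) {
  delta <- as.matrix(dist(X))          # Euclidean distances delta_ij
  n <- nrow(X)
  D <- matrix(0, n, n)
  for (i in 2:n) {
    for (j in 1:(i - 1)) {
      d <- -1
      while (d <= 0) {                 # rejection step enforces d_ij > 0
        d <- rnorm(1, mean = delta[i, j], sd = sigma)
      }
      D[i, j] <- d
      D[j, i] <- d                     # keep the matrix symmetric
    }
  }
  D
}

set.seed(1)
X <- matrix(rnorm(10 * 2), nrow = 10)  # 10 objects in 2 dimensions
D <- simulate_dissim(X, sigma = 0.1)
```

The resulting `D` is a full symmetric matrix with a zero diagonal, which is the shape of input that the DIST argument describes.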
Priors
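The prior distribution of $x_i$, the coordinates of object i, is a multivariate normal distribution with mean 0 and a diagonal covariance matrix $\Lambda$, i.e., $x_i \sim N(0, \Lambda)$, independently for $i = 1, \ldots, n$. Note that the zero mean and diagonal covariance matrix are assumed because Euclidean distance is invariant under translation and rotation of the configuration $X = (x_1, \ldots, x_n)$.
The prior distribution of the error variance $\sigma^2$ is $\sigma^2 \sim IG(a, b)$, the inverse Gamma distribution with mode $b/(a+1)$.
Hyperpriors for the elements of $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_p)$ are given as $\lambda_j \sim IG(\alpha, \beta_j)$, independently for $j = 1, \ldots, p$.
We assume prior independence among $X$, $\Lambda$, and $\sigma^2$.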
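The statement about the mode of the inverse Gamma prior can be checked numerically. The sketch below assumes the shape/scale parameterization, with density proportional to $x^{-a-1} e^{-b/x}$, which is the one consistent with a mode of $b/(a+1)$; the function names and the values of `a` and `b` are chosen only for illustration and are not the package's defaults.

```r
# Assumed parameterization: IG(a, b) with shape a and scale b.
# Draws are reciprocals of Gamma(shape = a, rate = b) draws.
rinvgamma_sketch <- function(n, a, b) 1 / rgamma(n, shape = a, rate = b)

# Inverse Gamma density under the same parameterization.
dinvgamma_sketch <- function(x, a, b) {
  b^a / gamma(a) * x^(-a - 1) * exp(-b / x)
}

a <- 5
b <- 2
# Locate the maximum of the density numerically.
opt <- optimize(dinvgamma_sketch, interval = c(1e-6, 10),
                a = a, b = b, maximum = TRUE)
mode_numeric <- opt$maximum
mode_formula <- b / (a + 1)

set.seed(1)
draws <- rinvgamma_sketch(1000, a, b)  # prior draws for the error variance
```

Here `mode_numeric` and `mode_formula` should agree up to the tolerance of `optimize`.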
Measure of fit
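A measure of fit, called STRESS, is defined as
$$\mathrm{STRESS} = \sqrt{\frac{\sum_{i>j} (d_{ij} - \hat{\delta}_{ij})^2}{\sum_{i>j} d_{ij}^2}},$$
where $d_{ij}$ is the observed dissimilarity and $\hat{\delta}_{ij}$ is the Euclidean distance between objects i and j, computed from the estimated object configuration.
Note that the squared STRESS is proportional to the sum of squared residuals, $\mathrm{SSR} = \sum_{i>j} (d_{ij} - \hat{\delta}_{ij})^2$.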
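To make the measure of fit concrete, the following base-R sketch computes the sum of squared residuals and STRESS for a given configuration. It is not the package's implementation: the helper `stress_ssr` is hypothetical, and the configurations come from classical scaling with `cmdscale` rather than from the MCMC output.

```r
# Hypothetical helper (not part of bayMDS): SSR and STRESS for a
# configuration X_hat, given a full symmetric dissimilarity matrix D.
stress_ssr <- function(D, X_hat) {
  lower <- lower.tri(D)
  d <- D[lower]                               # observed d_ij, i > j
  delta_hat <- as.matrix(dist(X_hat))[lower]  # fitted Euclidean distances
  ssr <- sum((d - delta_hat)^2)
  list(SSR = ssr, STRESS = sqrt(ssr / sum(d^2)))
}

set.seed(1)
X <- matrix(rnorm(10 * 3), nrow = 10)       # 10 objects in 3 dimensions
D <- as.matrix(dist(X))                     # exact Euclidean distances
fit2 <- stress_ssr(D, cmdscale(D, k = 2))   # 2-dimensional fit
fit3 <- stress_ssr(D, cmdscale(D, k = 3))   # 3-dimensional fit
```

Because the distances in `D` are exactly Euclidean in three dimensions, the 3-dimensional fit should have STRESS near zero, while the 2-dimensional fit should be visibly worse.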
in bmds object
number of objects, i.e., number of rows in DIST
minimum number of dimensions
maximum number of dimensions
number of MCMC iterations
number of burn-in in MCMC
the following lists contain objects from bmdsMCMC for each number of dimensions from min_p to max_p
a list of object configurations
a list of minimum sum of squares of residuals between the observed dissimilarities and the estimated Euclidean distances between pairs of objects
a list of the indices of the iterations corresponding to minimum SSR
a list of STRESS values
a list of posterior means of the error variance $\sigma^2$
a list of posterior variances of the error variance $\sigma^2$
a list of posterior samples of SSR
a list of posterior samples of the elements of $\Lambda$
a list of posterior samples of $\sigma^2$, the error variance
a list of posterior samples of the $\delta_{ij}$'s, the Euclidean distances between pairs of objects
a list of object configurations from the classical multidimensional scaling of Torgerson (1952)
a list of outputs from the bmdsMCMC function for each number of dimensions
Oh, M-S., Raftery A.E. (2001). Bayesian Multidimensional Scaling and Choice of Dimension, Journal of the American Statistical Association, 96, 1031-1044.
Torgerson, W.S. (1952). Multidimensional Scaling: I. Theory and Methods, Psychometrika, 17, 401-419.
data(cityDIST)
out <- bmds(cityDIST)
Runs the MCMC algorithm given in Oh and Raftery (2001) and returns posterior samples of parameters, as well as the object configuration and other parameter estimates, for a given number of dimensions p.
bmdsMCMC(DIST,p,nwarm = 1000,niter = 5000)
DIST |
symmetric matrix of dissimilarity measures between objects |
p |
number of dimensions of object configuration |
nwarm |
number of iterations for burn-in period in MCMC (default=1000) |
niter |
number of MCMC iterations after burn-in period (default=5000) |
A list of MCMC results
n by p matrix of object configuration that minimizes the sum of squares of residuals (SSR), where n is the number of objects, i.e., n = nrow(DIST)
n by p matrix of object configuration from the classical multidimensional scaling of Torgerson (1952)
minimum of sum of squares of residuals between the observed dissimilarities and the estimated Euclidean distances for pairs of objects
index of the iteration corresponding to minimum SSR
STRESS computed from minSSR
posterior mean of the error variance $\sigma^2$
posterior variance of the error variance $\sigma^2$
niter dimensional vector of posterior samples of SSR
niter by p matrix of posterior samples of the elements of $\Lambda$
niter dimensional vector of posterior samples of $\sigma^2$
niter by n(n-1)/2 matrix of posterior samples of $\delta_{ij}$, the p-dimensional Euclidean distances between pairs of objects
Oh, M-S., Raftery A.E. (2001). Bayesian Multidimensional Scaling and Choice of Dimension, Journal of the American Statistical Association, 96, 1031-1044.
data(cityDIST)
result <- bmdsMCMC(cityDIST, p = 3)
check the type of dissimilarity matrix and convert it to a symmetric full matrix for the input of the bmdsMCMC and bmds functions
checkDIST(dist, ...)
dist |
dissimilarity measures for pairs of objects |
... |
arguments to be passed to methods |
a full matrix of dissimilarity measures
x <- matrix(rnorm(100), nrow = 5)
dist(x)
checkDIST(dist(x))
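For readers who want to see what such a conversion involves, here is a minimal base-R sketch of the same idea: accept either a `dist` object or a square matrix and return a full symmetric matrix. The helper `to_full_matrix` is hypothetical and is not the package's checkDIST; the set of input types it handles is an assumption.

```r
# Hypothetical helper (not the package's checkDIST): coerce a "dist"
# object or a square matrix to a full symmetric matrix.
to_full_matrix <- function(d) {
  if (inherits(d, "dist")) {
    return(as.matrix(d))               # expand lower triangle to full matrix
  }
  m <- as.matrix(d)
  stopifnot(nrow(m) == ncol(m))        # input must be square
  if (!isSymmetric(unname(m))) {
    stop("dissimilarity matrix must be symmetric")
  }
  m
}

x <- matrix(rnorm(100), nrow = 5)
full <- to_full_matrix(dist(x))        # 5 by 5 symmetric matrix
```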
Airline distances between 30 principal cities of the world. Cities are located on the surface of the earth, a three-dimensional sphere, and airplanes travel on the surface of the earth.
Hartigan, J.A. (1975), Clustering Algorithms, Wiley, New York.
data(cityDIST)
calculate Euclidean distances between rows of matrix X
distRcpp(X)
X |
data matrix |
distance matrix
x <- matrix(rnorm(100), nrow = 5)
distRcpp(x)
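The quantity being computed is the usual Euclidean distance between each pair of rows. The plain-R sketch below spells this out with a hypothetical reference function `row_dist`, which is not distRcpp itself, and compares the result with base R's `dist`.

```r
# Hypothetical reference implementation (not distRcpp itself):
# pairwise Euclidean distances between the rows of X.
row_dist <- function(X) {
  n <- nrow(X)
  D <- matrix(0, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      D[i, j] <- sqrt(sum((X[i, ] - X[j, ])^2))
    }
  }
  D
}

x <- matrix(rnorm(100), nrow = 5)
D <- row_dist(x)
max_diff <- max(abs(D - as.matrix(dist(x))))  # difference from base R dist()
```

The double loop is written for clarity rather than speed.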
compute and plot MDSIC, a Bayesian selection criterion given in Oh and Raftery (2001), based on the output of the function bmds
MDSIC(x, plot = TRUE, ...)
x |
an object of class |
plot |
TRUE/FALSE, if TRUE plot the number of dimensions versus MDSIC (default=TRUE) |
... |
arguments to be passed to methods |
Notes
MDSIC is calculated sequentially, so the output of the function bmds with min_p = 1 is needed to compute it.
a list of MDSIC results
MDSIC, for p =1,..,max_p
log likelihood term in MDSIC, for p=1,...,max_p
penalty term in MDSIC, for p=1,...,max_p
Oh, M-S., Raftery A.E. (2001). Bayesian Multidimensional Scaling and Choice of Dimension, Journal of the American Statistical Association, 96, 1031-1044.
data(cityDIST)
out <- bmds(cityDIST, min_p = 1, max_p = 5)
MDSIC(out)
plot Delta (estimated Euclidean distance from bmds) vs DIST (observed dissimilarity measure) for pairs of objects
plotDelDist(out)
out |
the output of the function |
plot of delta vs. dist
data(cityDIST)
result <- bmdsMCMC(cityDIST, p = 3, nwarm = 1000, niter = 2000)
plotDelDist(result)
plot object configuration in a Euclidean space of two selected dimensions
plotObj(out, ...)
out |
the output of the function |
... |
arguments to be passed to methods |
plot of object configuration
data(cityDIST)
result <- bmdsMCMC(cityDIST, p = 3, nwarm = 1000, niter = 2000)
plotObj(result)
plot trace plots of MCMC samples of parameters for visual inspection of MCMC convergence
plotTrace(out, para = c("del"), linecolor = "blue", ...)
out |
the output of the function |
para |
names of the parameters for trace plots. It should be any subvector of c("del","sigma", "lambda") (default=c("del")) |
linecolor |
line color. The default color is blue. |
... |
arguments to be passed to methods |
Notes
If "del" is in para, trace plots of the Euclidean distances for 4 randomly selected pairs of objects will be given.
If "lambda" is in para, trace plots of the first four elements of Lambda, the diagonal prior variance of objects, will be given.
If "sigma" is in para, a trace plot and an ACF (autocorrelation function) plot of sigma, the error variance, will be given.
trace plots of delta, sigma and lambda
data(cityDIST)
result <- bmdsMCMC(cityDIST, p = 3, nwarm = 1000, niter = 2000)
plotTrace(result, para = c("del", "sigma", "lambda"))