bmds {bayMDS} | R Documentation |
Provides object configurations and parameter estimates for each number of dimensions from min_p to max_p.
bmds(DIST, min_p = 1, max_p = 6, nwarm = 1000, niter = 5000, ...)
DIST | symmetric data matrix of dissimilarity measures for pairs of objects
min_p | minimum number of dimensions for object configuration (default=1)
max_p | maximum number of dimensions for object configuration (default=6)
nwarm | number of iterations for burn-in period in MCMC (default=1000)
niter | number of MCMC iterations after burn-in period (default=5000)
... | arguments to be passed to methods
Model
The basic model for Bayesian multidimensional scaling, given in Oh and Raftery (2001), is as follows.
Given the number of dimensions p, we assume that an observed dissimilarity measure follows a truncated normal distribution with mean equal to the Euclidean distance, i.e.,
d_{ij} \sim N(\delta_{ij}, \sigma^2) I(d_{ij} > 0),
independently for i \ne j, i,j = 1, \cdots, n, where
n is the number of objects, i.e., the number of rows in DIST;
d_{ij} is an observed dissimilarity measure between objects i and j;
\delta_{ij} is the distance between objects i and j in a p-dimensional Euclidean space, i.e.,
\delta_{ij} = \sqrt{ \sum_{k=1}^p (x_{ik}-x_{jk})^2 };
x_i = (x_{i1}, ..., x_{ip}) denotes the values of the attributes possessed by object i, i.e., the coordinates of object i in a p-dimensional Euclidean space.
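As an illustration of the model, the sketch below simulates dissimilarities for a small made-up configuration using base R only. The values of n, p, and sigma and the helper rtnorm_pos are assumptions chosen for the example; they are not part of the package, and the rejection sampler is just one simple way to draw from a normal distribution truncated to positive values.

```r
# Sketch: simulate d_ij ~ N(delta_ij, sigma^2) I(d_ij > 0) for a toy configuration.
set.seed(1)
n <- 5; p <- 2; sigma <- 0.3          # illustrative values (assumptions)
X <- matrix(rnorm(n * p), n, p)       # row i holds the coordinates x_i
delta <- as.matrix(dist(X))           # delta_ij: Euclidean distances between rows

# Hypothetical helper: draw one value from N(mean, sd^2) truncated to (0, Inf)
# by rejection sampling.
rtnorm_pos <- function(mean, sd) {
  repeat {
    d <- rnorm(1, mean, sd)
    if (d > 0) return(d)
  }
}

# Fill a symmetric dissimilarity matrix with independent draws for each pair i > j.
D <- matrix(0, n, n)
for (i in 2:n) for (j in 1:(i - 1)) {
  D[i, j] <- D[j, i] <- rtnorm_pos(delta[i, j], sigma)
}
```

The resulting D is symmetric with a zero diagonal, which is the shape of input described for the DIST argument.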
Priors
The prior distribution of x_i is a multivariate normal distribution with mean 0 and a diagonal covariance matrix \Lambda, i.e., x_i \sim N(0, \Lambda), independently for i = 1, \cdots, n. Note that the zero mean and diagonal covariance matrix are assumed because Euclidean distance is invariant under translation and rotation of X = \{x_i\}.
The prior distribution of the error variance \sigma^2 is \sigma^2 \sim IG(a, b), the inverse Gamma distribution with mode b/(a+1).
Hyperpriors for the elements of \Lambda = diag(\lambda_1, ..., \lambda_p) are given as \lambda_j \sim IG(\alpha, \beta_j), independently for j = 1, \cdots, p.
We assume prior independence among X, \Lambda, and \sigma^2.
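The prior structure can be sketched with base R by drawing one value from each prior. The hyperparameter values and the helper rinvgamma_sketch are assumptions made up for this illustration and are not the defaults used by the package. The last lines check numerically that the IG(a, b) density, which is proportional to s^{-(a+1)} exp(-b/s), peaks at b/(a+1).

```r
# Sketch: one draw from each prior, under made-up hyperparameters.
set.seed(2)
n <- 5; p <- 2
a <- 5; b <- 2                    # hypothetical hyperparameters for sigma^2
alpha <- 0.5; beta <- c(1, 1)     # hypothetical hyperparameters for lambda_j

# Hypothetical helper: if G ~ Gamma(shape, rate = scale), then 1/G ~ IG(shape, scale).
rinvgamma_sketch <- function(m, shape, scale) {
  1 / rgamma(m, shape = shape, rate = scale)
}

sigma2 <- rinvgamma_sketch(1, a, b)          # sigma^2 ~ IG(a, b)
lambda <- rinvgamma_sketch(p, alpha, beta)   # lambda_j ~ IG(alpha, beta_j)

# x_i ~ N(0, Lambda) with diagonal Lambda: column j is N(0, lambda_j), independently.
X <- sapply(lambda, function(l) rnorm(n, 0, sqrt(l)))

# Numerical check of the mode of IG(a, b) using its log-density up to a constant.
log_dens <- function(s) -(a + 1) * log(s) - b / s
mode_numeric <- optimize(log_dens, interval = c(1e-6, 10), maximum = TRUE)$maximum
```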
Measure of fit
A measure of fit, called STRESS, is defined as
STRESS = \sqrt{ \frac{\sum_{i > j} (d_{ij}-\hat{\delta}_{ij})^2}{\sum_{i > j} d_{ij}^2} },
where \hat{\delta}_{ij} is the Euclidean distance between objects i and j, computed from the estimated object configuration.
Note that the squared STRESS is proportional to the sum of squared residuals, SSR = \sum_{i > j} (d_{ij}-\hat{\delta}_{ij})^2.
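The STRESS and SSR formulas can be computed directly from a dissimilarity matrix and a configuration. The function stress_fit below is a hypothetical helper written for this illustration in base R; it is not a function of the package and may differ from how the package computes these quantities internally.

```r
# Sketch: SSR and STRESS for a configuration X against dissimilarities D.
stress_fit <- function(D, X) {
  D <- as.matrix(D)                   # accepts a matrix or a "dist" object
  delta_hat <- as.matrix(dist(X))     # Euclidean distances from the configuration
  low <- lower.tri(D)                 # selects each pair once (i > j)
  ssr <- sum((D[low] - delta_hat[low])^2)
  list(SSR = ssr, STRESS = sqrt(ssr / sum(D[low]^2)))
}

# Toy check: three points forming a 3-4-5 right triangle.
X <- cbind(c(0, 3, 0), c(0, 0, 4))
fit_exact  <- stress_fit(dist(X), X)       # configuration reproduces D exactly
fit_double <- stress_fit(dist(X), 2 * X)   # all fitted distances are doubled
```

In the first case the residuals vanish, so both SSR and STRESS are zero. In the second, every residual equals the corresponding d_{ij} in absolute value, so the ratio under the square root is 1.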
in bmds object
number of objects, i.e., number of rows in DIST
minimum number of dimensions
maximum number of dimensions
number of MCMC iterations
number of burn-in in MCMC
the following lists contain objects from bmdsMCMC for each number of dimensions from min_p to max_p
a list of object configurations
a list of minimum sum of squares of residuals between the observed dissimilarities and the estimated Euclidean distances between pairs of objects
a list of the indices of the iterations corresponding to the minimum SSR
a list of STRESS values
a list of posterior mean of \sigma^2
a list of posterior variance of \sigma^2
a list of posterior samples of SSR
a list of posterior samples of elements of \Lambda
a list of posterior samples of \sigma^2, the error variance
a list of posterior samples of the \delta_{ij} (Euclidean distances between pairs of objects)
a list of object configurations from the classical multidimensional scaling of Torgerson (1952)
a list of outputs from the bmdsMCMC function for each number of dimensions
Oh, M-S., Raftery, A.E. (2001). Bayesian Multidimensional Scaling and Choice of Dimension, Journal of the American Statistical Association, 96, 1031-1044.
Torgerson, W.S. (1952). Multidimensional Scaling: I. Theory and Methods, Psychometrika, 17, 401-419.
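library(bayMDS)        # package providing bmds() and the cityDIST data
data(cityDIST)         # example dissimilarity matrix
out <- bmds(cityDIST)  # defaults: min_p = 1, max_p = 6, nwarm = 1000, niter = 5000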
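For comparison with the classical multidimensional scaling of Torgerson (1952) mentioned among the returned values, the following sketch computes classical configurations for several dimensions using stats::cmdscale. It uses the built-in eurodist data as a stand-in and does not call bmds; whether the package obtains its classical configurations in exactly this way is an assumption.

```r
# Sketch: classical MDS configurations for p = 1, 2, 3 using base R.
# eurodist is a built-in "dist" object of road distances between 21 European cities.
dims <- 1:3
cmds_list <- lapply(dims, function(p) cmdscale(eurodist, k = p))

# Each element is an n x p matrix of coordinates, one row per object.
config_dims <- sapply(cmds_list, dim)
```

Each column of config_dims gives the number of rows and columns of the configuration for one value of p.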