Title: | Tools for Computational Optimal Transport |
---|---|
Description: | Transport theory has seen much success in many fields of statistics and machine learning. We provide a variety of algorithms to compute Wasserstein distance, barycenter, and others. See Peyré and Cuturi (2019) <doi:10.1561/2200000073> for the general exposition to the study of computational optimal transport. |
Authors: | Kisung You [aut, cre] |
Maintainer: | Kisung You <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.2 |
Built: | 2024-12-11 07:29:07 UTC |
Source: | CRAN |
Given empirical measures
of possibly different cardinalities,
wasserstein barycenter
is the solution to the following problem
where 's are relative weights of empirical measures. Here we assume
either (1) support atoms in Euclidean space are given, or (2) all pairwise distances between
atoms of the fixed support and empirical measures are given.
Algorithmically, it is a subgradient method where the each subgradient is
approximated using the entropic regularization.
bary14C( support, atoms, marginals = NULL, weights = NULL, lambda = 0.1, p = 2, ... ) bary14Cdist( distances, marginals = NULL, weights = NULL, lambda = 0.1, p = 2, ... )
bary14C( support, atoms, marginals = NULL, weights = NULL, lambda = 0.1, p = 2, ... ) bary14Cdist( distances, marginals = NULL, weights = NULL, lambda = 0.1, p = 2, ... )
support |
an |
atoms |
a length- |
marginals |
marginal distribution for empirical measures; if |
weights |
weights for each individual measure; if |
lambda |
regularization parameter (default: 0.1). |
p |
an exponent for the order of the distance (default: 2). |
... |
extra parameters including
|
distances |
a length- |
a length- vector of probability vector.
Cuturi M, Doucet A (2014). “Fast computation of wasserstein barycenters.” In Xing EP, Jebara T (eds.), Proceedings of the 31st international conference on international conference on machine learning - volume 32, volume 32 of Proceedings of machine learning research, 685–693.
#------------------------------------------------------------------- # Wasserstein Barycenter for Fixed Atoms with Two Gaussians # # * class 1 : samples from Gaussian with mean=(-4, -4) # * class 2 : samples from Gaussian with mean=(+4, +4) # * target support consists of 7 integer points from -6 to 6, # where ideally, weight is concentrated near 0 since it's average! #------------------------------------------------------------------- ## GENERATE DATA # Empirical Measures set.seed(100) ndat = 100 dat1 = matrix(rnorm(ndat*2, mean=-4, sd=0.5),ncol=2) dat2 = matrix(rnorm(ndat*2, mean=+4, sd=0.5),ncol=2) myatoms = list() myatoms[[1]] = dat1 myatoms[[2]] = dat2 mydata = rbind(dat1, dat2) # Fixed Support support = cbind(seq(from=-8,to=8,by=2), seq(from=-8,to=8,by=2)) ## COMPUTE comp1 = bary14C(support, myatoms, lambda=0.5, maxiter=10) comp2 = bary14C(support, myatoms, lambda=1, maxiter=10) comp3 = bary14C(support, myatoms, lambda=5, maxiter=10) ## VISUALIZE opar <- par(no.readonly=TRUE) par(mfrow=c(1,3)) barplot(comp1, main="lambda=0.5") barplot(comp2, main="lambda=1") barplot(comp3, main="lambda=5") par(opar)
#------------------------------------------------------------------- # Wasserstein Barycenter for Fixed Atoms with Two Gaussians # # * class 1 : samples from Gaussian with mean=(-4, -4) # * class 2 : samples from Gaussian with mean=(+4, +4) # * target support consists of 7 integer points from -6 to 6, # where ideally, weight is concentrated near 0 since it's average! #------------------------------------------------------------------- ## GENERATE DATA # Empirical Measures set.seed(100) ndat = 100 dat1 = matrix(rnorm(ndat*2, mean=-4, sd=0.5),ncol=2) dat2 = matrix(rnorm(ndat*2, mean=+4, sd=0.5),ncol=2) myatoms = list() myatoms[[1]] = dat1 myatoms[[2]] = dat2 mydata = rbind(dat1, dat2) # Fixed Support support = cbind(seq(from=-8,to=8,by=2), seq(from=-8,to=8,by=2)) ## COMPUTE comp1 = bary14C(support, myatoms, lambda=0.5, maxiter=10) comp2 = bary14C(support, myatoms, lambda=1, maxiter=10) comp3 = bary14C(support, myatoms, lambda=5, maxiter=10) ## VISUALIZE opar <- par(no.readonly=TRUE) par(mfrow=c(1,3)) barplot(comp1, main="lambda=0.5") barplot(comp2, main="lambda=1") barplot(comp3, main="lambda=5") par(opar)
Given empirical measures
of possibly different cardinalities,
wasserstein barycenter
is the solution to the following problem
where 's are relative weights of empirical measures. Here we assume
either (1) support atoms in Euclidean space are given, or (2) all pairwise distances between
atoms of the fixed support and empirical measures are given.
Authors proposed iterative Bregman projections in conjunction with entropic regularization.
bary15B( support, atoms, marginals = NULL, weights = NULL, lambda = 0.1, p = 2, ... ) bary15Bdist( distances, marginals = NULL, weights = NULL, lambda = 0.1, p = 2, ... )
bary15B( support, atoms, marginals = NULL, weights = NULL, lambda = 0.1, p = 2, ... ) bary15Bdist( distances, marginals = NULL, weights = NULL, lambda = 0.1, p = 2, ... )
support |
an |
atoms |
a length- |
marginals |
marginal distribution for empirical measures; if |
weights |
weights for each individual measure; if |
lambda |
regularization parameter (default: 0.1). |
p |
an exponent for the order of the distance (default: 2). |
... |
extra parameters including
|
distances |
a length- |
a length- vector of probability vector.
Benamou J, Carlier G, Cuturi M, Nenna L, Peyré G (2015). “Iterative Bregman Projections for Regularized Transportation Problems.” SIAM Journal on Scientific Computing, 37(2), A1111–A1138. ISSN 1064-8275, 1095-7197.
#------------------------------------------------------------------- # Wasserstein Barycenter for Fixed Atoms with Two Gaussians # # * class 1 : samples from Gaussian with mean=(-4, -4) # * class 2 : samples from Gaussian with mean=(+4, +4) # * target support consists of 7 integer points from -6 to 6, # where ideally, weight is concentrated near 0 since it's average! #------------------------------------------------------------------- ## GENERATE DATA # Empirical Measures set.seed(100) ndat = 500 dat1 = matrix(rnorm(ndat*2, mean=-4, sd=0.5),ncol=2) dat2 = matrix(rnorm(ndat*2, mean=+4, sd=0.5),ncol=2) myatoms = list() myatoms[[1]] = dat1 myatoms[[2]] = dat2 mydata = rbind(dat1, dat2) # Fixed Support support = cbind(seq(from=-8,to=8,by=2), seq(from=-8,to=8,by=2)) ## COMPUTE comp1 = bary15B(support, myatoms, lambda=0.5, maxiter=10) comp2 = bary15B(support, myatoms, lambda=1, maxiter=10) comp3 = bary15B(support, myatoms, lambda=5, maxiter=10) ## VISUALIZE opar <- par(no.readonly=TRUE) par(mfrow=c(1,3)) barplot(comp1, main="lambda=0.5") barplot(comp2, main="lambda=1") barplot(comp3, main="lambda=5") par(opar)
#------------------------------------------------------------------- # Wasserstein Barycenter for Fixed Atoms with Two Gaussians # # * class 1 : samples from Gaussian with mean=(-4, -4) # * class 2 : samples from Gaussian with mean=(+4, +4) # * target support consists of 7 integer points from -6 to 6, # where ideally, weight is concentrated near 0 since it's average! #------------------------------------------------------------------- ## GENERATE DATA # Empirical Measures set.seed(100) ndat = 500 dat1 = matrix(rnorm(ndat*2, mean=-4, sd=0.5),ncol=2) dat2 = matrix(rnorm(ndat*2, mean=+4, sd=0.5),ncol=2) myatoms = list() myatoms[[1]] = dat1 myatoms[[2]] = dat2 mydata = rbind(dat1, dat2) # Fixed Support support = cbind(seq(from=-8,to=8,by=2), seq(from=-8,to=8,by=2)) ## COMPUTE comp1 = bary15B(support, myatoms, lambda=0.5, maxiter=10) comp2 = bary15B(support, myatoms, lambda=1, maxiter=10) comp3 = bary15B(support, myatoms, lambda=5, maxiter=10) ## VISUALIZE opar <- par(no.readonly=TRUE) par(mfrow=c(1,3)) barplot(comp1, main="lambda=0.5") barplot(comp2, main="lambda=1") barplot(comp3, main="lambda=5") par(opar)
digit3
contains 2000 images from the famous MNIST dataset of digit 3.
Each element of the list is an image represented as an
matrix that sums to 1. This normalization is conventional and it does not
hurt its visualization via a basic
image()
function.
data(digit3)
data(digit3)
a length- named list
"digit3"
of matrices.
## LOAD THE DATA data(digit3) ## SHOW A FEW opar <- par(no.readonly=TRUE) par(mfrow=c(2,4), pty="s") for (i in 1:8){ image(digit3[[i]]) } par(opar)
## LOAD THE DATA data(digit3) ## SHOW A FEW opar <- par(no.readonly=TRUE) par(mfrow=c(2,4), pty="s") for (i in 1:8){ image(digit3[[i]]) } par(opar)
digits
contains 5000 images from the famous MNIST dataset of all digits,
consisting of 500 images per digit class from 0 to 9.
Each digit image is represented as an
matrix that sums to 1. This normalization is conventional and it does not
hurt its visualization via a basic
image()
function.
data(digits)
data(digits)
a named list "digits"
containing
length-5000 list of image matrices.
length-5000 vector of class labels from 0 to 9.
## LOAD THE DATA data(digits) ## SHOW A FEW # Select 9 random images subimgs = digits$image[sample(1:5000, 9)] opar <- par(no.readonly=TRUE) par(mfrow=c(3,3), pty="s") for (i in 1:9){ image(subimgs[[i]]) } par(opar)
## LOAD THE DATA data(digits) ## SHOW A FEW # Select 9 random images subimgs = digits$image[sample(1:5000, 9)] opar <- par(no.readonly=TRUE) par(mfrow=c(3,3), pty="s") for (i in 1:9){ image(subimgs[[i]]) } par(opar)
Given a collection of empirical cumulative distribution functions
for
, compute the Wasserstein barycenter
of order 2. This is obtained by taking a weighted average on a set of
corresponding quantile functions.
ecdfbary(ecdfs, weights = NULL, ...)
ecdfbary(ecdfs, weights = NULL, ...)
ecdfs |
a length- |
weights |
a weight of each image; if |
... |
extra parameters including
|
an "ecdf"
object of the Wasserstein barycenter.
#---------------------------------------------------------------------- # Two Gaussians # # Two Gaussian distributions are parametrized as follows. # Type 1 : (mean, var) = (-4, 1/4) # Type 2 : (mean, var) = (+4, 1/4) #---------------------------------------------------------------------- # GENERATE ECDFs ecdf_list = list() ecdf_list[[1]] = stats::ecdf(stats::rnorm(200, mean=-4, sd=0.5)) ecdf_list[[2]] = stats::ecdf(stats::rnorm(200, mean=+4, sd=0.5)) # COMPUTE THE BARYCENTER OF EQUAL WEIGHTS emean = ecdfbary(ecdf_list) # QUANTITIES FOR PLOTTING x_grid = seq(from=-8, to=8, length.out=100) y_type1 = ecdf_list[[1]](x_grid) y_type2 = ecdf_list[[2]](x_grid) y_bary = emean(x_grid) # VISUALIZE opar <- par(no.readonly=TRUE) plot(x_grid, y_bary, lwd=3, col="red", type="l", main="Barycenter", xlab="x", ylab="Fn(x)") lines(x_grid, y_type1, col="gray50", lty=3) lines(x_grid, y_type2, col="gray50", lty=3) par(opar)
#---------------------------------------------------------------------- # Two Gaussians # # Two Gaussian distributions are parametrized as follows. # Type 1 : (mean, var) = (-4, 1/4) # Type 2 : (mean, var) = (+4, 1/4) #---------------------------------------------------------------------- # GENERATE ECDFs ecdf_list = list() ecdf_list[[1]] = stats::ecdf(stats::rnorm(200, mean=-4, sd=0.5)) ecdf_list[[2]] = stats::ecdf(stats::rnorm(200, mean=+4, sd=0.5)) # COMPUTE THE BARYCENTER OF EQUAL WEIGHTS emean = ecdfbary(ecdf_list) # QUANTITIES FOR PLOTTING x_grid = seq(from=-8, to=8, length.out=100) y_type1 = ecdf_list[[1]](x_grid) y_type2 = ecdf_list[[2]](x_grid) y_bary = emean(x_grid) # VISUALIZE opar <- par(no.readonly=TRUE) plot(x_grid, y_bary, lwd=3, col="red", type="l", main="Barycenter", xlab="x", ylab="Fn(x)") lines(x_grid, y_type1, col="gray50", lty=3) lines(x_grid, y_type2, col="gray50", lty=3) par(opar)
Given a collection of empirical cumulative distribution functions
for
, compute the Wasserstein median. This is
obtained by a functional variant of the Weiszfeld algorithm on a set of
quantile functions.
ecdfmed(ecdfs, weights = NULL, ...)
ecdfmed(ecdfs, weights = NULL, ...)
ecdfs |
a length- |
weights |
a weight of each image; if |
... |
extra parameters including
|
an "ecdf"
object of the Wasserstein median.
#---------------------------------------------------------------------- # Tree Gaussians # # Three Gaussian distributions are parametrized as follows. # Type 1 : (mean, sd) = (-4, 1) # Type 2 : (mean, sd) = ( 0, 1/5) # Type 3 : (mean, sd) = (+6, 1/2) #---------------------------------------------------------------------- # GENERATE ECDFs ecdf_list = list() ecdf_list[[1]] = stats::ecdf(stats::rnorm(200, mean=-4, sd=1)) ecdf_list[[2]] = stats::ecdf(stats::rnorm(200, mean=+4, sd=0.2)) ecdf_list[[3]] = stats::ecdf(stats::rnorm(200, mean=+6, sd=0.5)) # COMPUTE THE MEDIAN emeds = ecdfmed(ecdf_list) # COMPUTE THE BARYCENTER emean = ecdfbary(ecdf_list) # QUANTITIES FOR PLOTTING x_grid = seq(from=-8, to=10, length.out=500) y_type1 = ecdf_list[[1]](x_grid) y_type2 = ecdf_list[[2]](x_grid) y_type3 = ecdf_list[[3]](x_grid) y_bary = emean(x_grid) y_meds = emeds(x_grid) # VISUALIZE opar <- par(no.readonly=TRUE) plot(x_grid, y_bary, lwd=3, col="orange", type="l", main="Wasserstein Median & Barycenter", xlab="x", ylab="Fn(x)", lty=2) lines(x_grid, y_meds, lwd=3, col="blue", lty=2) lines(x_grid, y_type1, col="gray50", lty=3) lines(x_grid, y_type2, col="gray50", lty=3) lines(x_grid, y_type3, col="gray50", lty=3) legend("topleft", legend=c("Median","Barycenter"), lwd=3, lty=2, col=c("blue","orange")) par(opar)
#---------------------------------------------------------------------- # Tree Gaussians # # Three Gaussian distributions are parametrized as follows. # Type 1 : (mean, sd) = (-4, 1) # Type 2 : (mean, sd) = ( 0, 1/5) # Type 3 : (mean, sd) = (+6, 1/2) #---------------------------------------------------------------------- # GENERATE ECDFs ecdf_list = list() ecdf_list[[1]] = stats::ecdf(stats::rnorm(200, mean=-4, sd=1)) ecdf_list[[2]] = stats::ecdf(stats::rnorm(200, mean=+4, sd=0.2)) ecdf_list[[3]] = stats::ecdf(stats::rnorm(200, mean=+6, sd=0.5)) # COMPUTE THE MEDIAN emeds = ecdfmed(ecdf_list) # COMPUTE THE BARYCENTER emean = ecdfbary(ecdf_list) # QUANTITIES FOR PLOTTING x_grid = seq(from=-8, to=10, length.out=500) y_type1 = ecdf_list[[1]](x_grid) y_type2 = ecdf_list[[2]](x_grid) y_type3 = ecdf_list[[3]](x_grid) y_bary = emean(x_grid) y_meds = emeds(x_grid) # VISUALIZE opar <- par(no.readonly=TRUE) plot(x_grid, y_bary, lwd=3, col="orange", type="l", main="Wasserstein Median & Barycenter", xlab="x", ylab="Fn(x)", lty=2) lines(x_grid, y_meds, lwd=3, col="blue", lty=2) lines(x_grid, y_type1, col="gray50", lty=3) lines(x_grid, y_type2, col="gray50", lty=3) lines(x_grid, y_type3, col="gray50", lty=3) legend("topleft", legend=c("Median","Barycenter"), lwd=3, lty=2, col=c("blue","orange")) par(opar)
Given a collection of Gaussian distributions for
,
compute the Wasserstein barycenter of order 2. For the barycenter computation of
variance components, we use a fixed-point algorithm by Álvarez-Esteban et al. (2016).
gaussbary1d(means, vars, weights = NULL, ...)
gaussbary1d(means, vars, weights = NULL, ...)
means |
a length- |
vars |
a length- |
weights |
a weight of each image; if |
... |
extra parameters including
|
a named list containing
mean of the estimated barycenter distribution.
variance of the estimated barycenter distribution.
Álvarez-Esteban PC, del Barrio E, Cuesta-Albertos JA, Matrán C (2016). “A Fixed-Point Approach to Barycenters in Wasserstein Space.” Journal of Mathematical Analysis and Applications, 441(2), 744–762. ISSN 0022247X.
gaussbarypd()
for multivariate case.
#---------------------------------------------------------------------- # Two Gaussians # # Two Gaussian distributions are parametrized as follows. # Type 1 : (mean, var) = (-4, 1/4) # Type 2 : (mean, var) = (+4, 1/4) #---------------------------------------------------------------------- # GENERATE PARAMETERS par_mean = c(-4, 4) par_vars = c(0.25, 0.25) # COMPUTE THE BARYCENTER OF EQUAL WEIGHTS gmean = gaussbary1d(par_mean, par_vars) # QUANTITIES FOR PLOTTING x_grid = seq(from=-6, to=6, length.out=200) y_dist1 = stats::dnorm(x_grid, mean=-4, sd=0.5) y_dist2 = stats::dnorm(x_grid, mean=+4, sd=0.5) y_gmean = stats::dnorm(x_grid, mean=gmean$mean, sd=sqrt(gmean$var)) # VISUALIZE opar <- par(no.readonly=TRUE) plot(x_grid, y_gmean, lwd=2, col="red", type="l", main="Barycenter", xlab="x", ylab="density") lines(x_grid, y_dist1) lines(x_grid, y_dist2) par(opar)
#---------------------------------------------------------------------- # Two Gaussians # # Two Gaussian distributions are parametrized as follows. # Type 1 : (mean, var) = (-4, 1/4) # Type 2 : (mean, var) = (+4, 1/4) #---------------------------------------------------------------------- # GENERATE PARAMETERS par_mean = c(-4, 4) par_vars = c(0.25, 0.25) # COMPUTE THE BARYCENTER OF EQUAL WEIGHTS gmean = gaussbary1d(par_mean, par_vars) # QUANTITIES FOR PLOTTING x_grid = seq(from=-6, to=6, length.out=200) y_dist1 = stats::dnorm(x_grid, mean=-4, sd=0.5) y_dist2 = stats::dnorm(x_grid, mean=+4, sd=0.5) y_gmean = stats::dnorm(x_grid, mean=gmean$mean, sd=sqrt(gmean$var)) # VISUALIZE opar <- par(no.readonly=TRUE) plot(x_grid, y_gmean, lwd=2, col="red", type="l", main="Barycenter", xlab="x", ylab="density") lines(x_grid, y_dist1) lines(x_grid, y_dist2) par(opar)
Given a collection of -dimensional Gaussian distributions
for
, compute the Wasserstein barycenter of order 2.
For the barycenter computation of variance components, we use a fixed-point
algorithm by Álvarez-Esteban et al. (2016).
gaussbarypd(means, vars, weights = NULL, ...)
gaussbarypd(means, vars, weights = NULL, ...)
means |
an |
vars |
a |
weights |
a weight of each image; if |
... |
extra parameters including
|
a named list containing
a length- vector for mean of the estimated barycenter distribution.
a matrix for variance of the estimated barycenter distribution.
Álvarez-Esteban PC, del Barrio E, Cuesta-Albertos JA, Matrán C (2016). “A Fixed-Point Approach to Barycenters in Wasserstein Space.” Journal of Mathematical Analysis and Applications, 441(2), 744–762. ISSN 0022247X.
gaussbary1d()
for univariate case.
#---------------------------------------------------------------------- # Two Gaussians in R^2 #---------------------------------------------------------------------- # GENERATE PARAMETERS # means par_mean = rbind(c(-4,0), c(4,0)) # covariances par_vars = array(0,c(2,2,2)) par_vars[,,1] = cbind(c(4,-2),c(-2,4)) par_vars[,,2] = cbind(c(4,+2),c(+2,4)) # COMPUTE THE BARYCENTER OF EQUAL WEIGHTS gmean = gaussbarypd(par_mean, par_vars) # GET COORDINATES FOR DRAWING pt_type1 = gaussvis2d(par_mean[1,], par_vars[,,1]) pt_type2 = gaussvis2d(par_mean[2,], par_vars[,,2]) pt_gmean = gaussvis2d(gmean$mean, gmean$var) # VISUALIZE opar <- par(no.readonly=TRUE) plot(pt_gmean, lwd=2, col="red", type="l", main="Barycenter", xlab="", ylab="", xlim=c(-6,6)) lines(pt_type1) lines(pt_type2) par(opar)
#---------------------------------------------------------------------- # Two Gaussians in R^2 #---------------------------------------------------------------------- # GENERATE PARAMETERS # means par_mean = rbind(c(-4,0), c(4,0)) # covariances par_vars = array(0,c(2,2,2)) par_vars[,,1] = cbind(c(4,-2),c(-2,4)) par_vars[,,2] = cbind(c(4,+2),c(+2,4)) # COMPUTE THE BARYCENTER OF EQUAL WEIGHTS gmean = gaussbarypd(par_mean, par_vars) # GET COORDINATES FOR DRAWING pt_type1 = gaussvis2d(par_mean[1,], par_vars[,,1]) pt_type2 = gaussvis2d(par_mean[2,], par_vars[,,2]) pt_gmean = gaussvis2d(gmean$mean, gmean$var) # VISUALIZE opar <- par(no.readonly=TRUE) plot(pt_gmean, lwd=2, col="red", type="l", main="Barycenter", xlab="", ylab="", xlim=c(-6,6)) lines(pt_type1) lines(pt_type2) par(opar)
Given a collection of Gaussian distributions for
,
compute the Wasserstein median.
gaussmed1d(means, vars, weights = NULL, ...)
gaussmed1d(means, vars, weights = NULL, ...)
means |
a length- |
vars |
a length- |
weights |
a weight of each image; if |
... |
extra parameters including
|
a named list containing
mean of the estimated median distribution.
variance of the estimated median distribution.
gaussmedpd()
for multivariate case.
#---------------------------------------------------------------------- # Tree Gaussians # # Three Gaussian distributions are parametrized as follows. # Type 1 : (mean, sd) = (-4, 1) # Type 2 : (mean, sd) = ( 0, 1/5) # Type 3 : (mean, sd) = (+6, 1/2) #---------------------------------------------------------------------- # GENERATE PARAMETERS par_mean = c(-4, 0, +6) par_vars = c(1, 0.04, 0.25) # COMPUTE THE WASSERSTEIN MEDIAN gmeds = gaussmed1d(par_mean, par_vars) # COMPUTE THE BARYCENTER gmean = gaussbary1d(par_mean, par_vars) # QUANTITIES FOR PLOTTING x_grid = seq(from=-6, to=8, length.out=1000) y_dist1 = stats::dnorm(x_grid, mean=par_mean[1], sd=sqrt(par_vars[1])) y_dist2 = stats::dnorm(x_grid, mean=par_mean[2], sd=sqrt(par_vars[2])) y_dist3 = stats::dnorm(x_grid, mean=par_mean[3], sd=sqrt(par_vars[3])) y_gmean = stats::dnorm(x_grid, mean=gmean$mean, sd=sqrt(gmean$var)) y_gmeds = stats::dnorm(x_grid, mean=gmeds$mean, sd=sqrt(gmeds$var)) # VISUALIZE opar <- par(no.readonly=TRUE) plot(x_grid, y_gmeds, lwd=3, col="red", type="l", main="Three Gaussians", xlab="x", ylab="density", xlim=range(x_grid), ylim=c(0,2.5)) lines(x_grid, y_gmean, lwd=3, col="blue") lines(x_grid, y_dist1, lwd=1.5, lty=2) lines(x_grid, y_dist2, lwd=1.5, lty=2) lines(x_grid, y_dist3, lwd=1.5, lty=2) legend("topleft", legend=c("Median","Barycenter"), col=c("red","blue"), lwd=c(3,3), lty=c(1,2)) par(opar)
#---------------------------------------------------------------------- # Tree Gaussians # # Three Gaussian distributions are parametrized as follows. # Type 1 : (mean, sd) = (-4, 1) # Type 2 : (mean, sd) = ( 0, 1/5) # Type 3 : (mean, sd) = (+6, 1/2) #---------------------------------------------------------------------- # GENERATE PARAMETERS par_mean = c(-4, 0, +6) par_vars = c(1, 0.04, 0.25) # COMPUTE THE WASSERSTEIN MEDIAN gmeds = gaussmed1d(par_mean, par_vars) # COMPUTE THE BARYCENTER gmean = gaussbary1d(par_mean, par_vars) # QUANTITIES FOR PLOTTING x_grid = seq(from=-6, to=8, length.out=1000) y_dist1 = stats::dnorm(x_grid, mean=par_mean[1], sd=sqrt(par_vars[1])) y_dist2 = stats::dnorm(x_grid, mean=par_mean[2], sd=sqrt(par_vars[2])) y_dist3 = stats::dnorm(x_grid, mean=par_mean[3], sd=sqrt(par_vars[3])) y_gmean = stats::dnorm(x_grid, mean=gmean$mean, sd=sqrt(gmean$var)) y_gmeds = stats::dnorm(x_grid, mean=gmeds$mean, sd=sqrt(gmeds$var)) # VISUALIZE opar <- par(no.readonly=TRUE) plot(x_grid, y_gmeds, lwd=3, col="red", type="l", main="Three Gaussians", xlab="x", ylab="density", xlim=range(x_grid), ylim=c(0,2.5)) lines(x_grid, y_gmean, lwd=3, col="blue") lines(x_grid, y_dist1, lwd=1.5, lty=2) lines(x_grid, y_dist2, lwd=1.5, lty=2) lines(x_grid, y_dist3, lwd=1.5, lty=2) legend("topleft", legend=c("Median","Barycenter"), col=c("red","blue"), lwd=c(3,3), lty=c(1,2)) par(opar)
Given a collection of -dimensional Gaussian distributions
for
,
compute the Wasserstein median.
gaussmedpd(means, vars, weights = NULL, ...)
gaussmedpd(means, vars, weights = NULL, ...)
means |
an |
vars |
a |
weights |
a weight of each image; if |
... |
extra parameters including
|
a named list containing
a length- vector for mean of the estimated median distribution.
a matrix for variance of the estimated median distribution.
gaussmed1d()
for univariate case.
#---------------------------------------------------------------------- # Three Gaussians in R^2 #---------------------------------------------------------------------- # GENERATE PARAMETERS # means par_mean = rbind(c(-4,0), c(0,0), c(5,-1)) # covariances par_vars = array(0,c(2,2,3)) par_vars[,,1] = cbind(c(2,-1),c(-1,2)) par_vars[,,2] = cbind(c(4,+1),c(+1,4)) par_vars[,,3] = diag(c(4,1)) # COMPUTE THE MEDIAN gmeds = gaussmedpd(par_mean, par_vars) # COMPUTE THE BARYCENTER gmean = gaussbarypd(par_mean, par_vars) # GET COORDINATES FOR DRAWING pt_type1 = gaussvis2d(par_mean[1,], par_vars[,,1]) pt_type2 = gaussvis2d(par_mean[2,], par_vars[,,2]) pt_type3 = gaussvis2d(par_mean[3,], par_vars[,,3]) pt_gmean = gaussvis2d(gmean$mean, gmean$var) pt_gmeds = gaussvis2d(gmeds$mean, gmeds$var) # VISUALIZE opar <- par(no.readonly=TRUE) plot(pt_gmean, lwd=2, col="red", type="l", main="Three Gaussians", xlab="", ylab="", xlim=c(-6,8), ylim=c(-2.5,2.5)) lines(pt_gmeds, lwd=2, col="blue") lines(pt_type1, lty=2, lwd=5) lines(pt_type2, lty=2, lwd=5) lines(pt_type3, lty=2, lwd=5) abline(h=0, col="grey80", lty=3) abline(v=0, col="grey80", lty=3) legend("topright", legend=c("Median","Barycenter"), lwd=2, lty=1, col=c("blue","red")) par(opar)
#---------------------------------------------------------------------- # Three Gaussians in R^2 #---------------------------------------------------------------------- # GENERATE PARAMETERS # means par_mean = rbind(c(-4,0), c(0,0), c(5,-1)) # covariances par_vars = array(0,c(2,2,3)) par_vars[,,1] = cbind(c(2,-1),c(-1,2)) par_vars[,,2] = cbind(c(4,+1),c(+1,4)) par_vars[,,3] = diag(c(4,1)) # COMPUTE THE MEDIAN gmeds = gaussmedpd(par_mean, par_vars) # COMPUTE THE BARYCENTER gmean = gaussbarypd(par_mean, par_vars) # GET COORDINATES FOR DRAWING pt_type1 = gaussvis2d(par_mean[1,], par_vars[,,1]) pt_type2 = gaussvis2d(par_mean[2,], par_vars[,,2]) pt_type3 = gaussvis2d(par_mean[3,], par_vars[,,3]) pt_gmean = gaussvis2d(gmean$mean, gmean$var) pt_gmeds = gaussvis2d(gmeds$mean, gmeds$var) # VISUALIZE opar <- par(no.readonly=TRUE) plot(pt_gmean, lwd=2, col="red", type="l", main="Three Gaussians", xlab="", ylab="", xlim=c(-6,8), ylim=c(-2.5,2.5)) lines(pt_gmeds, lwd=2, col="blue") lines(pt_type1, lty=2, lwd=5) lines(pt_type2, lty=2, lwd=5) lines(pt_type3, lty=2, lwd=5) abline(h=0, col="grey80", lty=3) abline(v=0, col="grey80", lty=3) legend("topright", legend=c("Median","Barycenter"), lwd=2, lty=1, col=c("blue","red")) par(opar)
This function samples points along the contour of an ellipse represented
by mean and variance parameters for a 2-dimensional Gaussian distribution
to help ease manipulating visualization of the specified distribution. For example,
you can directly use a basic plot()
function directly for drawing.
gaussvis2d(mean, var, n = 500)
gaussvis2d(mean, var, n = 500)
mean |
a length- |
var |
a |
n |
the number of points to be drawn (default: 500). |
an matrix.
#---------------------------------------------------------------------- # Three Gaussians in R^2 #---------------------------------------------------------------------- # MEAN PARAMETERS loc1 = c(-3,0) loc2 = c(0,5) loc3 = c(3,0) # COVARIANCE PARAMETERS var1 = cbind(c(4,-2),c(-2,4)) var2 = diag(c(9,1)) var3 = cbind(c(4,2),c(2,4)) # GENERATE POINTS visA = gaussvis2d(loc1, var1) visB = gaussvis2d(loc2, var2) visC = gaussvis2d(loc3, var3) # VISUALIZE opar <- par(no.readonly=TRUE) plot(visA[,1], visA[,2], type="l", xlim=c(-5,5), ylim=c(-2,9), lwd=3, col="red", main="3 Gaussian Distributions") lines(visB[,1], visB[,2], lwd=3, col="blue") lines(visC[,1], visC[,2], lwd=3, col="orange") legend("top", legend=c("Type 1","Type 2","Type 3"), lwd=3, col=c("red","blue","orange"), horiz=TRUE) par(opar)
#---------------------------------------------------------------------- # Three Gaussians in R^2 #---------------------------------------------------------------------- # MEAN PARAMETERS loc1 = c(-3,0) loc2 = c(0,5) loc3 = c(3,0) # COVARIANCE PARAMETERS var1 = cbind(c(4,-2),c(-2,4)) var2 = diag(c(9,1)) var3 = cbind(c(4,2),c(2,4)) # GENERATE POINTS visA = gaussvis2d(loc1, var1) visB = gaussvis2d(loc2, var2) visC = gaussvis2d(loc3, var3) # VISUALIZE opar <- par(no.readonly=TRUE) plot(visA[,1], visA[,2], type="l", xlim=c(-5,5), ylim=c(-2,9), lwd=3, col="red", main="3 Gaussian Distributions") lines(visB[,1], visB[,2], lwd=3, col="blue") lines(visC[,1], visC[,2], lwd=3, col="orange") legend("top", legend=c("Type 1","Type 2","Type 3"), lwd=3, col=c("red","blue","orange"), horiz=TRUE) par(opar)
Given multiple histograms represented as "histogram"
S3 objects, compute
Wasserstein barycenter. We need one requirement that all histograms in an
input list hists
must have same breaks. See the example on how to
construct a histogram on predefined breaks/bins.
histbary14C(hists, p = 2, weights = NULL, lambda = NULL, ...)
histbary14C(hists, p = 2, weights = NULL, lambda = NULL, ...)
hists |
a length- |
p |
an exponent for the order of the distance (default: 2). |
weights |
a weight of each image; if |
lambda |
a regularization parameter; if |
... |
extra parameters including
|
a "histogram"
object of barycenter.
Cuturi M, Doucet A (2014). “Fast computation of wasserstein barycenters.” In Xing EP, Jebara T (eds.), Proceedings of the 31st international conference on international conference on machine learning - volume 32, volume 32 of Proceedings of machine learning research, 685–693.
#---------------------------------------------------------------------- # Binned from Two Gaussians # # EXAMPLE : Very Small Example for CRAN; just showing how to use it! #---------------------------------------------------------------------- # GENERATE FROM TWO GAUSSIANS WITH DIFFERENT MEANS set.seed(100) x = stats::rnorm(1000, mean=-4, sd=0.5) y = stats::rnorm(1000, mean=+4, sd=0.5) bk = seq(from=-10, to=10, length.out=20) # HISTOGRAMS WITH COMMON BREAKS histxy = list() histxy[[1]] = hist(x, breaks=bk, plot=FALSE) histxy[[2]] = hist(y, breaks=bk, plot=FALSE) # COMPUTE hh = histbary14C(histxy, maxiter=5) # VISUALIZE opar <- par(no.readonly=TRUE) par(mfrow=c(1,2)) barplot(histxy[[1]]$density, col=rgb(0,0,1,1/4), ylim=c(0, 0.75), main="Two Histograms") barplot(histxy[[2]]$density, col=rgb(1,0,0,1/4), ylim=c(0, 0.75), add=TRUE) barplot(hh$density, main="Barycenter", ylim=c(0, 0.75)) par(opar)
#---------------------------------------------------------------------- # Binned from Two Gaussians # # EXAMPLE : Very Small Example for CRAN; just showing how to use it! #---------------------------------------------------------------------- # GENERATE FROM TWO GAUSSIANS WITH DIFFERENT MEANS set.seed(100) x = stats::rnorm(1000, mean=-4, sd=0.5) y = stats::rnorm(1000, mean=+4, sd=0.5) bk = seq(from=-10, to=10, length.out=20) # HISTOGRAMS WITH COMMON BREAKS histxy = list() histxy[[1]] = hist(x, breaks=bk, plot=FALSE) histxy[[2]] = hist(y, breaks=bk, plot=FALSE) # COMPUTE hh = histbary14C(histxy, maxiter=5) # VISUALIZE opar <- par(no.readonly=TRUE) par(mfrow=c(1,2)) barplot(histxy[[1]]$density, col=rgb(0,0,1,1/4), ylim=c(0, 0.75), main="Two Histograms") barplot(histxy[[2]]$density, col=rgb(1,0,0,1/4), ylim=c(0, 0.75), add=TRUE) barplot(hh$density, main="Barycenter", ylim=c(0, 0.75)) par(opar)
Given multiple histograms represented as "histogram"
S3 objects, compute
Wasserstein barycenter. We need one requirement that all histograms in an
input list hists
must have same breaks. See the example on how to
construct a histogram on predefined breaks/bins.
histbary15B(hists, p = 2, weights = NULL, lambda = NULL, ...)
histbary15B(hists, p = 2, weights = NULL, lambda = NULL, ...)
hists |
a length- |
p |
an exponent for the order of the distance (default: 2). |
weights |
a weight of each image; if |
lambda |
a regularization parameter; if |
... |
extra parameters including
|
a "histogram"
object of barycenter.
Benamou J, Carlier G, Cuturi M, Nenna L, Peyré G (2015). “Iterative Bregman Projections for Regularized Transportation Problems.” SIAM Journal on Scientific Computing, 37(2), A1111–A1138. ISSN 1064-8275, 1095-7197.
#---------------------------------------------------------------------- # Binned from Two Gaussians # # EXAMPLE : Very Small Example for CRAN; just showing how to use it! #---------------------------------------------------------------------- # GENERATE FROM TWO GAUSSIANS WITH DIFFERENT MEANS set.seed(100) x = stats::rnorm(1000, mean=-4, sd=0.5) y = stats::rnorm(1000, mean=+4, sd=0.5) bk = seq(from=-10, to=10, length.out=20) # HISTOGRAMS WITH COMMON BREAKS histxy = list() histxy[[1]] = hist(x, breaks=bk, plot=FALSE) histxy[[2]] = hist(y, breaks=bk, plot=FALSE) # COMPUTE hh = histbary15B(histxy, maxiter=5) # VISUALIZE opar <- par(no.readonly=TRUE) par(mfrow=c(1,2)) barplot(histxy[[1]]$density, col=rgb(0,0,1,1/4), ylim=c(0, 0.75), main="Two Histograms") barplot(histxy[[2]]$density, col=rgb(1,0,0,1/4), ylim=c(0, 0.75), add=TRUE) barplot(hh$density, main="Barycenter", ylim=c(0, 0.75)) par(opar)
#---------------------------------------------------------------------- # Binned from Two Gaussians # # EXAMPLE : Very Small Example for CRAN; just showing how to use it! #---------------------------------------------------------------------- # GENERATE FROM TWO GAUSSIANS WITH DIFFERENT MEANS set.seed(100) x = stats::rnorm(1000, mean=-4, sd=0.5) y = stats::rnorm(1000, mean=+4, sd=0.5) bk = seq(from=-10, to=10, length.out=20) # HISTOGRAMS WITH COMMON BREAKS histxy = list() histxy[[1]] = hist(x, breaks=bk, plot=FALSE) histxy[[2]] = hist(y, breaks=bk, plot=FALSE) # COMPUTE hh = histbary15B(histxy, maxiter=5) # VISUALIZE opar <- par(no.readonly=TRUE) par(mfrow=c(1,2)) barplot(histxy[[1]]$density, col=rgb(0,0,1,1/4), ylim=c(0, 0.75), main="Two Histograms") barplot(histxy[[2]]$density, col=rgb(1,0,0,1/4), ylim=c(0, 0.75), add=TRUE) barplot(hh$density, main="Barycenter", ylim=c(0, 0.75)) par(opar)
Given multiple histograms represented as "histogram"
S3 objects, compute the
Wasserstein median of order 2. We need one requirement that all histograms in an
input list hists
must have same breaks. See the example on how to
construct a histogram on predefined breaks/bins.
histmed22Y(hists, weights = NULL, lambda = NULL, ...)
histmed22Y(hists, weights = NULL, lambda = NULL, ...)
hists |
a length- |
weights |
a weight of each image; if |
lambda |
a regularization parameter; if |
... |
extra parameters including
|
a "histogram"
object of the Wasserstein median histogram.
#---------------------------------------------------------------------- # Binned from Two Gaussians # # EXAMPLE : small example for CRAN for visualization purpose. #---------------------------------------------------------------------- # GENERATE FROM TWO GAUSSIANS WITH DIFFERENT MEANS set.seed(100) x = stats::rnorm(1000, mean=-4, sd=0.5) y = stats::rnorm(1000, mean=+4, sd=0.5) bk = seq(from=-10, to=10, length.out=20) # HISTOGRAMS WITH COMMON BREAKS histxy = list() histxy[[1]] = hist(x, breaks=bk, plot=FALSE) histxy[[2]] = hist(y, breaks=bk, plot=FALSE) # COMPUTE hmean = histbary15B(histxy) hmeds = histmed22Y(histxy) # VISUALIZE opar <- par(no.readonly=TRUE) par(mfrow=c(1,3)) barplot(histxy[[1]]$density, col=rgb(0,0,1,1/4), ylim=c(0, 1.05), main="Two Histograms") barplot(histxy[[2]]$density, col=rgb(1,0,0,1/4), ylim=c(0, 1.05), add=TRUE) barplot(hmean$density, main="Barycenter", ylim=c(0, 1.05)) barplot(hmeds$density, main="Wasserstein Median", ylim=c(0, 1.05)) par(opar)
#---------------------------------------------------------------------- # Binned from Two Gaussians # # EXAMPLE : small example for CRAN for visualization purpose. #---------------------------------------------------------------------- # GENERATE FROM TWO GAUSSIANS WITH DIFFERENT MEANS set.seed(100) x = stats::rnorm(1000, mean=-4, sd=0.5) y = stats::rnorm(1000, mean=+4, sd=0.5) bk = seq(from=-10, to=10, length.out=20) # HISTOGRAMS WITH COMMON BREAKS histxy = list() histxy[[1]] = hist(x, breaks=bk, plot=FALSE) histxy[[2]] = hist(y, breaks=bk, plot=FALSE) # COMPUTE hmean = histbary15B(histxy) hmeds = histmed22Y(histxy) # VISUALIZE opar <- par(no.readonly=TRUE) par(mfrow=c(1,3)) barplot(histxy[[1]]$density, col=rgb(0,0,1,1/4), ylim=c(0, 1.05), main="Two Histograms") barplot(histxy[[2]]$density, col=rgb(1,0,0,1/4), ylim=c(0, 1.05), add=TRUE) barplot(hmean$density, main="Barycenter", ylim=c(0, 1.05)) barplot(hmeds$density, main="Wasserstein Median", ylim=c(0, 1.05)) par(opar)
Using entropic regularization for Wasserstein barycenter computation, imagebary14C
finds a barycentric image given multiple images
.
Please note the followings; (1) we only take a matrix as an image so please
make it grayscale if not, (2) all images should be of same size - no resizing is performed.
imagebary14C(images, p = 2, weights = NULL, lambda = NULL, ...)
imagebary14C(images, p = 2, weights = NULL, lambda = NULL, ...)
images |
a length- |
p |
an exponent for the order of the distance (default: 2). |
weights |
a weight of each image; if |
lambda |
a regularization parameter; if |
... |
extra parameters including
|
an matrix of the barycentric image.
Cuturi M, Doucet A (2014). “Fast computation of wasserstein barycenters.” In Xing EP, Jebara T (eds.), Proceedings of the 31st international conference on international conference on machine learning - volume 32, volume 32 of Proceedings of machine learning research, 685–693.
## Not run: #---------------------------------------------------------------------- # MNIST Data with Digit 3 # # EXAMPLE 1 : Very Small Example for CRAN; just showing how to use it! # EXAMPLE 2 : Medium-size Example for Evolution of Output #---------------------------------------------------------------------- # EXAMPLE 1 data(digit3) datsmall = digit3[1:2] outsmall = imagebary14C(datsmall, maxiter=3) # EXAMPLE 2 : Barycenter of 100 Images # RANDOMLY SELECT THE IMAGES data(digit3) dat2 = digit3[sample(1:2000, 100)] # select 100 images # RUN SEQUENTIALLY run10 = imagebary14C(dat2, maxiter=10) # first 10 iterations run20 = imagebary14C(dat2, maxiter=10, init.image=run10) # run 40 more run50 = imagebary14C(dat2, maxiter=30, init.image=run20) # run 50 more # VISUALIZE opar <- par(no.readonly=TRUE) par(mfrow=c(2,3), pty="s") image(dat2[[sample(100,1)]], axes=FALSE, main="a random image") image(dat2[[sample(100,1)]], axes=FALSE, main="a random image") image(dat2[[sample(100,1)]], axes=FALSE, main="a random image") image(run10, axes=FALSE, main="barycenter after 10 iter") image(run20, axes=FALSE, main="barycenter after 20 iter") image(run50, axes=FALSE, main="barycenter after 50 iter") par(opar) ## End(Not run)
## Not run: #---------------------------------------------------------------------- # MNIST Data with Digit 3 # # EXAMPLE 1 : Very Small Example for CRAN; just showing how to use it! # EXAMPLE 2 : Medium-size Example for Evolution of Output #---------------------------------------------------------------------- # EXAMPLE 1 data(digit3) datsmall = digit3[1:2] outsmall = imagebary14C(datsmall, maxiter=3) # EXAMPLE 2 : Barycenter of 100 Images # RANDOMLY SELECT THE IMAGES data(digit3) dat2 = digit3[sample(1:2000, 100)] # select 100 images # RUN SEQUENTIALLY run10 = imagebary14C(dat2, maxiter=10) # first 10 iterations run20 = imagebary14C(dat2, maxiter=10, init.image=run10) # run 40 more run50 = imagebary14C(dat2, maxiter=30, init.image=run20) # run 50 more # VISUALIZE opar <- par(no.readonly=TRUE) par(mfrow=c(2,3), pty="s") image(dat2[[sample(100,1)]], axes=FALSE, main="a random image") image(dat2[[sample(100,1)]], axes=FALSE, main="a random image") image(dat2[[sample(100,1)]], axes=FALSE, main="a random image") image(run10, axes=FALSE, main="barycenter after 10 iter") image(run20, axes=FALSE, main="barycenter after 20 iter") image(run50, axes=FALSE, main="barycenter after 50 iter") par(opar) ## End(Not run)
Using entropic regularization for Wasserstein barycenter computation, imagebary15B
finds a barycentric image given multiple images
.
Please note the followings; (1) we only take a matrix as an image so please
make it grayscale if not, (2) all images should be of same size - no resizing is performed.
imagebary15B(images, p = 2, weights = NULL, lambda = NULL, ...)
imagebary15B(images, p = 2, weights = NULL, lambda = NULL, ...)
images |
a length- |
p |
an exponent for the order of the distance (default: 2). |
weights |
a weight of each image; if |
lambda |
a regularization parameter; if |
... |
extra parameters including
|
an matrix of the barycentric image.
Benamou J, Carlier G, Cuturi M, Nenna L, Peyré G (2015). “Iterative Bregman Projections for Regularized Transportation Problems.” SIAM Journal on Scientific Computing, 37(2), A1111–A1138. ISSN 1064-8275, 1095-7197.
#---------------------------------------------------------------------- # MNIST Data with Digit 3 # # EXAMPLE 1 : Very Small Example for CRAN; just showing how to use it! # EXAMPLE 2 : Medium-size Example for Evolution of Output #---------------------------------------------------------------------- # EXAMPLE 1 data(digit3) datsmall = digit3[1:2] outsmall = imagebary15B(datsmall, maxiter=3) ## Not run: # EXAMPLE 2 : Barycenter of 100 Images # RANDOMLY SELECT THE IMAGES data(digit3) dat2 = digit3[sample(1:2000, 100)] # select 100 images # RUN SEQUENTIALLY run05 = imagebary15B(dat2, maxiter=5) # first 5 iterations run10 = imagebary15B(dat2, maxiter=5, init.image=run05) # run 5 more run50 = imagebary15B(dat2, maxiter=40, init.image=run10) # run 40 more # VISUALIZE opar <- par(no.readonly=TRUE) par(mfrow=c(2,3), pty="s") image(dat2[[sample(100,1)]], axes=FALSE, main="a random image") image(dat2[[sample(100,1)]], axes=FALSE, main="a random image") image(dat2[[sample(100,1)]], axes=FALSE, main="a random image") image(run05, axes=FALSE, main="barycenter after 05 iter") image(run10, axes=FALSE, main="barycenter after 10 iter") image(run50, axes=FALSE, main="barycenter after 50 iter") par(opar) ## End(Not run)
#---------------------------------------------------------------------- # MNIST Data with Digit 3 # # EXAMPLE 1 : Very Small Example for CRAN; just showing how to use it! # EXAMPLE 2 : Medium-size Example for Evolution of Output #---------------------------------------------------------------------- # EXAMPLE 1 data(digit3) datsmall = digit3[1:2] outsmall = imagebary15B(datsmall, maxiter=3) ## Not run: # EXAMPLE 2 : Barycenter of 100 Images # RANDOMLY SELECT THE IMAGES data(digit3) dat2 = digit3[sample(1:2000, 100)] # select 100 images # RUN SEQUENTIALLY run05 = imagebary15B(dat2, maxiter=5) # first 5 iterations run10 = imagebary15B(dat2, maxiter=5, init.image=run05) # run 5 more run50 = imagebary15B(dat2, maxiter=40, init.image=run10) # run 40 more # VISUALIZE opar <- par(no.readonly=TRUE) par(mfrow=c(2,3), pty="s") image(dat2[[sample(100,1)]], axes=FALSE, main="a random image") image(dat2[[sample(100,1)]], axes=FALSE, main="a random image") image(dat2[[sample(100,1)]], axes=FALSE, main="a random image") image(run05, axes=FALSE, main="barycenter after 05 iter") image(run10, axes=FALSE, main="barycenter after 10 iter") image(run50, axes=FALSE, main="barycenter after 50 iter") par(opar) ## End(Not run)
Given multiple images , the Wasserstein median of
order 2 is computed. The proposed method relies on a choice of barycenter computation
in that we opt for an algorithm of
imagebary15B
, which uses
entropic regularization for barycenter computation. Please note the followings; (1) we only take a matrix as an image so please
make it grayscale if not, (2) all images should be of same size - no resizing is performed.
imagemed22Y(images, weights = NULL, lambda = NULL, ...)
imagemed22Y(images, weights = NULL, lambda = NULL, ...)
images |
a length- |
weights |
a weight of each image; if |
lambda |
a regularization parameter; if |
... |
extra parameters including
|
an matrix of the Wasserstein median image.
## Not run: #---------------------------------------------------------------------- # MNIST Data with Digit 3 # # EXAMPLE : Very Small Example for CRAN; just showing how to use it! #---------------------------------------------------------------------- # LOAD THE DATA data(digit3) datsmall = digit3[1:10] # COMPUTE outsmall = imagemed22Y(datsmall, maxiter=5) # VISUALIZE opar <- par(no.readonly=TRUE) par(mfrow=c(1,4), pty="s") image(outsmall, xaxt='n', yaxt='n', main="Wasserstein Median") image(datsmall[[3]], xaxt='n', yaxt='n', main="3rd image") image(datsmall[[6]], xaxt='n', yaxt='n', main="6th image") image(datsmall[[9]], xaxt='n', yaxt='n', main="9th image") par(opar) ## End(Not run)
## Not run: #---------------------------------------------------------------------- # MNIST Data with Digit 3 # # EXAMPLE : Very Small Example for CRAN; just showing how to use it! #---------------------------------------------------------------------- # LOAD THE DATA data(digit3) datsmall = digit3[1:10] # COMPUTE outsmall = imagemed22Y(datsmall, maxiter=5) # VISUALIZE opar <- par(no.readonly=TRUE) par(mfrow=c(1,4), pty="s") image(outsmall, xaxt='n', yaxt='n', main="Wasserstein Median") image(datsmall[[3]], xaxt='n', yaxt='n', main="3rd image") image(datsmall[[6]], xaxt='n', yaxt='n', main="6th image") image(datsmall[[9]], xaxt='n', yaxt='n', main="9th image") par(opar) ## End(Not run)
Due to high computational cost for linear programming approaches to compute
Wasserstein distance, Cuturi (2013) proposed an entropic regularization
scheme as an efficient approximation to the original problem. This comes with
a regularization parameter in the term
IPOT algorithm is known to be relatively robust to the choice of
regularization parameter . Empirical observation says that
very small number of inner loop iteration like
L=1
is sufficient.
ipot(X, Y, p = 2, wx = NULL, wy = NULL, lambda = 1, ...) ipotD(D, p = 2, wx = NULL, wy = NULL, lambda = 1, ...)
ipot(X, Y, p = 2, wx = NULL, wy = NULL, lambda = 1, ...) ipotD(D, p = 2, wx = NULL, wy = NULL, lambda = 1, ...)
X |
an |
Y |
an |
p |
an exponent for the order of the distance (default: 2). |
wx |
a length- |
wy |
a length- |
lambda |
a regularization parameter (default: 0.1). |
... |
extra parameters including
|
D |
an |
a named list containing
distance value
the number of iterations it took to converge.
an nonnegative matrix for the optimal transport plan.
Xie Y, Wang X, Wang R, Zha H (2020). “A fast proximal point method for computing exact wasserstein distance.” In Adams RP, Gogate V (eds.), Proceedings of The 35th Uncertainty in Artificial Intelligence Conference, volume 115 of Proceedings of machine learning research, 433–453.
#------------------------------------------------------------------- # Wasserstein Distance between Samples from Two Bivariate Normal # # * class 1 : samples from Gaussian with mean=(-1, -1) # * class 2 : samples from Gaussian with mean=(+1, +1) #------------------------------------------------------------------- ## SMALL EXAMPLE set.seed(100) m = 20 n = 30 X = matrix(rnorm(m*2, mean=-1),ncol=2) # m obs. for X Y = matrix(rnorm(n*2, mean=+1),ncol=2) # n obs. for Y ## COMPARE WITH WASSERSTEIN outw = wasserstein(X, Y) ipt1 = ipot(X, Y, lambda=1) ipt2 = ipot(X, Y, lambda=10) ## VISUALIZE : SHOW THE PLAN AND DISTANCE pmw = paste0("wasserstein plan ; dist=",round(outw$distance,2)) pm1 = paste0("ipot lbd=1 ; dist=",round(ipt1$distance,2)) pm2 = paste0("ipot lbd=10; dist=",round(ipt2$distance,2)) opar <- par(no.readonly=TRUE) par(mfrow=c(1,3)) image(outw$plan, axes=FALSE, main=pmw) image(ipt1$plan, axes=FALSE, main=pm1) image(ipt2$plan, axes=FALSE, main=pm2) par(opar)
#------------------------------------------------------------------- # Wasserstein Distance between Samples from Two Bivariate Normal # # * class 1 : samples from Gaussian with mean=(-1, -1) # * class 2 : samples from Gaussian with mean=(+1, +1) #------------------------------------------------------------------- ## SMALL EXAMPLE set.seed(100) m = 20 n = 30 X = matrix(rnorm(m*2, mean=-1),ncol=2) # m obs. for X Y = matrix(rnorm(n*2, mean=+1),ncol=2) # n obs. for Y ## COMPARE WITH WASSERSTEIN outw = wasserstein(X, Y) ipt1 = ipot(X, Y, lambda=1) ipt2 = ipot(X, Y, lambda=10) ## VISUALIZE : SHOW THE PLAN AND DISTANCE pmw = paste0("wasserstein plan ; dist=",round(outw$distance,2)) pm1 = paste0("ipot lbd=1 ; dist=",round(ipt1$distance,2)) pm2 = paste0("ipot lbd=10; dist=",round(ipt2$distance,2)) opar <- par(no.readonly=TRUE) par(mfrow=c(1,3)) image(outw$plan, axes=FALSE, main=pmw) image(ipt1$plan, axes=FALSE, main=pm1) image(ipt2$plan, axes=FALSE, main=pm2) par(opar)
Due to high computational cost for linear programming approaches to compute
Wasserstein distance, Cuturi (2013) proposed an entropic regularization
scheme as an efficient approximation to the original problem. This comes with
a regularization parameter in the term
As ,
the solution to an approximation problem approaches to the solution of a
true problem. However, we have an issue with numerical underflow. Our
implementation returns an error when it happens, so please use a larger number
when necessary.
sinkhorn(X, Y, p = 2, wx = NULL, wy = NULL, lambda = 0.1, ...) sinkhornD(D, p = 2, wx = NULL, wy = NULL, lambda = 0.1, ...)
sinkhorn(X, Y, p = 2, wx = NULL, wy = NULL, lambda = 0.1, ...) sinkhornD(D, p = 2, wx = NULL, wy = NULL, lambda = 0.1, ...)
X |
an |
Y |
an |
p |
an exponent for the order of the distance (default: 2). |
wx |
a length- |
wy |
a length- |
lambda |
a regularization parameter (default: 0.1). |
... |
extra parameters including
|
D |
an |
a named list containing
distance value.
the number of iterations it took to converge.
an nonnegative matrix for the optimal transport plan.
Cuturi M (2013). “Sinkhorn distances: Lightspeed computation of optimal transport.” In Proceedings of the 26th international conference on neural information processing systems - volume 2, NIPS'13, 2292–2300.
#------------------------------------------------------------------- # Wasserstein Distance between Samples from Two Bivariate Normal # # * class 1 : samples from Gaussian with mean=(-1, -1) # * class 2 : samples from Gaussian with mean=(+1, +1) #------------------------------------------------------------------- ## SMALL EXAMPLE set.seed(100) m = 20 n = 10 X = matrix(rnorm(m*2, mean=-1),ncol=2) # m obs. for X Y = matrix(rnorm(n*2, mean=+1),ncol=2) # n obs. for Y ## COMPARE WITH WASSERSTEIN outw = wasserstein(X, Y) skh1 = sinkhorn(X, Y, lambda=0.05) skh2 = sinkhorn(X, Y, lambda=0.10) ## VISUALIZE : SHOW THE PLAN AND DISTANCE pm1 = paste0("wasserstein plan ; distance=",round(outw$distance,2)) pm2 = paste0("sinkhorn lbd=0.05; distance=",round(skh1$distance,2)) pm5 = paste0("sinkhorn lbd=0.1 ; distance=",round(skh2$distance,2)) opar <- par(no.readonly=TRUE) par(mfrow=c(1,3)) image(outw$plan, axes=FALSE, main=pm1) image(skh1$plan, axes=FALSE, main=pm2) image(skh2$plan, axes=FALSE, main=pm5) par(opar)
#------------------------------------------------------------------- # Wasserstein Distance between Samples from Two Bivariate Normal # # * class 1 : samples from Gaussian with mean=(-1, -1) # * class 2 : samples from Gaussian with mean=(+1, +1) #------------------------------------------------------------------- ## SMALL EXAMPLE set.seed(100) m = 20 n = 10 X = matrix(rnorm(m*2, mean=-1),ncol=2) # m obs. for X Y = matrix(rnorm(n*2, mean=+1),ncol=2) # n obs. for Y ## COMPARE WITH WASSERSTEIN outw = wasserstein(X, Y) skh1 = sinkhorn(X, Y, lambda=0.05) skh2 = sinkhorn(X, Y, lambda=0.10) ## VISUALIZE : SHOW THE PLAN AND DISTANCE pm1 = paste0("wasserstein plan ; distance=",round(outw$distance,2)) pm2 = paste0("sinkhorn lbd=0.05; distance=",round(skh1$distance,2)) pm5 = paste0("sinkhorn lbd=0.1 ; distance=",round(skh2$distance,2)) opar <- par(no.readonly=TRUE) par(mfrow=c(1,3)) image(outw$plan, axes=FALSE, main=pm1) image(skh1$plan, axes=FALSE, main=pm2) image(skh2$plan, axes=FALSE, main=pm5) par(opar)
Sliced Wasserstein (SW) Distance (Rabin et al. 2012)
is a popular alternative to the standard Wasserstein distance due to its computational
efficiency on top of nice theoretical properties. For the -dimensional probability
measures
and
, the SW distance is defined as
where is the
-dimensional unit hypersphere and
is the uniform distribution on
. Practically,
it is computed via Monte Carlo integration.
swdist(X, Y, p = 2, ...)
swdist(X, Y, p = 2, ...)
X |
an |
Y |
an |
p |
an exponent for the order of the distance (default: 2). |
... |
extra parameters including
|
a named list containing
distance value.
a length-niter
vector of projected univariate distances.
Rabin J, Peyré G, Delon J, Bernot M (2012). “Wasserstein Barycenter and Its Application to Texture Mixing.” In Bruckstein AM, ter Haar Romeny BM, Bronstein AM, Bronstein MM (eds.), Scale Space and Variational Methods in Computer Vision, volume 6667, 435–446. Springer Berlin Heidelberg, Berlin, Heidelberg.
#------------------------------------------------------------------- # Sliced-Wasserstein Distance between Two Bivariate Normal # # * class 1 : samples from Gaussian with mean=(-1, -1) # * class 2 : samples from Gaussian with mean=(+1, +1) #------------------------------------------------------------------- # SMALL EXAMPLE set.seed(100) m = 20 n = 30 X = matrix(rnorm(m*2, mean=-1),ncol=2) # m obs. for X Y = matrix(rnorm(n*2, mean=+1),ncol=2) # n obs. for Y # COMPUTE THE SLICED-WASSERSTEIN DISTANCE outsw <- swdist(X, Y, nproj=100) # VISUALIZE # prepare ingredients for plotting plot_x = 1:1000 plot_y = base::cumsum(outsw$projdist)/plot_x # draw opar <- par(no.readonly=TRUE) plot(plot_x, plot_y, type="b", cex=0.1, lwd=2, xlab="number of MC samples", ylab="distance", main="Effect of MC Sample Size") abline(h=outsw$distance, col="red", lwd=2) legend("bottomright", legend="SW Distance", col="red", lwd=2) par(opar)
#------------------------------------------------------------------- # Sliced-Wasserstein Distance between Two Bivariate Normal # # * class 1 : samples from Gaussian with mean=(-1, -1) # * class 2 : samples from Gaussian with mean=(+1, +1) #------------------------------------------------------------------- # SMALL EXAMPLE set.seed(100) m = 20 n = 30 X = matrix(rnorm(m*2, mean=-1),ncol=2) # m obs. for X Y = matrix(rnorm(n*2, mean=+1),ncol=2) # n obs. for Y # COMPUTE THE SLICED-WASSERSTEIN DISTANCE outsw <- swdist(X, Y, nproj=100) # VISUALIZE # prepare ingredients for plotting plot_x = 1:1000 plot_y = base::cumsum(outsw$projdist)/plot_x # draw opar <- par(no.readonly=TRUE) plot(plot_x, plot_y, type="b", cex=0.1, lwd=2, xlab="number of MC samples", ylab="distance", main="Effect of MC Sample Size") abline(h=outsw$distance, col="red", lwd=2) legend("bottomright", legend="SW Distance", col="red", lwd=2) par(opar)
Given two empirical measures consisting of
and
observations on
,
-Wasserstein distance for
between two empirical measures
is defined as
where denotes the collection of all measures/couplings on
whose marginals are
and
on the first and second factors, respectively. Please see the section
for detailed description on the usage of the function.
wasserstein(X, Y, p = 2, wx = NULL, wy = NULL) wassersteinD(D, p = 2, wx = NULL, wy = NULL)
wasserstein(X, Y, p = 2, wx = NULL, wy = NULL) wassersteinD(D, p = 2, wx = NULL, wy = NULL)
X |
an |
Y |
an |
p |
an exponent for the order of the distance (default: 2). |
wx |
a length- |
wy |
a length- |
D |
an |
a named list containing
distance value.
an nonnegative matrix for the optimal transport plan.
wasserstein()
functionWe assume empirical measures are defined on the Euclidean space ,
and the distance metric used here is standard Euclidean norm . Here, the
marginals
and
correspond to
wx
and wy
, respectively.
wassersteinD()
functionIf other distance measures or underlying spaces are one's interests, we have an option for users to provide
a distance matrix D
rather than vectors, where
for flexible modeling.
Peyré G, Cuturi M (2019). “Computational Optimal Transport: With Applications to Data Science.” Foundations and Trends® in Machine Learning, 11(5-6), 355–607. ISSN 1935-8237, 1935-8245.
#------------------------------------------------------------------- # Wasserstein Distance between Samples from Two Bivariate Normal # # * class 1 : samples from Gaussian with mean=(-1, -1) # * class 2 : samples from Gaussian with mean=(+1, +1) #------------------------------------------------------------------- ## SMALL EXAMPLE m = 20 n = 10 X = matrix(rnorm(m*2, mean=-1),ncol=2) # m obs. for X Y = matrix(rnorm(n*2, mean=+1),ncol=2) # n obs. for Y ## COMPUTE WITH DIFFERENT ORDERS out1 = wasserstein(X, Y, p=1) out2 = wasserstein(X, Y, p=2) out5 = wasserstein(X, Y, p=5) ## VISUALIZE : SHOW THE PLAN AND DISTANCE pm1 = paste0("plan p=1; distance=",round(out1$distance,2)) pm2 = paste0("plan p=2; distance=",round(out2$distance,2)) pm5 = paste0("plan p=5; distance=",round(out5$distance,2)) opar <- par(no.readonly=TRUE) par(mfrow=c(1,3)) image(out1$plan, axes=FALSE, main=pm1) image(out2$plan, axes=FALSE, main=pm2) image(out5$plan, axes=FALSE, main=pm5) par(opar) ## Not run: ## COMPARE WITH ANALYTIC RESULTS # For two Gaussians with same covariance, their # 2-Wasserstein distance is known so let's compare ! niter = 1000 # number of iterations vdist = rep(0,niter) for (i in 1:niter){ mm = sample(30:50, 1) nn = sample(30:50, 1) X = matrix(rnorm(mm*2, mean=-1),ncol=2) Y = matrix(rnorm(nn*2, mean=+1),ncol=2) vdist[i] = wasserstein(X, Y, p=2)$distance if (i%%10 == 0){ print(paste0("iteration ",i,"/", niter," complete.")) } } # Visualize opar <- par(no.readonly=TRUE) hist(vdist, main="Monte Carlo Simulation") abline(v=sqrt(8), lwd=2, col="red") par(opar) ## End(Not run)
#------------------------------------------------------------------- # Wasserstein Distance between Samples from Two Bivariate Normal # # * class 1 : samples from Gaussian with mean=(-1, -1) # * class 2 : samples from Gaussian with mean=(+1, +1) #------------------------------------------------------------------- ## SMALL EXAMPLE m = 20 n = 10 X = matrix(rnorm(m*2, mean=-1),ncol=2) # m obs. for X Y = matrix(rnorm(n*2, mean=+1),ncol=2) # n obs. for Y ## COMPUTE WITH DIFFERENT ORDERS out1 = wasserstein(X, Y, p=1) out2 = wasserstein(X, Y, p=2) out5 = wasserstein(X, Y, p=5) ## VISUALIZE : SHOW THE PLAN AND DISTANCE pm1 = paste0("plan p=1; distance=",round(out1$distance,2)) pm2 = paste0("plan p=2; distance=",round(out2$distance,2)) pm5 = paste0("plan p=5; distance=",round(out5$distance,2)) opar <- par(no.readonly=TRUE) par(mfrow=c(1,3)) image(out1$plan, axes=FALSE, main=pm1) image(out2$plan, axes=FALSE, main=pm2) image(out5$plan, axes=FALSE, main=pm5) par(opar) ## Not run: ## COMPARE WITH ANALYTIC RESULTS # For two Gaussians with same covariance, their # 2-Wasserstein distance is known so let's compare ! niter = 1000 # number of iterations vdist = rep(0,niter) for (i in 1:niter){ mm = sample(30:50, 1) nn = sample(30:50, 1) X = matrix(rnorm(mm*2, mean=-1),ncol=2) Y = matrix(rnorm(nn*2, mean=+1),ncol=2) vdist[i] = wasserstein(X, Y, p=2)$distance if (i%%10 == 0){ print(paste0("iteration ",i,"/", niter," complete.")) } } # Visualize opar <- par(no.readonly=TRUE) hist(vdist, main="Monte Carlo Simulation") abline(v=sqrt(8), lwd=2, col="red") par(opar) ## End(Not run)