Matrix Decomposition

2024-12-26 @Atsushi Kawaguchi

In this vignette, the output is omitted. Please refer to the following book for the output.

Kawaguchi A. (2021). Multivariate Analysis for Neuroimaging Data. CRC Press.

Data reshape

Preparation

First, we prepare and then we create the data matrix. Install package (as necessary)

if(!require("mand")) install.packages("mand")

Load package

library(mand)

Generate Simulation Data

The mand package has a function to generate simulation brain data from the base image, the difference image and the standard deviation image. These basic images are loaded as follows.

data(baseimg)
data(diffimg)
data(mask)
data(sdevimg)

The number of voxels in the original 3D image is as follows.

dim(baseimg)

To understand the result easily, the difference region was restricted to Parahippocampus and Hippocampus.

diffimg2 = diffimg * (tmpatlas %in% 37:40)

An image data matrix with subjects in the rows and voxels in the columns was generated by using the simbrain function.

img1 = simbrain(baseimg = baseimg, diffimg = diffimg2, sdevimg=sdevimg, mask=mask, n0=20, c1=0.01, sd1=0.05)

The base image, the difference image and the standard deviation image were specified in the first three arguments. The out-of-brain region was specified by the mask argument, which was the binary image. The remaining arguments are the number of subjects per group, the coefficient multiplied by the difference image and the standard deviation for the noise.

The data matrix dimension was as follows.

dim(img1$S)

The rec function creates the 3D image from the vectorized data (the first subject).

coat(rec(img1$S[1,], img1$imagedim, mask=img1$brainpos))

The standard deviation image is created from the resulting data matrix.

sdimg = apply(img1$S, 2, sd)
coat(template, rec(sdimg, img1$imagedim, mask=img1$brainpos))

Principal Component Analysis

If the input is a matrix, a principal component analysis is implemented by the msma function of the msma package. Principal component analysis with the number of components of 2 for the image data matrix is performed as follows.

(fit111 = msma(img1$S, comp=2))

The scatter plots for the score vectors specified by the argument v. The argument axes is specified by the two length vectors on which components are displayed.

plot(fit111, v="score", axes = 1:2, plottype="scatter")

The weight (loading) vectors can be obtained and reconstructed as follows.

midx = 1 ## the index for the modality
vidx = 1 ## the index for the component
Q = fit111$wbX[[midx]][,vidx]
outstat1 = rec(Q, img1$imagedim, mask=img1$brainpos)

The reconstructed loadings as image is overlayed on the template.

coat(template, outstat1)

The output is unclear; however, this will be improved later.

Two-steps dimension reduction

Methods

This is an example of the two-step dimension reduction.

Generate radial basis function.

B1 = rbfunc(imagedim=img1$imagedim, seppix=2, hispec=FALSE, 
mask=img1$brainpos)

Multiplying the basis function to image data matrix.

SB1 = basisprod(img1$S, B1)

The original dimension was as follows.

dim(img1$S)

The dimension was reduced to as follows.

dim(SB1)

Principal Component Analysis

The Principal Component Analysis (PCA) is applied to the dimension-reduced image.

(fit211 = msma(SB1, comp=2))

The loading is reconstructed to the original space by using the rec function.

Q = fit211$wbX[[1]][,1]
outstat1 = rec(Q, img1$imagedim, B=B1, mask=img1$brainpos)

The plotted (sign-flipping) loading is smoother than the one without the dimension reduction by the basis function.

outstat2 = -outstat1
coat(template, outstat2)

Sparse PCA

If lambdaX (>0) is specified, a sparse principal component analysis is implemented.

(fit112 = msma(SB1, comp=2, lambdaX=0.075))

The plotted loading is narrower than that from the PCA.

Q = fit112$wbX[[midx]][,vidx]
outstat1 = rec(Q, img1$imagedim, B=B1, mask=img1$brainpos)
outstat2 = outstat1
coat(template, outstat2)

The region is reported as follows to be compared with the next method.

atlastable(tmpatlas, outstat2, atlasdataset)

Supervised Sparse PCA

The simbrain generates the synthetic brain image data and the binary outcome. The outcome Z is obtained.

Z = img1$Z

If the outcome Z is specified in the msma function, a supervised sparse principal component analysis is implemented.

(fit113 = msma(SB1, Z=Z, comp=2, lambdaX=0.075, muX=0.5))

The plotted loading is located differently from the sparse PCA.

Q = fit113$wbX[[1]][,1]
outstat1 = rec(Q, img1$imagedim, B=B1, mask=img1$brainpos)
outstat2 = -outstat1
coat(template, outstat2)

The region near the hippocampus, which differs from the sparse PCA (without supervision).

atlastable(tmpatlas, outstat2, atlasdataset)

The loading for the second component

Q = fit113$wbX[[1]][,2]
outstat1 = rec(Q, img1$imagedim, B=B1, mask=img1$brainpos)
outstat2 = -outstat1
coat(template, outstat2)
atlastable(tmpatlas, outstat2, atlasdataset)

This is similar to the result from the sparse PCA (without supervision).

The following method can be used to plot the weights of several components simultaneously. It is first reconstructed in three dimensions with the multirec function and then plotted with the multicompplot function. It is set to display four columns per component.

ws = multirec(fit113, imagedim=img1$imagedim, B=B1, 
mask=img1$brainpos)
multicompplot(ws, template, col4comp=4)

Characteristics

In this section, we perform some experiments to show the characteristics of the methods in the previous section.

Impact seppix

The basis function is parameterized by the radius, which is the argument seppix of the rbfunc function. The msma function is applied.

seppixs = 2:7
fit115s = lapply(seppixs, function(sp){ 
B1 = rbfunc(imagedim=img1$imagedim, seppix=sp,
 hispec=FALSE, mask=img1$brainpos)
SB1 = basisprod(img1$S, B1)
fit=msma(SB1, Z=Z, comp=2, lambdaX=0.075, muX=0.5)
list(fit=fit, B1=B1)
})
par(mfrow=c(2,3), mar=c(1,2,1,2))
for(i in 1:length(seppixs)){
Q = fit115s[[i]]$fit$wbX[[midx]][,vidx]
outstat1 = rec(Q, img1$imagedim, B=fit115s[[i]]$B1, 
mask=img1$brainpos)
coat(template, -outstat1, pseq=10,color.bar=FALSE, 
paron=FALSE, main=paste("seppix =", seppixs[i]))
}

With the larger value it is difficult to determine the region.

Impact lambda

The iterative fits with different regularized parameters are implemented specifying them using the argument lambdaX.

lambdaXs = round(seq(0, 0.2, by=0.005), 3)
fit114s = lapply(lambdaXs, function(lam) 
msma(SB1, Z=Z, comp=2, lambdaX=lam, muX=0.5, type="lasso") )

For some of the parameters, the sparsity can be controlled by λ, as shown below.

lambdaXs2 = c(0, 0.025, 0.05, 0.075, 0.1, 0.15)
par(mfrow=c(2,3), mar=c(1,2,1,2))
for(i in which(lambdaXs %in% lambdaXs2)){
Q = fit114s[[i]]$wbX[[1]][,1]
outstat1 = rec(Q, img1$imagedim, B=B1, mask=img1$brainpos)
coat(template, -outstat1, pseq=10,color.bar=FALSE,
 paron=FALSE, main=paste("lambda =", lambdaXs[i]))
}

The optimal value can be determined by the BIC.

nzwbXs = unlist(lapply(fit114s, function(x) x$nzwbX[2]))
BICs = unlist(lapply(fit114s, function(x) x$bic[2]))
(optlam = lambdaXs[which.min(BICs)])
(optnzw = nzwbXs[which.min(BICs)])

The plot of BIC values against the number of non-zero loadings and the λ.

par(mfrow=c(1,2))
plot(lambdaXs, BICs, xlab="lambda", ylab="BIC")
abline(v=optlam, col="red", lty=2)
plot(nzwbXs, BICs, xlab="Number of non-zeros", ylab="BIC", log="x")
abline(v=optnzw, col="red", lty=2)

The optimal BIC value was λ which was relatively better among reconstructed loading images with different λ’s as shown above.

Penalty Functions

For the msma function, four types of penalty functions are currently available. These depend on parameters, “lasso” and “hard” depend on one parameter, and “scad” and “mcp” depend on two parameters.

penalties2 = c("lasso", "hard", "scad", "mcp")
etas = list(lasso=1, hard=1, scad=c(1, 3.7), mcp=c(2, 3))

Using the internal function sparse in the mand package, we see how each penalty function performs the transformation.

xs = seq(-6, 6, by=0.1)
par(mfrow=c(2,3), mar=c(2,2,3,2))
for(p1 in penalties2){
eta1 = etas[[p1]]
for(e1 in eta1){
sout1 = sparse(xs, 2, type=p1, eta=e1)
plot(xs, sout1, xlab="", ylab="", 
main=paste(p1, "(eta =", e1, ")"), type="b")
}}

Next, we apply each penalty function in the msma function to the brain image data.

par(mfrow=c(2,3), mar=c(1,2,1,2))
for(p1 in penalties2){
eta1 = etas[[p1]]
for(e1 in eta1){
fit = msma(SB1, Z=Z, comp=2, lambdaX=0.025, muX=0.5,
 type=p1, eta=e1) 
Q = fit$wbX[[midx]][,vidx]
outstat1 = rec(Q, img1$imagedim, B=B1, mask=img1$brainpos)
outstat2 = -outstat1
coat(template, outstat2, pseq=10, color.bar=FALSE,
 paron=FALSE, main=paste(p1, "(eta =", e1, ")"))
}}

Parameter Selection

The number of components was set to be 30.

fit114 = msma(SB1, Z=Z, comp=30, muX=0.5)

The cumulative percentage of the explained variance (CPEV) is plotted against the numbers of components. When the argument v is specified as “cpev”, the CPEVs are plotted.

plot(fit114, v="cpev")
abline(h=0.8, lty=2)

The function ncompsearch searches for the optimal value based on the criterion.

(ncomp1 = ncompsearch(SB1, Z=Z, muX=0.5, 
comps = 50, criterion="BIC"))

The criterion can use not only BIC but also CV (Cross Validation).

(ncomp2 = ncompsearch(SB1, Z=Z, muX=0.5,
comps = c(1, seq(5, 30, by=5)), criterion="CV"))
par(mfrow=c(1,2))
plot(ncomp1)
plot(ncomp2)

For each component number from 1 to 5, the optimal λ was selected, and the number of non-zero loadings in each component was counted.

maxncomp = 5
opts = sapply(1:maxncomp, function(c1){ 
opt=regparasearch(SB1, Z=Z, comp=c1, muX=0.5)$optlambdaX
fit = msma(SB1, Z=Z, comp=c1, lambdaX=opt, muX=0.5)
nz = rep(NA, maxncomp)
nz[1:c1] = fit$nzwbX
c(c1, round(opt,3), nz)
})
opts1=t(opts)
colnames(opts1) = c("#comp", "lambda", paste("comp",1:maxncomp))
kable(opts1, "latex", booktabs = T)

From this result, it can be seen that when the number of components is large, a small non-zero loadings are selected for each component, and when the number of components is small, a large non-zero loadings are selected for each component.

(ncomp3 = ncompsearch(SB1, Z=Z, muX=0.5, lambdaX=0.075, comps = 30, criterion="BIC"))
plot(ncomp3)

Thus, there are several strategies for selecting the λ and the number of components. The msma package has an optparasearch function and the search.method argument allows you to select one of four methods.

The regparaonly method searches for the regularized parameters with a fixed number of components (set to be 5 and 10 as the default).

(opt11 = optparasearch(SB1, Z=Z, muX=0.5, comp=5, search.method = "regparaonly", criterion="BIC"))

The optimized parameters are used in the msma function as follows.

(fit311 = msma(SB1, Z=Z, muX=0.5, comp=opt11$optncomp, lambdaX=opt11$optlambdaX))

The regpara1st identifies the regularized parameters by fixing the number of components, then searching for the number of components with the selected regularized parameters. Note that the default is to search up to 10 components.

(opt12 = optparasearch(SB1, Z=Z, muX=0.5, search.method = "regpara1st", criterion="BIC"))
fit312 = msma(SB1, Z=Z, muX=0.5, comp=opt12$optncomp, lambdaX=opt12$optlambdaX)

The ncomp1st method identifies the number of components with a regularized parameter of 0, then searches for the regularized parameters with the selected number of components.

(opt13 = optparasearch(SB1, Z=Z, muX=0.5, search.method = "ncomp1st", criterion="BIC"))
fit313 = msma(SB1, Z=Z, muX=0.5, comp=opt13$optncomp, lambdaX=opt13$optlambdaX)

If the argument criterion4ncomp is specified as criterion4ncomp="CV", only the criteria for selecting the number of components can be changed to CV.

The simultaneous method identifies the number of components by searching the regularized parameters in each component.

(opt14 = optparasearch(SB1, Z=Z, muX=0.5, search.method = "simultaneous", criterion="BIC"))
fit314 = msma(SB1, Z=Z, muX=0.5, comp=opt14$optncomp, lambdaX=opt14$optlambdaX)

In this example, the results were the same except for thencomp1st.

Other Methods

The NMF can be implemented by invoking the nmf function of the NMF library.

if(!require("NMF")) install.packages("NMF")
library(NMF)

NMF can be performed on a brain image data matrix with the number of components as 2 in the following way.

res = nmf(SB1, 2)

Using the coef function, we can extract the weights and plot them on the brain image using the coat function in the same way as when applying the msma function.

Q = t(coef(res))[,1]
outstat1 = rec(Q, img1$imagedim, B=B1, mask=img1$brainpos)
coat(template, outstat1)

The ICA can be implemented by invoking the icaimax function of the ica library.

if(!require("ica")) install.packages("ica")
library(ica)

ICA using the information-maximization (Infomax) approach can be performed on a brain image data matrix with the number of components as 2 in the following way.

imod = icaimax(SB1,2)

In this case, the weights are extracted as follows and plotted on the brain image in the same way.

Q = imod$M[,1]
outstat1 = rec(Q, img1$imagedim, B=B1, mask=img1$brainpos)
coat(template, outstat1)

Cluster Analysis

The package for drawing more informative dendrograms is loaded.

library(dendextend)

By inputting fit111, which is the result of PCA being applied to the hcmsma function by the msma function, hierarchical clustering with the score as input is executed.

hcmsma111 = hcmsma(fit111)

The next step is to plot and draw the dendrogram.

dend = as.dendrogram(hcmsma111$hcout)
d1 = color_branches(dend, k=4, groupLabels=TRUE)
labels_colors(d1) = Z[as.numeric(labels(d1))]+1
plot(d1)

The data was set to be divided into four clusters. The number at the bottom of the dendrogram is the subject number, where black is the control and red is the case.

It seems that clusters 1 and 2 have many cases, and cluster 4 also seems to have many cases. The number is summarized as the matrix with the case or control in the row and the cluster to belong in the column.

clus=cutree(d1, 4, order_clusters_as_data = FALSE)
clus=clus[as.character(1:length(clus))]
table(Z, clus)

As can be seen, the PCA score did not yield a cluster that completely bisected case and control.

Similarly, we performed clustering using the scores calculated from the sparse PCA.

hcmsma112 = hcmsma(fit112)
dend = as.dendrogram(hcmsma112$hcout)
d1 = color_branches(dend, k=4, groupLabels=TRUE)
labels_colors(d1) = Z[as.numeric(labels(d1))]+1
plot(d1)
clus=cutree(d1, 4, order_clusters_as_data = FALSE)
clus=clus[as.character(1:length(clus))]
table(Z, clus)

This result shows that case and control are more sparsely clustered.

Finally, we performed clustering with scores from the supervised sparse PCA.

hcmsma113 = hcmsma(fit113)
dend = as.dendrogram(hcmsma113$hcout)
d1 = color_branches(dend, k=4, groupLabels=TRUE)
labels_colors(d1) = Z[as.numeric(labels(d1))]+1
plot(d1)
clus=cutree(d1, 4, order_clusters_as_data = FALSE)
clus=clus[as.character(1:length(clus))]
table(Z, clus)

Because they are supervised by this case or control, the case and control are nicely clustered. This analysis indicates that the case is further divided into two clusters. For practical purposes, it may be useful for disease subtype classification.