Package 'DDRTree'

Title: Learning Principal Graphs with DDRTree
Description: Provides an implementation of the framework of reversed graph embedding (RGE) which projects data into a reduced dimensional space while constructs a principal tree which passes through the middle of the data simultaneously. DDRTree shows superiority to alternatives (Wishbone, DPT) for inferring the ordering as well as the intrinsic structure of the single cell genomics data. In general, it could be used to reconstruct the temporal progression as well as bifurcation structure of any datatype.
Authors: Xiaojie Qiu, Cole Trapnell, Qi Mao, Li Wang
Maintainer: Xiaojie Qiu <[email protected]>
License: Artistic License 2.0
Version: 0.1.5
Built: 2024-11-01 06:41:34 UTC
Source: CRAN

Help Index


Perform DDRTree construction

Description

Perform DDRTree construction

This is an R and C code implementation of the DDRTree algorithm from Qi Mao, Li Wang et al.

Qi Mao, Li Wang, Steve Goodison, and Yijun Sun. Dimensionality Reduction via Graph Structure Learning. The 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD'15), 2015
http://dl.acm.org/citation.cfm?id=2783309

to perform dimension reduction and principal graph learning simultaneously. Please cite this package and KDD'15 paper if you found DDRTree is useful for your research.

Usage

DDRTree(X, dimensions = 2, initial_method = NULL, maxIter = 20,
  sigma = 0.001, lambda = NULL, ncenter = NULL, param.gamma = 10,
  tol = 0.001, verbose = F, ...)

Arguments

X

a matrix with D×N\mathbf{D \times N} dimension which is needed to perform DDRTree construction

dimensions

reduced dimension

initial_method

a function to take the data transpose of X as input and then output the reduced dimension, row number should not larger than observation and column number should not be larger than variables (like isomap may only return matrix on valid sample sets). Sample names of returned reduced dimension should be preserved.

maxIter

maximum iterations

sigma

bandwidth parameter

lambda

regularization parameter for inverse graph embedding

ncenter

number of nodes allowed in the regularization graph

param.gamma

regularization parameter for k-means (the prefix of 'param' is used to avoid name collision with gamma)

tol

relative objective difference

verbose

emit extensive debug output

...

additional arguments passed to DDRTree

Value

a list with W, Z, stree, Y, history W is the orthogonal set of d (dimensions) linear basis vector Z is the reduced dimension space stree is the smooth tree graph embedded in the low dimension space Y represents latent points as the center of Z

Introduction

The unprecedented increase in big-data causes a huge difficulty in data visualization and downstream analysis. Conventional dimension reduction approaches (for example, PCA, ICA, Isomap, LLE, etc.) are limited in their ability to explictly recover the intrinisic structure from the data as well as the discriminative feature representation, both are important for scientific discovery. The DDRTree algorithm is a new algorithm to perform the following three tasks in one setting:

1. Reduce high dimension data into a low dimension space

2. Recover an explicit smooth graph structure with local geometry only captured by distances of data points in the low dimension space.

3. Obtain clustering structures of data points in reduced dimension

Dimensionality reduction via graph structure learning

Reverse graph embedding is previously applied to learn the intrinisic graph structure in the original dimension. The optimization of graph inference can be represented as:

minfgFmin{z1,...,zM}(Vi,Vj)Ebi,jfg(zi)fg(zj)2\mathop{min}_{f_g \in \mathcal{F}} \mathop{min}_{\{\mathbf{z}_1, ..., \mathbf{z}_M\}} \sum_{(V_i, V_j) \in \mathcal{E}} b_{i,j}||f_g(\mathbf{z}_i) - f_g(\mathbf{z}_j)||^2

where fgf_g is a function to map the instrinsic data space Z={z1,...,zM}\mathcal{Z} = \{\mathbf{z}_1, ..., \mathbf{z}_M\} back to the input data space (reverse embedding) X={x1,...,xN}\mathcal{X} = \{ \mathbf{x}_1, ..., \mathbf{x}_N\}. ViV_i is the the vertex of the instrinsic undirected graph G=(V,E)\mathcal{G} = (\mathcal{V}, \mathcal{E}). bijb_{ij} is the edge weight associates with the edge set E\mathcal{E}. In order to learn the intrinsic structure from a reduced dimension, we need also to consider a term which includes the error during the learning of the instrinsic structure. This strategy is incorporated as the following:

minGG^bminfgFmin{z1,...,zM}i=1Nxifg(zi)2+λ2(Vi,Vj)Ebi,jfg(zi)fg(zj)2\mathop{min}_{\mathcal{G} \in \hat{\mathcal{G}}_b}\mathop{min}_{f_g \in \mathcal{F}} \mathop{min}_{\{\mathbf{z}_1, ..., \mathbf{z}_M\}} \sum_{i = 1}^N ||\mathbf{x}_i - f_g (\mathbf{z}_i)||^2 + \frac{\lambda}{2} \sum_{(V_i, V_j) \in \mathcal{E}} b_{i,j}||f_g(\mathbf{z}_i) - f_g(\mathbf{z}_j)||^2

where λ\lambda is a non-negative parameter which controls the tradeoff between the data reconstruction error and the reverse graph embedding.

Dimensionality reduction via learning a tree

The general framework for reducing dimension by learning an intrinsic structure in a low dimension requires a feasible set G^b\hat{\mathcal{G}}_b of graph and a mapping function fGf_\mathcal{G}. The algorithm uses minimum spanning tree as the feasible tree graph structure, which can be solved by Kruskal' algoritm. A linear projection model fg(z)=Wzf_g (\mathbf{z}) = \mathbf{Wz} is used as the mapping function. Those setting results in the following specific form for the previous framework:

minW,Z,Bi=1NxiWzi2+λ2i,jbi,jWziWzj2\mathop{min}_{\mathbf{W}, \mathbf{Z}, \mathbf{B}} \sum_{i = 1}^N ||\mathbf{x}_i - \mathbf{W}\mathbf{z}_i||^2 + \frac{\lambda}{2} \sum_{i,j}b_{i,j}||\mathbf{W} \mathbf{z}_i - \mathbf{W} \mathbf{z}_j||^2

where W=[w1,...,wd]RD×d\mathbf{W} = [\mathbf{w}_1, ..., \mathbf{w}_d] \in \mathcal{R}^{D \times d} is an orthogonal set of dd linear basis vectors. We can group tree graph B\mathbf{B}, the orthogonal set of linear basis vectors and projected points in reduced dimension W,Z\mathbf{W}, \mathbf{Z} as two groups and apply alternative structure optimization to optimize the tree graph. This method is defined as DRtree (Dimension Reduction tree) as discussed by the authors.

Discriminative dimensionality reduction via learning a tree

In order to avoid the issues where data points scattered into different branches (which leads to lose of cluster information) and to incorporate the discriminative information,another set of points {yk}k=1K\{\mathbf{y}_k\}_{k = 1}^K as the centers of {zi}i=1N\{\mathbf{z}_i\}^N_{i = 1} can be also introduced. By so doing, the objective functions of K-means and the DRtree can be simulatenously minimized. The author further proposed a soft partition method to account for the limits from K-means and proposed the following objective function:

minW,Z,B,Y,Ri=1NxiWzi2+λ2k,kbk,kWykWyk2+γ[k=1Ki=1Nri,kziyk2+σΩ(R)]\mathop{min}_{\mathbf{W}, \mathbf{Z}, \mathbf{B}, \mathbf{Y}, \mathbf{R}} \sum_{i = 1}^N ||\mathbf{x}_i - \mathbf{W} \mathbf{z}_i||^2 + \frac{\lambda}{2} \sum_{k, k'}b_{k, k'}||\mathbf{W} \mathbf{y}_k - \mathbf{W} \mathbf{y}_k'||^2 + \gamma\Big[\sum_{k = 1}^K \sum_{i = 1}^N r_{i, k} ||\mathbf{z}_i - \mathbf{y}_k||^2 + \sigma \Omega (\mathbf{R})\Big]

s.t. WTW=I,BB,k=1Kri,k=1,ri,k0,i,ks.t.\ \mathbf{W}^T \mathbf{W} = \mathbf{I}, \mathbf{B} \in \mathcal{B}, \sum_{k = 1}^K r_{i, k} = 1, r_{i, k} \leq 0, \forall i, \forall k

where RRN×N,Ω(R)=i=1Nk=1kri,klog ri,k\mathbf{R} \in \mathcal{R}^{N \times N}, \Omega(\mathbf{R}) = \sum_{i = 1}^N \sum_{k = 1}^k r_{i, k} log\ r_{i, k} is the negative entropy regularization which transforms the hard assignments used in K-means into soft assignments and σ>0\sigma > 0 is the regulization parameter. Alternative structure optimization is again used to solve the above problem by separately optimize each group W,Z,Y,B,R{\mathbf{W}, \mathbf{Z}, \mathbf{Y}}, {\mathbf{B}, \mathbf{R}} until convergence.

The actual algorithm of DDRTree

1.1. Input\mathbf{Input}: Data matrix X\mathbf{X}, parameters λ,σ,γ\lambda, \sigma, \gamma
2.2. Initialize Z\mathbf{Z} by PCA
3.3. K=N,Y=ZK = N, \mathbf{Y} = \mathbf{Z}
4.4. repeat\mathbf{repeat}:
 5.\ 5. dk,k=ykyk2,k,kd_{k,k'} = ||\mathbf{y}_k - \mathbf{y}_{k'}||^2, \forall k, \forall k'
 6.\ 6. Obtain B\mathbf{B} via Kruskal's algorithm
 7.\ 7. L=diag(B1)B\mathbf{L} = diag(\mathbf{B1}) - \mathbf{B}
 8.\ 8. Compute R\mathbf{R} with each element
 9.\ 9. τ=diag(1TR)\tau = diag(\mathbf{1}^T \mathbf{R})
 10.\ 10. Q=11+γ[I+R(1+γγ(λγL+τ)RTR)1RT]\mathbf{Q} = \frac{1}{\mathbf{1} + \gamma} \Big[\mathbf{I} + \mathbf{R} (\frac{1 + \gamma}{\gamma}(\frac{\lambda}{\gamma} \mathbf{L} + \tau) - \mathbf{R}^T \mathbf{R})^{-1} \mathbf{R}^T\Big]
 11.\ 11. C=XQXT\mathbf{C} = \mathbf{X Q X}^T
 12.\ 12. Perform eigen-decomposition on C\mathbf{C} such that C=UUT\mathbf{C} = \mathbf{U} \wedge \mathbf{U}^T and diag()diag(\wedge) is sorted in a descending order
 13.\ 13. W=U(:,1:d)\mathbf{W} = \mathbf{U}(:, 1:d)
 14.\ 14. Z=WTXQ\mathbf{Z} = \mathbf{W}^T \mathbf{X Q}
 15.\ 15. Y=ZR(λγL+τ)1\mathbf{Y} = \mathbf{Z R} (\frac{\lambda}{\gamma} \mathbf{L} + \tau)^{-1}
16.16. Until\mathbf{Until} Convergence

Implementation of DDRTree algorithm

We implemented the algorithm mostly in Rcpp for the purpose of efficiency. It also has extensive optimization for sparse input data. This implementation is originally based on the matlab code provided from the author of DDRTree paper.

Examples

data('iris')
subset_iris_mat <- as.matrix(t(iris[c(1, 2, 52, 103), 1:4])) #subset the data
#run DDRTree with ncenters equal to species number
DDRTree_res <- DDRTree(subset_iris_mat, dimensions = 2, maxIter = 5, sigma = 1e-2,
lambda = 1, ncenter = 3, param.gamma = 10, tol = 1e-2, verbose = FALSE)
Z <- DDRTree_res$Z #obatain matrix
Y <- DDRTree_res$Y
stree <- DDRTree_res$stree
plot(Z[1, ], Z[2, ], col = iris[c(1, 2, 52, 103), 'Species']) #reduced dimension
legend("center", legend = unique(iris[c(1, 2, 52, 103), 'Species']), cex=0.8,
col=unique(iris[c(1, 2, 52, 103), 'Species']), pch = 1) #legend
title(main="DDRTree reduced dimension", col.main="red", font.main=4)
dev.off()
plot(Y[1, ], Y[2, ], col = 'blue', pch = 17) #center of the Z
title(main="DDRTree smooth principal curves", col.main="red", font.main=4)

#run DDRTree with ncenters equal to species number
DDRTree_res <- DDRTree(subset_iris_mat, dimensions = 2, maxIter = 5, sigma = 1e-3,
lambda = 1, ncenter = NULL,param.gamma = 10, tol = 1e-2, verbose = FALSE)
Z <- DDRTree_res$Z #obatain matrix
Y <- DDRTree_res$Y
stree <- DDRTree_res$stree
plot(Z[1, ], Z[2, ], col = iris[c(1, 2, 52, 103), 'Species']) #reduced dimension
legend("center", legend = unique(iris[c(1, 2, 52, 103), 'Species']), cex=0.8,
col=unique(iris[c(1, 2, 52, 103), 'Species']), pch = 1) #legend
title(main="DDRTree reduced dimension", col.main="red", font.main=4)
dev.off()
plot(Y[1, ], Y[2, ], col = 'blue', pch = 2) #center of the Z
title(main="DDRTree smooth principal graphs", col.main="red", font.main=4)

Get the top L eigenvalues

Description

Get the top L eigenvalues

Usage

get_major_eigenvalue(C, L)

Arguments

C

data matrix used for eigendecomposition

L

number for the top eigenvalues


Compute the PCA projection

Description

Compute the PCA projection

Usage

pca_projection_R(C, L)

Arguments

C

data matrix used for PCA projection

L

number for the top principal components


calculate the square distance between a, b

Description

calculate the square distance between a, b

Usage

sqdist_R(a, b)

Arguments

a

a matrix with D×ND \times N dimension

b

a matrix with D×ND \times N dimension

Value

a numeric value for the different between a and b