Title: | Sparse Independent Component Analysis |
---|---|
Description: | Provides an implementation of the Sparse ICA method in Wang et al. (2024) <doi:10.1080/01621459.2024.2370593> for estimating sparse independent source components of cortical surface functional MRI data, by addressing a non-smooth, non-convex optimization problem through the relax-and-split framework. This method effectively balances statistical independence and sparsity while maintaining computational efficiency. |
Authors: | Zihang Wang [aut, cre] , Irina Gaynanova [aut] , Aleksandr Aravkin [aut] , Benjamin Risk [aut] |
Maintainer: | Zihang Wang <[email protected]> |
License: | GPL-3 |
Version: | 0.1.4 |
Built: | 2025-01-29 20:33:15 UTC |
Source: | CRAN |
This function uses a BIC-like criterion to select the optimal tuning parameter nu
for Sparse ICA.
BIC_sparseICA( xData, n.comp, nu_list = seq(0.1, 4, 0.1), whiten = c("eigenvec", "sqrtprec", "none"), lngca = FALSE, orth.method = c("svd", "givens"), method = c("C", "R"), use_irlba = TRUE, eps = 1e-06, maxit = 500, verbose = FALSE, col.stand = TRUE, row.stand = FALSE, iter.stand = 0, BIC_plot = FALSE )
BIC_sparseICA( xData, n.comp, nu_list = seq(0.1, 4, 0.1), whiten = c("eigenvec", "sqrtprec", "none"), lngca = FALSE, orth.method = c("svd", "givens"), method = c("C", "R"), use_irlba = TRUE, eps = 1e-06, maxit = 500, verbose = FALSE, col.stand = TRUE, row.stand = FALSE, iter.stand = 0, BIC_plot = FALSE )
xData |
A numeric matrix of input data with dimensions P x T, where P is the number of features and T is the number of samples. |
n.comp |
An integer specifying the number of components to estimate. |
nu_list |
A numeric vector specifying the list of candidate tuning parameters. Default is |
whiten |
A character string specifying the method for whitening the input |
lngca |
A logical value indicating whether to perform Linear Non-Gaussian Component Analysis (LNGCA). Default is |
orth.method |
A character string specifying the method for generating initial values of the U matrix. Default is |
method |
A character string specifying the computation method. If |
use_irlba |
A logical value indicating whether to use the |
eps |
A numeric value specifying the convergence threshold. Default is |
maxit |
An integer specifying the maximum number of iterations for the Sparse ICA method using Laplace density. Default is 500. |
verbose |
A logical value indicating whether to print convergence information during execution. Default is |
col.stand |
A logical value indicating whether to standardize columns. For each column, the mean of the entries in the column equals 0, and the variance of the entries in the column equals 1. Default is |
row.stand |
A logical value indicating whether to standardize rows. For each row, the mean of the entries in the row equals 0, and the variance of the entries in the row equals 1. Default is |
iter.stand |
An integer specifying the number of iterations for achieving both row and column standardization when |
BIC_plot |
A logical value indicating whether to generate a plot showing the trace of BIC values for different |
A list containing the following elements:
BIC
A numeric vector of BIC values corresponding to each candidate nu
in nu_list
.
nu_list
A numeric vector of candidate tuning parameter values.
best_nu
The optimal nu
selected based on the BIC-like criterion.
#get simulated data data(example_sim123) select_sparseICA = BIC_sparseICA(xData = example_sim123$xmat, n.comp = 3, method="C", BIC_plot = TRUE,verbose = TRUE, nu_list = seq(0.1,4,0.1)) (my_nu = select_sparseICA$best_nu)
#get simulated data data(example_sim123) select_sparseICA = BIC_sparseICA(xData = example_sim123$xmat, n.comp = 3, method="C", BIC_plot = TRUE,verbose = TRUE, nu_list = seq(0.1,4,0.1)) (my_nu = select_sparseICA$best_nu)
This function scans a BIDS-formatted directory for subject-specific fMRI files that match a specified pattern and returns a list of these files for use in group ICA analysis.
create_group_list(bids_path, pattern = "task-rest.*\\.dtseries\\.nii$")
create_group_list(bids_path, pattern = "task-rest.*\\.dtseries\\.nii$")
bids_path |
A character string specifying the path to the root directory of the BIDS-formatted dataset. This directory should contain subject folders (e.g., |
pattern |
A character string specifying the pattern to match fMRI files. The default is |
A named list where each element corresponds to a subject directory and contains a vector of matched fMRI file paths. The names of the list are the subject IDs.
# Example usage: # Assuming `bids_dir` is the path to a BIDS dataset: # group_list <- create_group_list(bids_path = bids_dir, pattern = "task-rest.*\.dtseries\.nii$") # Print the structure of the list: # str(group_list)
# Example usage: # Assuming `bids_dir` is the path to a BIDS dataset: # group_list <- create_group_list(bids_path = bids_dir, pattern = "task-rest.*\.dtseries\.nii$") # Print the structure of the list: # str(group_list)
Estimate mixing matrix from estimates of components
est.M.ols(sData, xData, intercept = TRUE)
est.M.ols(sData, xData, intercept = TRUE)
sData |
S Dimension: P x Q |
xData |
X Dimension: P x T |
intercept |
default = TRUE |
a mixing matrix M, dimension Q x T.
A simple dataset for demonstration purposes.
example_sim123
example_sim123
A list containing 3 data frames:
A 1089 x 3 numeric matrix of the true source signals. Each column is an 33 x 33 image.
A 3 x 50 numeric mixing matrix of the true time series. Each row is a time series of corresponding column in smat
.
A 1089 x 50 numeric matrix of the simulated data. Each column is the simulated mixed signal at a time point.
data(example_sim123) str(example_sim123)
data(example_sim123) str(example_sim123)
This function computes subject-level principal components (PCs) from fMRI data and performs a group-level PCA for dimension reduction, designed for cortical surface fMRI data in BIDS format.
gen_groupPC( bids_path, subj_list, n.comp = 30, ncore = 1, npc = 85, iter_std = 5, brainstructures = c("left", "right"), verbose = TRUE )
gen_groupPC( bids_path, subj_list, n.comp = 30, ncore = 1, npc = 85, iter_std = 5, brainstructures = c("left", "right"), verbose = TRUE )
bids_path |
A character string specifying the root directory of the BIDS-formatted dataset. |
subj_list |
A named list generated from |
n.comp |
An integer specifying the number of components to retain during group-level PCA. Default is 30. |
ncore |
An integer specifying the number of cores to use for parallel processing. Default is 1. |
npc |
An integer specifying the number of components to retain during subject-level PCA. Default is 85. |
iter_std |
An integer specifying the number of iterative standardization steps to apply to fMRI data. Default is 5. |
brainstructures |
A character vector specifying the brain structures to include in the analysis. Options are |
verbose |
A logical value indicating whether to print convergence information during execution. Default is |
NOTE: This function requires the ciftiTools
package to be installed, and set up the path to the Connectome Workbench folder by ciftiTools.setOption()
. See the package ciftiTools
documentation for more information.
A numeric matrix containing the group-level principal components, with dimensions determined by the number of retained components (n.comp
) and the concatenated data across all subjects.
Function for generating random starting points
gen.inits(p, d, runs, orth.method = c("svd", "givens"))
gen.inits(p, d, runs, orth.method = c("svd", "givens"))
p |
The number of rows. |
d |
The number of columns. |
runs |
The number of random starts. |
orth.method |
The method used for generating initial values of U matrix. The default is "svd". |
A list of random initialization of matrices.
For a given angle theta, returns a d x d Givens rotation matrix
givens.rotation(theta = 0, d = 2, which = c(1, 2))
givens.rotation(theta = 0, d = 2, which = c(1, 2))
theta |
The value of theta. |
d |
The value of d. |
which |
The value of which. |
This function performs Sparse ICA on group-level fMRI data. It processes BIDS-formatted fMRI datasets, performs PCA to reduce dimensionality, selects a tuning parameter nu
(optionally using a BIC-like criterion), and executes Sparse ICA to estimate independent components.
group_sparseICA( bids_path, subj_list = NULL, nu = "BIC", n.comp = 30, method = "C", ncore = 1, npc = 85, iter_std = 5, brainstructures = c("left", "right"), restarts = 40, positive_skewness = TRUE, use_irlba = TRUE, eps = 1e-06, maxit = 500, BIC_plot = TRUE, nu_list = seq(0.1, 4, 0.05), verbose = TRUE, BIC_verbose = FALSE, converge_plot = FALSE )
group_sparseICA( bids_path, subj_list = NULL, nu = "BIC", n.comp = 30, method = "C", ncore = 1, npc = 85, iter_std = 5, brainstructures = c("left", "right"), restarts = 40, positive_skewness = TRUE, use_irlba = TRUE, eps = 1e-06, maxit = 500, BIC_plot = TRUE, nu_list = seq(0.1, 4, 0.05), verbose = TRUE, BIC_verbose = FALSE, converge_plot = FALSE )
bids_path |
A character string specifying the root directory of the BIDS-formatted dataset. |
subj_list |
A named list where each element corresponds to a subject and contains vectors of fMRI file names. If |
nu |
A numeric value for the tuning parameter, or |
n.comp |
An integer specifying the number of components to estimate. Default is 30. |
method |
A character string specifying the computation method for Sparse ICA. Options are |
ncore |
An integer specifying the number of cores to use for parallel processing. Default is 1. |
npc |
An integer specifying the number of components to retain during subject-level PCA. Default is 85. |
iter_std |
An integer specifying the number of iterative standardization steps to apply to fMRI data. Default is 5. |
brainstructures |
A character vector specifying the brain structures to include in the analysis. Options are |
restarts |
An integer specifying the number of random initializations for Sparse ICA. Default is 40. |
positive_skewness |
A logical value indicating whether to enforce positive skewness on the estimated components. Default is |
use_irlba |
A logical value indicating whether to use the |
eps |
A numeric value specifying the convergence threshold. Default is |
maxit |
An integer specifying the maximum number of iterations for Sparse ICA. Default is 500. |
BIC_plot |
A logical value indicating whether to generate a plot of BIC values for different |
nu_list |
A numeric vector specifying candidate values for |
verbose |
A logical value indicating whether to print progress messages. Default is |
BIC_verbose |
A logical value indicating whether to print detailed messages during the BIC-based selection of |
converge_plot |
A logical value indicating whether to generate a plot showing the convergence trace during Sparse ICA. Default is |
The function operates in four main steps:
If subj_list
is not provided, it creates a list of subject-specific fMRI files using create_group_list
.
Performs subject-level PCA using gen_groupPC
to reduce data dimensionality.
Selects the tuning parameter nu
using a BIC-like criterion (if nu = "BIC"
) or uses the provided nu
.
Executes Sparse ICA on the group-level PCs to estimate independent components.
A list containing the results of the group Sparse ICA analysis, including:
loglik
The minimal log-likelihood value among the random initializations.
estS
A numeric matrix of estimated sparse independent components with dimensions P x Q.
estU
The estimated U matrix with dimensions Q x Q.
whitener
The whitener matrix used for data whitening.
converge
The trace of convergence for the U matrix.
best_nu
The selected nu
value (if nu = "BIC"
).
BIC
A numeric vector of BIC values for each nu
candidate (if nu = "BIC"
).
nu_list
The list of nu
candidates used in the BIC-based selection (if nu = "BIC"
).
create_group_list
, gen_groupPC
, BIC_sparseICA
, sparseICA
Match ICA results based on L2 distances and Hungarian
matchICA(S, template, M = NULL)
matchICA(S, template, M = NULL)
S |
loading variable matrix |
template |
template for match |
M |
subject score matrix |
the match result
This function performs Sparse Independent Component Analysis (Sparse ICA), implemented in both pure R and RCpp for efficiency.
relax_and_split_ICA( xData, n.comp, nu = 1, U.list = NULL, whiten = c("eigenvec", "sqrtprec", "none"), lngca = FALSE, orth.method = c("svd", "givens"), method = c("C", "R"), restarts = 40, use_irlba = TRUE, eps = 1e-06, maxit = 500, verbose = FALSE, converge_plot = FALSE, col.stand = TRUE, row.stand = FALSE, iter.stand = 5, positive_skewness = TRUE )
relax_and_split_ICA( xData, n.comp, nu = 1, U.list = NULL, whiten = c("eigenvec", "sqrtprec", "none"), lngca = FALSE, orth.method = c("svd", "givens"), method = c("C", "R"), restarts = 40, use_irlba = TRUE, eps = 1e-06, maxit = 500, verbose = FALSE, converge_plot = FALSE, col.stand = TRUE, row.stand = FALSE, iter.stand = 5, positive_skewness = TRUE )
xData |
A numeric matrix of input data with dimensions P x T, where P is the number of features and T is the number of samples. |
n.comp |
An integer specifying the number of components to estimate. |
nu |
A numeric tuning parameter controlling the balance between accuracy and sparsity of the results. It can be selected using a BIC-like criterion or based on expert knowledge. Default is 1. |
U.list |
An optional matrix specifying the initialization of the U matrix. Default is |
whiten |
A character string specifying the method for whitening the input |
lngca |
A logical value indicating whether to perform Linear Non-Gaussian Component Analysis (LNGCA). Default is |
orth.method |
A character string specifying the method used for generating initial values for the U matrix. Default is |
method |
A character string specifying the computation method. If |
restarts |
An integer specifying the number of random initializations for optimization. Default is 40. |
use_irlba |
A logical value indicating whether to use the |
eps |
A numeric value specifying the convergence threshold. Default is |
maxit |
An integer specifying the maximum number of iterations for the Sparse ICA method using Laplace density. Default is 500. |
verbose |
A logical value indicating whether to print convergence information during execution. Default is |
converge_plot |
A logical value indicating whether to generate a line plot showing the convergence trace. Default is |
col.stand |
A logical value indicating whether to standardize columns. For each column, the mean of the entries in the column equals 0, and the variance of the entries in the column equals 1. Default is |
row.stand |
A logical value indicating whether to standardize rows. For each row, the mean of the entries in the row equals 0, and the variance of the entries in the row equals 1. Default is |
iter.stand |
An integer specifying the number of iterations for achieving both row and column standardization when |
positive_skewness |
A logical value indicating whether to enforce positive skewness on the estimated components. Default is |
A list containing the following elements:
loglik
The minimal log-likelihood value among the random initializations.
estS
A numeric matrix of estimated sparse independent components with dimensions P x Q.
estU
The estimated U matrix with dimensions Q x Q.
estM
The estimated mixing matrix with dimensions Q x T.
whitener
The whitener matrix used for data whitening.
converge
Convergence information for the U matrix.
Change the sign of S and M matrices to positive skewness.
signchange(S, M = NULL)
signchange(S, M = NULL)
S |
The S matrix with dimension P x Q. |
M |
The M matrix with dimension Q x T. |
A list of S and M matrices with positive skewness.
Soft-threshold function
soft_thresh_R(x, nu = 1, lambda = sqrt(2)/2)
soft_thresh_R(x, nu = 1, lambda = sqrt(2)/2)
x |
The input scalar. |
nu |
The tuning parameter. |
lambda |
The lambda parameter of the Laplace density. |
This function performs Sparse Independent Component Analysis (Sparse ICA), implemented in both pure R and RCpp for efficiency.
sparseICA( xData, n.comp, nu = "BIC", nu_list = seq(0.1, 4, 0.1), U.list = NULL, whiten = c("eigenvec", "sqrtprec", "none"), lngca = FALSE, orth.method = c("svd", "givens"), method = c("C", "R"), restarts = 40, use_irlba = TRUE, eps = 1e-06, maxit = 500, verbose = TRUE, BIC_verbose = FALSE, converge_plot = FALSE, col.stand = TRUE, row.stand = FALSE, iter.stand = 5, positive_skewness = TRUE )
sparseICA( xData, n.comp, nu = "BIC", nu_list = seq(0.1, 4, 0.1), U.list = NULL, whiten = c("eigenvec", "sqrtprec", "none"), lngca = FALSE, orth.method = c("svd", "givens"), method = c("C", "R"), restarts = 40, use_irlba = TRUE, eps = 1e-06, maxit = 500, verbose = TRUE, BIC_verbose = FALSE, converge_plot = FALSE, col.stand = TRUE, row.stand = FALSE, iter.stand = 5, positive_skewness = TRUE )
xData |
A numeric matrix of input data with dimensions P x T, where P is the number of features and T is the number of samples. |
n.comp |
An integer specifying the number of components to estimate. |
nu |
A positive numeric value or a character "BIC" specifying the tuning parameter controlling the balance between accuracy and sparsity of the results. It can be selected using a BIC-like criterion ( |
nu_list |
A numeric vector specifying the list of candidate tuning parameters. Default is |
U.list |
An optional matrix specifying the initialization of the U matrix. Default is |
whiten |
A character string specifying the method for whitening the input |
lngca |
A logical value indicating whether to perform Linear Non-Gaussian Component Analysis (LNGCA). Default is |
orth.method |
A character string specifying the method used for generating initial values for the U matrix. Default is |
method |
A character string specifying the computation method. If |
restarts |
An integer specifying the number of random initializations for optimization. Default is 40. |
use_irlba |
A logical value indicating whether to use the |
eps |
A numeric value specifying the convergence threshold. Default is |
maxit |
An integer specifying the maximum number of iterations for the Sparse ICA method using Laplace density. Default is 500. |
verbose |
A logical value indicating whether to print convergence information during execution. Default is |
BIC_verbose |
A logical value indicating whether to print BIC selection information. Default is |
converge_plot |
A logical value indicating whether to generate a line plot showing the convergence trace. Default is |
col.stand |
A logical value indicating whether to standardize columns. For each column, the mean of the entries in the column equals 0, and the variance of the entries in the column equals 1. Default is |
row.stand |
A logical value indicating whether to standardize rows. For each row, the mean of the entries in the row equals 0, and the variance of the entries in the row equals 1. Default is |
iter.stand |
An integer specifying the number of iterations for achieving both row and column standardization when |
positive_skewness |
A logical value indicating whether to enforce positive skewness on the estimated components. Default is |
A list containing the following elements:
loglik
The minimal log-likelihood value among the random initializations.
estS
A numeric matrix of estimated sparse independent components with dimensions P x Q.
estM
The estimated mixing matrix with dimensions Q x T.
estU
The estimated U matrix with dimensions Q x Q.
whitener
The whitener matrix used for data whitening.
converge
The trace of convergence for the U matrix.
BIC
A numeric vector of BIC values corresponding to each candidate nu
in nu_list
.
nu_list
A numeric vector of candidate tuning parameter values.
best_nu
The optimal nu
selected based on the BIC-like criterion.
#get simulated data data(example_sim123) my_sparseICA <- sparseICA(xData = example_sim123$xmat, n.comp = 3, nu = "BIC", method="C", restarts = 40, eps = 1e-6, maxit = 500, verbose=TRUE) res_matched <- matchICA(my_sparseICA$estS,example_sim123$smat) # Visualize the estimated components oldpar <- par()$mfrow par(mfrow=c(1,3)) image(matrix(res_matched[,1],33,33)) image(matrix(res_matched[,2],33,33)) image(matrix(res_matched[,3],33,33)) par(mfrow=oldpar)
#get simulated data data(example_sim123) my_sparseICA <- sparseICA(xData = example_sim123$xmat, n.comp = 3, nu = "BIC", method="C", restarts = 40, eps = 1e-6, maxit = 500, verbose=TRUE) res_matched <- matchICA(my_sparseICA$estS,example_sim123$smat) # Visualize the estimated components oldpar <- par()$mfrow par(mfrow=c(1,3)) image(matrix(res_matched[,1],33,33)) image(matrix(res_matched[,2],33,33)) image(matrix(res_matched[,3],33,33)) par(mfrow=oldpar)
Convert angle vector into orthodox matrix
theta2W(theta)
theta2W(theta)
theta |
A vector of angles theta. |
An orthodox matrix.
The function for perform whitening.
whitener(X, n.comp = ncol(X), center.row = FALSE, use_irlba = TRUE)
whitener(X, n.comp = ncol(X), center.row = FALSE, use_irlba = TRUE)
X |
The data matrix with dimension P x T. |
n.comp |
The number of components. |
center.row |
Whether to center the row of data. Default is FALSE. |
use_irlba |
Whether to use the irlba method to perform fast truncated singular value decomposition in whitening step, helpful for memorying intermediate dataset. Default is TRUE. |
A list including the whitener matrix, the whitened data matrix, and the mean of the input data.