Title: | A Functional Data Analysis Package for Spatial Single Cell Data |
---|---|
Description: | Methods and tools for deriving spatial summary functions from single-cell imaging data and performing functional data analyses. Functions can be applied to other single-cell technologies such as spatial transcriptomics. Functional regression and functional principal component analysis methods are in the 'refund' package <https://cran.r-project.org/package=refund>, while calculation of the spatial summary functions is from the 'spatstat' package <https://spatstat.org/>. |
Authors: | Julia Wrobel [aut], Alex Soupir [aut, cre] |
Maintainer: | Alex Soupir <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.2.2 |
Built: | 2024-12-07 06:36:09 UTC |
Source: | CRAN |
Summary functions are sometimes calculated by other methods or in other packages. In that case, the resulting data can be loaded into the mxFDA object.
add_summary_function(mxFDAobject, summary_function_data, metric)
mxFDAobject |
object of class |
summary_function_data |
data frame with |
metric |
character vector with either 'uni' or 'bi' and 'k', 'l', or 'g'; e.g. 'uni g' |
an updated mxFDA object with a derived value added. See make_mxfda() for more details.
Alex Soupir [email protected]
Internal function called by extract_summary_functions to calculate a bivariate spatial summary function for a single image.
bivariate( mximg, markvar, mark1, mark2, r_vec, func = c(Kcross, Lcross, Gcross, entropy), edge_correction, empirical_CSR = FALSE, permutations = 1000 )
mximg |
Dataframe of cell-level multiplex imaging data for a single image.
Should have variables |
markvar |
The name of the variable that denotes cell type(s) of interest. Character. |
mark1 |
Character string that denotes first cell type of interest. |
mark2 |
Character string that denotes second cell type of interest. |
r_vec |
Numeric vector of radii over which to evaluate spatial summary functions. Must begin at 0. |
func |
Spatial summary function to calculate. Options are c(Kcross, Lcross, Gcross) which denote Ripley's K, Besag's L, and nearest neighbor G function, respectively, or entropy from Vu et al, 2023. |
edge_correction |
Character string that denotes the edge correction method for spatial summary function. For Kcross and Lcross choose one of c("border", "isotropic", "Ripley", "translate", "none"). For Gcross choose one of c("rs", "km", "han") |
empirical_CSR |
logical to indicate whether to use the permutations to identify the sample-specific complete spatial randomness (CSR) estimation. |
permutations |
integer for the number of permutations to use if empirical_CSR is |
A data.frame
containing:
r |
the radius of values over which the spatial summary function is evaluated |
sumfun |
the values of the spatial summary function |
csr |
the values of the spatial summary function under complete spatial randomness |
fundiff |
sumfun - csr, positive values indicate clustering and negative values repulsion |
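The relationship between the four returned columns can be sketched in a few lines of base R. This is only an illustration: the observed values are made up, and the CSR column uses the theoretical form of Ripley's K in two dimensions, K(r) = pi * r^2, rather than anything computed by the package.

```r
# Toy illustration of the returned columns; the observed values are made up
r <- seq(0, 50, by = 10)                 # radii, starting at 0

# Hypothetical observed K values for one image (not real data)
sumfun <- c(0, 400, 1500, 3200, 5400, 8300)

# Theoretical Ripley's K under CSR in two dimensions: K(r) = pi * r^2
csr <- pi * r^2

# Deviation from CSR: positive suggests clustering, negative suggests repulsion
fundiff <- sumfun - csr

toy_result <- data.frame(r = r, sumfun = sumfun, csr = csr, fundiff = fundiff)
print(toy_result)
```

In this toy table the first radii give positive fundiff values, which would be read as clustering at short distances.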
Julia Wrobel [email protected]
Alex Soupir [email protected]
Xiao, L., Ruppert, D., Zipunnikov, V., and Crainiceanu, C. (2016). Fast covariance estimation for high-dimensional functional data. Statistics and Computing, 26, 409-421. DOI: 10.1007/s11222-014-9485-x.
Vu, T., Seal, S., Ghosh, T., Ahmadian, M., Wrobel, J., & Ghosh, D. (2023). FunSpace: A functional and spatial analytic approach to cell imaging data using entropy measures. PLOS Computational Biology, 19(9), e1011490.
Creed, J. H., Wilson, C. M., Soupir, A. C., Colin-Leitzinger, C. M., Kimmel, G. J., Ospina, O. E., Chakiryan, N. H., Markowitz, J., Peres, L. C., Coghill, A., & Fridley, B. L. (2021). spatialTIME and iTIME: R package and Shiny application for visualization and analysis of immunofluorescence data. Bioinformatics (Oxford, England), 37(23), 4584–4586. https://doi.org/10.1093/bioinformatics/btab757
Entropy
entropy(df, r_vec, markvar)
df |
data frame with x and y columns, along with a column for point marks |
r_vec |
vector whose length sets the number of breaks (values will be rescaled) and whose maximum sets the largest distance at which entropy is measured |
markvar |
The name of the variable that denotes cell type(s) of interest. Character. |
data frame with entropy calculated for length(r_vec) bins within 0 to max(r_vec)
Thao Vu [email protected]
Alex Soupir [email protected]
Vu, T., Seal, S., Ghosh, T., Ahmadian, M., Wrobel, J., & Ghosh, D. (2023). FunSpace: A functional and spatial analytic approach to cell imaging data using entropy measures. PLOS Computational Biology, 19(9), e1011490.
Altieri, L., Cocchi, D., & Roli, G. (2018). A new approach to spatial entropy measures. Environmental and ecological statistics, 25, 95-110.
extract_entropy() computes spatial entropy at each distance interval for all cell types of interest. The goal is to capture the diversity in cellular composition, such as similar proportions across cell types or dominance of a single type, within a specific distance range. Spatial patterns among cell types (clustered, independent, or regular) can also be identified. In this example, we look at the spatial heterogeneity across T cells, macrophages, and others. To focus on local cell-to-cell interactions, the default maximum of the distance range (i.e., rmax) is 400 microns, and the default number of distance breaks/intervals is 50. A sequence of distance breaks is then generated that decreases from rmax to 0 in steps that are linear on a log scale. At each distance range, partial spatial entropy and residual entropy are calculated as in Vu et al. (2023) and Altieri et al. (2018). These spatial entropy functions can then be used as input functions for FPCA.
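As a rough sketch of the two ideas in this description, the base R code below builds a decreasing sequence of breaks that is evenly spaced on the log scale and evaluates a plain Shannon entropy on cell-type proportions. The lower bound r_min and the helper shannon() are hypothetical names introduced for illustration; the package's own break construction and the partial and residual entropies of Altieri et al. (2018) are more involved than this.

```r
# Assumed values mirroring the documented defaults
rmax    <- 400
n_break <- 50

# log(0) is undefined, so this sketch stops at a small positive radius
# (r_min is a hypothetical choice, not a package parameter)
r_min <- 1

# Evenly spaced on the log scale, decreasing from rmax towards 0
breaks <- exp(seq(log(rmax), log(r_min), length.out = n_break))

# Plain Shannon entropy of cell-type proportions; this is NOT the partial or
# residual spatial entropy of Altieri et al. (2018), only a basic ingredient
shannon <- function(p) {
  p <- p[p > 0]
  -sum(p * log(p))
}

shannon(c(1/3, 1/3, 1/3))     # similar proportions: higher entropy
shannon(c(0.98, 0.01, 0.01))  # one dominant type: lower entropy
```

Because the breaks are log-spaced, most intervals fall at short distances, which matches the stated focus on local interactions.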
extract_entropy(mxFDAobject, markvar, marks, n_break = 50, rmax = 400)
mxFDAobject |
object of class |
markvar |
The name of the variable that denotes cell type(s) of interest. Character. |
marks |
Character vector that denotes cell types of interest. |
n_break |
Total number of distance ranges/intervals of interest made from 0 to |
rmax |
Max distance between pairs of cells |
object of class mxFDA with a dataframe in the multivariate_summaries slot
Function that extracts the FPCA object created either by run_fpca() or run_mfpca() from the mxFDA object
extract_fpca_object(mxFDAobject, what)
mxFDAobject |
object of class |
what |
what functional PCA data to extract, e.g. 'uni k' |
Output object can be visualized with refund.shiny::plot_shiny()
fpca object created with run_fpca() or run_mfpca()
Alex Soupir [email protected]
#load ovarian mxFDA object
data('ovarian_FDA')

#run the FPCA
ovarian_FDA = run_fpca(ovarian_FDA, metric = "uni g", r = "r", value = "fundiff",
                       lightweight = TRUE, pve = .99)

#extract the fpca object
obj = extract_fpca_object(ovarian_FDA, "uni g fpca")
Extract FPCA scores
extract_fpca_scores(mxFDAobject, what)
mxFDAobject |
object of class |
what |
what functional PCA data to extract, e.g. 'uni k' |
fpca object
Alex Soupir [email protected]
#load ovarian mxFDA object
data('ovarian_FDA')

#run the FPCA
ovarian_FDA = run_fpca(ovarian_FDA, metric = "uni g", r = "r", value = "fundiff",
                       analysis_vars = c("age", "survival_time"))

#extract uni fpc scores
fpc = extract_fpca_scores(ovarian_FDA, 'uni g fpca')
Currently only extracts functional cox models not mixed functional cox models.
extract_model(mxFDAobject, metric, type, model_name)
mxFDAobject |
object of class |
metric |
metric functional PCA data to extract, e.g. 'uni k' |
type |
one of "cox", "mcox", or "sofr" to specify the type of model to extract |
model_name |
character string of the model name to retrieve |
fit functional model
Alex Soupir [email protected]
#load ovarian mxFDA object
data('ovarian_FDA')

#run the lfcm model
ovarian_FDA = run_fcm(ovarian_FDA, model_name = "fit_lfcm",
                      formula = survival_time ~ age, event = "event",
                      metric = "uni g", r = "r", value = "fundiff",
                      analysis_vars = c("age", "survival_time"),
                      afcm = FALSE)

#extract model
mod = extract_model(ovarian_FDA, 'uni g', 'cox', 'fit_lfcm')
Summarise spatial data in mxFDA object
extract_spatial_summary(mxFDAobject, columns, grouping_columns = NULL)
mxFDAobject |
object of class |
columns |
character vector for column heading for cells to summarise |
grouping_columns |
character vector of other columns to use as grouping, such as region classification column |
This function is currently experimental because it only handles data that has text in the columns. Eventually, it will be able to handle other data inputs, such as those from HALO where cells are designated as positive (1) or negative (0) for a cell phenotype.
data frame with percent of total points per spatial sample for columns. If multiple levels are present in the columns columns, multiple output columns will be provided.
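The kind of summary described here (percent of cells per spatial sample, one output column per level) can be approximated in base R as follows. The data frame and its column names are hypothetical, and this is only a sketch of the idea, not the function's implementation.

```r
# Hypothetical cell-level table: one row per cell
cells <- data.frame(
  sample_id = c("s1", "s1", "s1", "s1", "s2", "s2"),
  phenotype = c("T cell", "T cell", "macrophage", "other", "T cell", "other")
)

# Count cells per sample and phenotype level, then convert each row to percents
counts   <- table(cells$sample_id, cells$phenotype)
percents <- prop.table(counts, margin = 1) * 100

# One row per sample, one column per level of the phenotype column
summary_df <- as.data.frame.matrix(percents)
print(summary_df)
```

Each row sums to 100, and a level that is absent from a sample simply appears as 0 for that sample.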
Alex Soupir [email protected]
#load data
data(lung_df)

#create data frames for `mxFDA` object
clinical = lung_df %>%
  dplyr::select(image_id, patient_id, patientImage_id, gender, age,
                survival_days, survival_status, stage) %>%
  dplyr::distinct()

#make small, just need to make sure it runs
spatial = lung_df %>%
  dplyr::select(-image_id, -gender, -age, -survival_days, -survival_status, -stage) %>%
  dplyr::filter(patientImage_id %in% clinical$patientImage_id[1:10])

#create `mxFDA` object
mxFDAobject = make_mxfda(metadata = clinical, spatial = spatial,
                         subject_key = "patient_id", sample_key = "patientImage_id")

#get markers
markers = colnames(mxFDAobject@Spatial) %>%
  grep("pheno", ., value = TRUE)

#extract summary
df = extract_spatial_summary(mxFDAobject, markers)
Function to extract spatial summary functions from the Spatial slot of an mxFDA object
extract_summary_functions( mxFDAobject, r_vec = seq(0, 100, by = 10), extract_func = c(univariate, bivariate), summary_func = c(Kest, Lest, Gest), markvar, mark1, mark2 = NULL, edge_correction, empirical_CSR = FALSE, permutations = 1000 )
mxFDAobject |
object of class |
r_vec |
Numeric vector of radii over which to evaluate spatial summary functions. Must begin at 0. |
extract_func |
Defaults to univariate, which calculates univariate spatial summary functions. Choose bivariate for bivariate spatial summary functions. |
summary_func |
Spatial summary function to calculate. Options are c(Kest, Lest, Gest) which denote Ripley's K, Besag's L, and nearest neighbor G function, respectively. |
markvar |
The name of the variable that denotes cell type(s) of interest. Character. |
mark1 |
Character string that denotes first cell type of interest. |
mark2 |
Character string that denotes second cell type of interest for calculating bivariate summary statistics. Not used when calculating univariate statistics. |
edge_correction |
Character string that denotes the edge correction method for spatial summary function. For Kest and Lest choose one of c("border", "isotropic", "Ripley", "translate", "none"). For Gest choose one of c("rs", "km", "han") |
empirical_CSR |
logical to indicate whether to use the permutations to identify the sample-specific complete spatial randomness (CSR) estimation. If there are not enough levels present in |
permutations |
integer for the number of permutations to use if empirical_CSR is |
Complete spatial randomness (CSR) is the value of a spatial summary function expected when the points or cells in a sample are randomly distributed, with no clustering or dispersion. Some samples have artifacts that change what CSR looks like for the observed distribution of points, such as large regions of missing points or, in tissue sections, necrotic regions where cells are dead. Theoretical CSR requires that points have an equal chance of occurring anywhere in the sample. These artifacts violate that assumption, so CSR needs to be estimated for each sample independently. Wilson et al. (2022) demonstrated cases in which sample-specific CSR, rather than theoretical CSR, was important for calculating how much the observed value deviates from the expected value.
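One way to picture a sample-specific CSR estimate is a label permutation: keep the cell locations fixed, shuffle which cells carry the mark, and average the statistic over the shuffles. The base R sketch below does this for a deliberately simple statistic (the number of ordered pairs of marked cells within a single radius, with no edge correction). The simulated data, the helper pairs_within(), and the choice of statistic are all assumptions for illustration and are not the package's implementation.

```r
set.seed(1)

# Hypothetical image: 200 cells, about 30% labelled "immune"
n     <- 200
cells <- data.frame(x = runif(n), y = runif(n),
                    immune = sample(c(TRUE, FALSE), n, replace = TRUE,
                                    prob = c(0.3, 0.7)))
d <- as.matrix(dist(cells[, c("x", "y")]))  # pairwise cell distances
r <- 0.1                                     # a single radius, for brevity

# Number of ordered pairs of marked cells within distance r (no edge correction)
pairs_within <- function(marked) {
  sub <- d[marked, marked, drop = FALSE]
  sum(sub <= r) - sum(marked)   # drop the zero self-distances on the diagonal
}

observed <- pairs_within(cells$immune)

# Shuffle labels over the fixed cell locations to get a sample-specific null
permutations <- 200
permuted      <- replicate(permutations, pairs_within(sample(cells$immune)))
empirical_csr <- mean(permuted)

# Deviation of the observed statistic from the permutation-based expectation
fundiff <- observed - empirical_csr
```

Because the shuffles reuse the observed locations, holes or necrotic regions in the sample affect the null in the same way they affect the observed value.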
an object of class mxFDA containing the corresponding spatial summary function slot filled. See make_mxfda() for object structure details.
Julia Wrobel [email protected]
Alex Soupir [email protected]
Xiao, L., Ruppert, D., Zipunnikov, V., and Crainiceanu, C. (2016). Fast covariance estimation for high-dimensional functional data. Statistics and Computing, 26, 409-421. DOI: 10.1007/s11222-014-9485-x.
Wilson, C., Soupir, A. C., Thapa, R., Creed, J., Nguyen, J., Segura, C. M., Gerke, T., Schildkraut, J. M., Peres, L. C., & Fridley, B. L. (2022). Tumor immune cell clustering and its association with survival in African American women with ovarian cancer. PLoS computational biology, 18(3), e1009900. https://doi.org/10.1371/journal.pcbi.1009900
Creed, J. H., Wilson, C. M., Soupir, A. C., Colin-Leitzinger, C. M., Kimmel, G. J., Ospina, O. E., Chakiryan, N. H., Markowitz, J., Peres, L. C., Coghill, A., & Fridley, B. L. (2021). spatialTIME and iTIME: R package and Shiny application for visualization and analysis of immunofluorescence data. Bioinformatics (Oxford, England), 37(23), 4584–4586. https://doi.org/10.1093/bioinformatics/btab757
#load ovarian FDA object
data('ovarian_FDA')

#run function
ovarian_FDA = extract_summary_functions(ovarian_FDA, r_vec = 0:100,
                                        extract_func = univariate,
                                        summary_func = Gest,
                                        markvar = "immune", mark1 = "immune",
                                        edge_correction = "rs")
Function that transforms functional models from linear or additive functional Cox models into afcmSurface or lfcmSurface objects to be plotted.
extract_surface( mxFDAobject, metric, model = NULL, r = "r", value = "fundiff", grid_length = 100, analysis_vars, p = 0.05, filter_cols = NULL )
mxFDAobject |
object of class |
metric |
spatial summary function to extract surface for |
model |
character string for the name of the model for |
r |
Character string, the name of the variable that identifies the function domain (usually a radius for spatial summary functions). Default is "r". |
value |
Character string, the name of the variable that identifies the spatial summary function values. Default is "fundiff". |
grid_length |
Length of grid on which to evaluate coefficient functions. |
analysis_vars |
Other variables used in modeling FCM fit. |
p |
numeric p-value used for predicting significant AFCM surface |
filter_cols |
a named vector of factors to filter summary functions to in |
a 4 element list of either class lfcmSurface or afcmSurface depending on the class of model
Surface |
|
Prediction |
|
Metric |
character of the spatial summary function used; helps keep track if running many models |
P-value |
a numeric value of the input p-value |
Julia Wrobel [email protected]
Alex Soupir [email protected]
#load ovarian mxFDA object
data('ovarian_FDA')

#run the lfcm model
ovarian_FDA = run_fcm(ovarian_FDA, model_name = "fit_lfcm",
                      formula = survival_time ~ age, event = "event",
                      metric = "uni g", r = "r", value = "fundiff",
                      analysis_vars = c("age", "survival_time"),
                      afcm = FALSE)

#extract surface
model_surface = extract_surface(ovarian_FDA, metric = 'uni g',
                                model = 'fit_lfcm',
                                analysis_vars = 'age') #variables in model
Function to filter the spatial data slot of the mxFDA object.
filter_spatial(mxFDAobject, ..., based_on = "meta", force = FALSE)
mxFDAobject |
object of class |
... |
expressions that return a logical TRUE/FALSE value when evaluated on columns of the meta data slot. These expressions get passed to |
based_on |
character for which data slot to use for filtering, either 'meta', or 'spatial'. Default to 'meta'. |
force |
logical whether or not to return empty spatial data if filtering results in 0 rows |
object of class mxFDA with the spatial slot filtered. See make_mxfda() for more details on the object
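Filtering the spatial data based on the metadata amounts to evaluating the condition on the sample-level table and then keeping the cell-level rows whose sample key survives. A minimal base R sketch of that idea follows; the two data frames and the column name sample_id are hypothetical stand-ins for the Metadata and Spatial slots, and the sketch is not the function's actual code.

```r
# Hypothetical stand-ins for the Metadata and Spatial slots
metadata <- data.frame(sample_id = c("s1", "s2", "s3"),
                       age       = c(45, 50, 62))
spatial  <- data.frame(sample_id = c("s1", "s1", "s2", "s3", "s3"),
                       x = c(1, 2, 3, 4, 5),
                       y = c(5, 4, 3, 2, 1))

# Evaluate the condition on the metadata, then keep matching spatial rows
keep_samples     <- metadata$sample_id[metadata$age >= 50]
spatial_filtered <- spatial[spatial$sample_id %in% keep_samples, ]
print(spatial_filtered)
```

Here the single cell-level table loses the two rows from sample s1, whose patient is younger than 50.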
Alex Soupir [email protected]
#load ovarian mxFDA object
data(ovarian_FDA)

#keep samples from patients aged 50 or older
ovarian_FDA_age50 = filter_spatial(ovarian_FDA, age >= 50, based_on = 'meta')
This data is adapted from the VectraPolarisData Bioconductor package. There are multiple ROIs for each patient. Data was filtered to include only the cells in the tumor compartment.
lung_df
A data frame with 879,694 rows and 19 columns:
Image id for a given patient
Unique patient id
Patient age at time of cancer diagnosis
Survival time from diagnosis, in days
Censoring variable, 1 = death, 0 = censor
Cell x position
Cell y position
...
https://bioconductor.org/packages/release/data/experiment/html/VectraPolarisData.html
This data is adapted from the VectraPolarisData Bioconductor package. There are multiple ROIs for each patient.
lung_FDA
An mxFDA object with augmented non-small cell lung cancer multiplex immunofluorescence data, and NN G(r) calculated:
information about the spatial samples with column sample_key
column in both
cell-level information with x
and y
columns along with sample_key
to link to Metadata
column in Metadata
that may have multiple sample_key
values for each, akin to patient IDs
column in both Metadata
and Spatial
that is a 1:1 with the samples (unique per sample)
univariate summary slot with nearest neighbor G calculated
empty slot available for bivariate summaries
empty slot available for multivariate summaries
empty slot for functional PCA data of summaries
empty slot for functional models
Spatial summary functions of lung cancer multiplex imaging data.
This data is adapted from the VectraPolarisData Bioconductor package. Signal between the survival outcome and spatial summary functions has been augmented for teaching purposes. Spatial relationship is summarized using the nearest neighbor G function.
Includes only spatial samples that had 10 or more radii with calculable G function
https://bioconductor.org/packages/release/data/experiment/html/VectraPolarisData.html
Used to create an object of class mxFDA that can be used with the mxfda package for functional data analysis.
make_mxfda(metadata, spatial = NULL, subject_key, sample_key)
metadata |
metadata with columns |
spatial |
spatial information, either list or df, with column |
subject_key |
column name in |
sample_key |
column linking |
S4 object of class mxFDA
Metadata |
slot of class |
Spatial |
slot of class |
subject_key |
slot of class |
sample_key |
slot of class |
univariate_summaries |
slot of class |
bivariate_summaries |
slot of class |
multivariate_summaries |
slot of class |
functional_pca |
slot of class |
functional_mpca |
slot of class |
functional_cox |
slot of class |
functional_mcox |
slot of class |
scalar_on_function |
slot of class |
Alex Soupir [email protected]
#select sample metadata
clinical = lung_df %>%
  dplyr::select(image_id, patient_id, patientImage_id, gender, age,
                survival_days, survival_status, stage) %>%
  dplyr::distinct()

#select the spatial information
spatial = lung_df %>%
  dplyr::select(-image_id, -gender, -age, -survival_days, -survival_status, -stage)
sample_id_column = "patientImage_id"

#create the mxFDA object
mxFDAobject = make_mxfda(metadata = clinical, spatial = spatial,
                         subject_key = "patient_id", sample_key = sample_id_column)
This data is adapted from the VectraPolarisData Bioconductor package and comes from a tumor microarray of tissue samples from 128 patients with ovarian cancer. There is one sample per subject.
ovarian_FDA
An mxFDA object with augmented ovarian cancer multiplex immunofluorescence data, and NN G(r) calculated:
information about the spatial samples with column sample_key
column in both
cell-level information with x
and y
columns along with sample_key
to link to Metadata
column in Metadata
that may have multiple sample_key
values for each, akin to patient IDs
column in both Metadata
and Spatial
that is a 1:1 with the samples (unique per sample)
univariate summary slot with nearest neighbor G calculated
empty slot available for bivariate summaries
empty slot available for multivariate summaries
empty slot for functional PCA data of summaries
empty slot for functional models
Spatial summary functions of ovarian cancer multiplex imaging data.
This data is adapted from the VectraPolarisData Bioconductor package. Signal between the survival outcome and spatial summary functions has been augmented for teaching purposes. Spatial relationship is summarized using the nearest neighbor G function.
https://bioconductor.org/packages/release/data/experiment/html/VectraPolarisData.html
Produces a ggplot with mean plus or minus two standard deviations of a selected FPC.
plot_fpc(obj, pc_choice)
obj |
fpca object to be plotted. |
pc_choice |
FPC to be plotted. |
object of class ggplot
Julia Wrobel [email protected]
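The curves that plot_fpc describes (the mean plus or minus two standard deviations of a selected FPC) can be sketched numerically in base R. The toy object below borrows the mu, efunctions, and evalues element names used by refund's fpca objects, but all values are made up, and the formula (mean plus or minus 2 * sqrt(eigenvalue) * eigenfunction) is an assumption about how such a band is usually built, not a statement of the function's internals.

```r
# Toy stand-in for an fpca object; element names follow refund's
# mu / efunctions / evalues convention, all values are made up
grid <- seq(0, 1, length.out = 11)
obj  <- list(
  mu         = grid^2,
  efunctions = cbind(rep(1, 11), sqrt(2) * cos(pi * grid)),
  evalues    = c(0.25, 0.04)
)

# Shift the mean by two standard deviations along the chosen component
pc_choice <- 1
shift <- 2 * sqrt(obj$evalues[pc_choice]) * obj$efunctions[, pc_choice]

bands <- data.frame(r     = grid,
                    mean  = obj$mu,
                    plus  = obj$mu + shift,
                    minus = obj$mu - shift)
print(bands)
```

With a constant first eigenfunction, as in this toy object, the plus and minus curves are vertical shifts of the mean, which is the simplest pattern such a plot can show.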
Produces a ggplot with mean plus or minus two standard deviations of a selected FPC.
plot_mfpc(obj, pc_choice_level1, pc_choice_level2)
obj |
fpca object to be plotted. |
pc_choice_level1, pc_choice_level2 |
FPC to be plotted. |
list of objects of class ggplot
Julia Wrobel [email protected]
Plot afcm object
## S3 method for class 'afcmSurface' plot(x, ...)
x |
object of class |
... |
currently ignored |
object compatible with ggplot2
Julia Wrobel [email protected]
Alex Soupir [email protected]
Plot lfcm surface
## S3 method for class 'lfcmSurface' plot(x, ...)
x |
object of class |
... |
currently ignored |
object compatible with ggplot2
Julia Wrobel [email protected]
Alex Soupir [email protected]
Plot mxFDA object
## S3 method for class 'mxFDA' plot(x, filter_cols = NULL, ...)
x |
object of class |
filter_cols |
column key to filter |
... |
additional parameters including |
If there are multiple metrics that are included in the derived table, an extra parameter filter_cols in the format of c(Derived_Column = "Level_to_Filter") will return curves from the Derived_Column with the level Level_to_Filter.
When plotting mFPCA objects, additional arguments level1 and level2 help indicate which FPCA from level 1 and level 2 to plot.
object of class ggplot compatible with the ggplot2 aesthetics
Alex Soupir [email protected]
#set seed
set.seed(333)

#plotting summary
data("ovarian_FDA")
plot(ovarian_FDA, y = 'fundiff', what = 'uni g')

#running fpca
ovarian_FDA = run_fpca(ovarian_FDA, metric = "uni g", r = "r", value = "fundiff",
                       lightweight = TRUE, pve = .99)

#plot fpca
plot(ovarian_FDA, what = 'uni g fpca', pc_choice = 1)
Plot sofr object
## S3 method for class 'sofr' plot(x, ...)
x |
object of class |
... |
currently ignored |
object compatible with ggplot2
Julia Wrobel [email protected]
Alex Soupir [email protected]
Fit a functional Cox regression model.
run_fcm( mxFDAobject, model_name, formula, event = "event", metric = "uni k", r = "r", value = "fundiff", afcm = FALSE, smooth = FALSE, filter_cols = NULL, ..., knots = NULL )
mxFDAobject |
Dataframe of spatial summary functions from multiplex imaging data, in long format. Can be estimated using the function |
model_name |
character string to give the fit model in the functional cox slot |
formula |
Formula to be fed to mgcv in the form of survival_time ~ x1 + x2. Does not contain functional predictor. Character valued. Data must contain censoring variable called "event". |
event |
character string for the column in Metadata that contains 1/0 for the survival event |
metric |
name of calculated spatial metric to use |
r |
Character string, the name of the variable that identifies the function domain (usually a radius for spatial summary functions). Default is "r". |
value |
Character string, the name of the variable that identifies the spatial summary function values. Default is "fundiff". |
afcm |
If TRUE, runs additive functional Cox model. If FALSE, runs linear functional cox model. Defaults to linear functional cox model. |
smooth |
Option to smooth data using FPCA. Defaults to FALSE. |
filter_cols |
a named vector of factors to filter summary functions to in |
... |
Optional other arguments to be passed to |
knots |
Number of knots for defining spline basis. |
A list which is a linear or additive functional Cox model fit. See mgcv::gam for more details.
Julia Wrobel [email protected]
Alex Soupir [email protected]
#load ovarian mxFDA object
data('ovarian_FDA')

#run the lfcm model
ovarian_FDA = run_fcm(ovarian_FDA, model_name = "fit_lfcm",
                      formula = survival_time ~ age, event = "event",
                      metric = "uni g", r = "r", value = "fundiff",
                      afcm = FALSE)
This is a wrapper for the function fpca.face from the refund package.
run_fpca( mxFDAobject, metric = "uni k", r = "r", value = "fundiff", knots = NULL, analysis_vars = NULL, lightweight = FALSE, filter_cols = NULL, ... )
mxFDAobject |
object of class |
metric |
name of calculated spatial metric to use |
r |
Character string, the name of the variable that identifies the function domain (usually a radius for spatial summary functions). Default is "r". |
value |
Character string, the name of the variable that identifies the spatial summary function values. Default is "fundiff". |
knots |
Number of knots for defining spline basis. Defaults to the number of measurements per function divided by 2. |
analysis_vars |
Optional list of variables to be retained for downstream analysis. |
lightweight |
Default is FALSE. If TRUE, removes Y and Yhat from returned FPCA object. A good option to select for large datasets. |
filter_cols |
a named vector of factors to filter summary functions to in |
... |
Optional other arguments to be passed to |
The filter_cols parameter is useful when the summary function was input by the user using add_summary_function() and multiple marks were assessed, for example a column called "Markers" containing tumor infiltrating lymphocytes as well as cytotoxic T cells. This parameter allows for filtering down to include only one or the other.
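The effect of a named filter_cols vector can be pictured with a small base R sketch: each name is a column and each value is the level to keep. The data frame, the marker labels, and the loop below are hypothetical and only illustrate the filtering idea described above, not how the package applies it internally.

```r
# Hypothetical user-supplied summary functions with a "Markers" column
sumfuns <- data.frame(
  sample_id = rep(c("s1", "s2"), each = 4),
  Markers   = rep(c("TIL", "TIL", "Cytotoxic T", "Cytotoxic T"), times = 2),
  r         = rep(c(0, 10), times = 4),
  fundiff   = c(0, 1.2, 0, 0.4, 0, 0.9, 0, 0.1)
)

# Named vector: name = column to filter on, value = level to keep
filter_cols <- c(Markers = "TIL")

# Keep only the rows that match every name/value pair
keep <- rep(TRUE, nrow(sumfuns))
for (col in names(filter_cols)) {
  keep <- keep & sumfuns[[col]] == filter_cols[[col]]
}
sumfuns_til <- sumfuns[keep, ]
print(sumfuns_til)
```

After filtering, each sample contributes a single curve, which is the form FPCA needs as input.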
An mxFDA object with the functional_pca slot filled for the respective spatial summary function containing:
mxfundata |
The original dataframe of spatial summary functions, with scores from FPCA appended for downstream modeling |
fpc_object |
A list of class "fpca" with elements described in the documentation for |
Julia Wrobel [email protected]
Alex Soupir [email protected]
Xiao, L., Ruppert, D., Zipunnikov, V., and Crainiceanu, C. (2016). Fast covariance estimation for high-dimensional functional data. Statistics and Computing, 26, 409-421. DOI: 10.1007/s11222-014-9485-x.
#load ovarian mxFDA object
data('ovarian_FDA')

#run the FPCA
ovarian_FDA = run_fpca(ovarian_FDA, metric = "uni g", r = "r", value = "fundiff",
                       lightweight = TRUE, pve = .99)
Fit a functional Cox regression model when there are multiple functions per subject, which arise from multiple samples per subject. It is not necessary for all subjects to have the same number of samples. The function first performs a multilevel functional principal components analysis (MFPCA) decomposition of the spatial summary function. Then, the average curve for each subject is used in a functional Cox model (FCM). Variation around each subject's mean is captured by calculating the standard deviation of the level 2 scores from MFPCA, then including this as a scalar variable in the FCM called "level2_score_sd".
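The scalar covariate described here, a per-subject standard deviation of level 2 scores, can be sketched in base R. The score table below is invented and assumes a single level 2 component; how the package combines several components, or treats subjects with only one sample, is not covered by this sketch.

```r
# Hypothetical level 2 (sample-within-subject) scores from an MFPCA fit
level2 <- data.frame(
  patient_id = c("p1", "p1", "p1", "p2", "p2", "p3"),
  score      = c(-0.5, 0.1, 0.4, 1.0, -1.0, 0.2)
)

# One standard deviation per subject; this plays the role of "level2_score_sd"
level2_score_sd <- tapply(level2$score, level2$patient_id, sd)
print(level2_score_sd)
```

In this sketch a subject with a single sample (p3) gets NA, because a standard deviation cannot be computed from one value.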
run_mfcm( mxFDAobject, model_name, formula, event = "event", metric = "uni k", r = "r", value = "fundiff", afcm = FALSE, filter_cols = NULL, pve = 0.99, ..., knots = NULL )
mxFDAobject |
Dataframe of spatial summary functions from multiplex imaging data, in long format. Can be estimated using the function |
model_name |
character string to give the fit model in the functional cox slot |
formula |
Formula to be fed to mgcv in the form of survival_time ~ x1 + x2. Does not contain functional predictor. Character valued. Data must contain censoring variable called "event". |
event |
character string for the column in Metadata that contains 1/0 for the survival event |
metric |
name of calculated spatial metric to use |
r |
Character string, the name of the variable that identifies the function domain (usually a radius for spatial summary functions). Default is "r". |
value |
Character string, the name of the variable that identifies the spatial summary function values. Default is "fundiff". |
afcm |
If TRUE, runs an additive functional Cox model. If FALSE, runs a linear functional Cox model. Defaults to the linear functional Cox model. |
filter_cols |
a named vector of factors to filter summary functions to in |
pve |
Proportion of variance explained by multilevel functional principal components analysis in mfpca step |
... |
Optional other arguments to be passed to |
knots |
Number of knots for defining spline basis. |
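To make the role of pve concrete, the sketch below picks the smallest number of principal components whose cumulative share of variance reaches a threshold. It uses ordinary `prcomp` on a simulated matrix as a stand-in; this is an assumption for illustration only, since the functional PCA routines in refund also smooth the covariance, and the matrix `curves` is made up.

```r
set.seed(7)

# Hypothetical matrix of curves: 40 functions evaluated at 30 radii
r <- seq(0, 1, length.out = 30)
curves <- outer(rnorm(40, sd = 3), sin(pi * r)) +
  outer(rnorm(40, sd = 1), cos(pi * r)) +
  matrix(rnorm(40 * 30, sd = 0.1), nrow = 40)

# Ordinary PCA as a simple stand-in for functional PCA
pca <- prcomp(curves, center = TRUE, scale. = FALSE)
var_explained <- cumsum(pca$sdev^2) / sum(pca$sdev^2)

# Smallest number of components whose cumulative variance reaches pve
pve <- 0.99
npc <- which(var_explained >= pve)[1]

# Scores on the retained components, one row per function
scores <- pca$x[, seq_len(npc), drop = FALSE]
```

A larger pve keeps more components and therefore more columns of scores.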
A list which is a linear or additive functional Cox model fit. See mgcv::gam for more details.
Julia Wrobel [email protected]
Alex Soupir [email protected]
    # load lung mxFDA object
    data('lung_FDA')

    # run the linear functional Cox model (afcm = FALSE)
    lung_FDA = run_mfcm(lung_FDA, model_name = "fit_mlfcm",
                        formula = survival_days ~ age, event = "survival_status",
                        metric = "uni g", r = "r", value = "fundiff",
                        pve = 0.99, afcm = FALSE)
This is a wrapper for the function mfpca.face from the refund package.
run_mfpca( mxFDAobject, metric = "uni k", r = "r", value = "fundiff", knots = NULL, lightweight = FALSE, ... )
mxFDAobject |
object of class |
metric |
name of calculated spatial metric to use |
r |
Character string, the name of the variable that identifies the function domain (usually a radius for spatial summary functions). Default is "r". |
value |
Character string, the name of the variable that identifies the spatial summary function values. Default is "fundiff". |
knots |
Number of knots for defining spline basis. Defaults to the number of measurements per function divided by 2. |
lightweight |
Default is FALSE. If TRUE, removes Y and Yhat from returned mFPCA object. A good option to select for large datasets. |
... |
Optional other arguments to be passed to |
An mxFDA object with the functional_mpca slot filled for the respective spatial summary function, containing:
mxfundata |
The original dataframe of spatial summary functions, with scores from FPCA appended for downstream modeling |
fpc_object |
A list of class "fpca" with elements described in the documentation for |
Julia Wrobel [email protected]
Alex Soupir [email protected]
Xiao, L., Ruppert, D., Zipunnikov, V., and Crainiceanu, C. (2016). Fast covariance estimation for high-dimensional functional data. Statistics and Computing, 26, 409-421. DOI: 10.1007/s11222-014-9485-x.
    # load data
    data(lung_FDA)

    # run multilevel FPCA
    lung_FDA = run_mfpca(lung_FDA, metric = 'uni g')
Fit a scalar-on-function regression model. Uses refund::pfr under the hood for computations, and stores results in the mxFDA object.
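For readers new to scalar-on-function regression, the base R sketch below shows the underlying idea: a scalar outcome is regressed on the integral of each curve against an unknown coefficient function. It is a simplified stand-in, not what refund::pfr does internally; the simulated curves, the polynomial basis `B`, and the Riemann-sum integration are all assumptions chosen to keep the example self-contained.

```r
set.seed(42)
n <- 80
r <- seq(0, 1, length.out = 25)     # function domain, e.g. a rescaled radius
dr <- r[2] - r[1]

# Hypothetical curves X_i(r): scaled random walks, one row per subject
X <- t(apply(matrix(rnorm(n * length(r)), nrow = n), 1, cumsum)) / 5

# Simulate a scalar outcome: y_i = 1 + integral of X_i(r) * beta(r) dr + noise
beta_true <- 2 * r
y <- 1 + as.vector(X %*% beta_true) * dr + rnorm(n, sd = 0.05)

# Represent beta(r) with a small polynomial basis and integrate numerically
B <- cbind(1, r, r^2)
Z <- (X %*% B) * dr
fit <- lm(y ~ Z)

# Estimated coefficient function evaluated on the grid of r
beta_hat <- as.vector(B %*% coef(fit)[-1])
```

Swapping `lm` for `glm(..., family = binomial)` would give the functional logistic regression analogue for a binary outcome.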
run_sofr( mxFDAobject, model_name, formula, family = "gaussian", metric = "uni k", r = "r", value = "fundiff", smooth = FALSE, filter_cols = NULL, ..., knots = NULL )
mxFDAobject |
Dataframe of spatial summary functions from multiplex imaging data, in long format. Can be estimated using the function |
model_name |
character string to give the fit model |
formula |
Formula to be fed to mgcv in the form of outcome ~ x1 + x2. Does not contain functional predictor. Character valued. |
family |
Exponential family distribution to be passed to |
metric |
Name of calculated spatial metric to use |
r |
Character string, the name of the variable that identifies the function domain (usually a radius for spatial summary functions). Default is "r". |
value |
Character string, the name of the variable that identifies the spatial summary function values. Default is "fundiff". |
smooth |
Option to smooth data using FPCA. Defaults to FALSE. |
filter_cols |
a named vector of factors to filter summary functions to in |
... |
Optional other arguments to be passed to |
knots |
Number of knots for defining spline basis. |
The mxFDA object with the fitted scalar-on-function regression model stored under model_name. The fit is computed by refund::pfr; see mgcv::gam for more details.
Julia Wrobel [email protected]
Alex Soupir [email protected]
    # load ovarian mxFDA object
    data('ovarian_FDA')

    # run scalar on function regression model with a continuous outcome (age)
    ovarian_FDA = run_sofr(ovarian_FDA, model_name = "fit_sofr",
                           formula = age~stage,
                           metric = "uni g", r = "r", value = "fundiff")

    # run scalar on function regression model with a binary outcome (stage)
    # also known as functional logistic regression
    ovarian_FDA = run_sofr(ovarian_FDA, model_name = "fit_sofr",
                           formula = stage~age, family = "binomial",
                           metric = "uni g", r = "r", value = "fundiff")
Summary method for object of class mxFDA
## S3 method for class 'mxFDA' summary(object, ...)
object |
object of class |
... |
unused currently |
Prints a summary of the object to the R console.
Alex Soupir [email protected]
Internal function called by extract_summary_functions() to calculate a univariate spatial summary function for a single image.
univariate( mximg, markvar, mark1, mark2, r_vec, func = c(Kest, Lest, Gest), edge_correction, empirical_CSR = FALSE, permutations = 1000 )
mximg |
Dataframe of cell-level multiplex imaging data for a single image.
Should have variables |
markvar |
The name of the variable that denotes cell type(s) of interest. Character. |
mark1 |
dummy filler, unused |
mark2 |
dummy filler, unused |
r_vec |
Numeric vector of radii over which to evaluate spatial summary functions. Must begin at 0. |
func |
Spatial summary function to calculate. Options are c(Kest, Lest, Gest) which denote Ripley's K, Besag's L, and nearest neighbor G function, respectively. |
edge_correction |
Character string that denotes the edge correction method for spatial summary function. For Kest and Lest choose one of c("border", "isotropic", "Ripley", "translate", "none"). For Gest choose one of c("rs", "km", "han") |
empirical_CSR |
Logical indicating whether to use permutations to estimate the sample-specific complete spatial randomness (CSR) curve. |
permutations |
integer for the number of permutations to use if empirical_CSR is |
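One common way to build a permutation-based CSR reference is to shuffle the marker labels among all cells in an image, recompute the summary function for each shuffle, and average the results. Whether the package uses exactly this scheme is an assumption here. The sketch below uses a hypothetical helper `naive_K`, a base R estimate of Ripley's K with no edge correction, on simulated cells; in practice the spatstat estimators named above would be used instead.

```r
set.seed(1)

# Hypothetical image: 200 cells in a 100 x 100 window, 40 of them marker-positive
cells <- data.frame(x = runif(200, 0, 100), y = runif(200, 0, 100))
cells$positive <- seq_len(200) <= 40
area <- 100 * 100
r_vec <- seq(0, 20, by = 5)          # radii, beginning at 0

# Naive Ripley's K without edge correction (hypothetical helper)
naive_K <- function(x, y, r_vec, area) {
  n <- length(x)
  d <- as.matrix(dist(cbind(x, y)))
  diag(d) <- Inf                     # exclude self-pairs
  sapply(r_vec, function(r) area * sum(d <= r) / (n * (n - 1)))
}

# Observed K for the marker-positive cells
observed <- with(cells[cells$positive, ], naive_K(x, y, r_vec, area))

# Permute labels: draw the same number of cells at random from all cells
permutations <- 50
perm_K <- replicate(permutations, {
  idx <- sample(nrow(cells), sum(cells$positive))
  naive_K(cells$x[idx], cells$y[idx], r_vec, area)
})

# Empirical CSR curve and the difference from it
csr <- rowMeans(perm_K)
fundiff <- observed - csr
```

Because the reference is computed from the cells actually present in the image, it reflects holes or uneven tissue that a theoretical CSR value would ignore.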
A data.frame containing:
r |
the radius of values over which the spatial summary function is evaluated |
sumfun |
the values of the spatial summary function |
csr |
the values of the spatial summary function under complete spatial randomness |
fundiff |
sumfun - csr; positive values indicate clustering and negative values indicate repulsion |
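The returned columns relate to each other as in the small sketch below. For Ripley's K in two dimensions the theoretical value under CSR is pi * r^2; the observed values in `sumfun` are made up for illustration.

```r
# Radii at which the function is evaluated (begins at 0)
r <- c(0, 10, 20, 30)

# Hypothetical observed Ripley's K values for one image
sumfun <- c(0, 400, 1500, 3000)

# Theoretical K under complete spatial randomness in 2D
csr <- pi * r^2

# Positive values suggest clustering, negative values suggest repulsion
fundiff <- sumfun - csr

out <- data.frame(r = r, sumfun = sumfun, csr = csr, fundiff = fundiff)
```

In this made-up example every nonzero radius has a positive fundiff, which would be read as clustering at those distances.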
Julia Wrobel [email protected]
Alex Soupir [email protected]
Creed, J. H., Wilson, C. M., Soupir, A. C., Colin-Leitzinger, C. M., Kimmel, G. J., Ospina, O. E., Chakiryan, N. H., Markowitz, J., Peres, L. C., Coghill, A., & Fridley, B. L. (2021). spatialTIME and iTIME: R package and Shiny application for visualization and analysis of immunofluorescence data. Bioinformatics (Oxford, England), 37(23), 4584–4586. https://doi.org/10.1093/bioinformatics/btab757