| Title: | Visualization and Analysis of Spatial Heterogeneity in Spatially-Resolved Gene Expression |
|---|---|
| Description: | Visualization and analysis of spatially resolved transcriptomics data. The 'spatialGE' R package provides methods for visualizing and analyzing spatially resolved transcriptomics data, such as 10X Visium, CosMx, or csv/tsv gene expression matrices. It includes tools for spatial interpolation, autocorrelation analysis, tissue domain detection, gene set enrichment, and differential expression analysis using spatial mixed models. |
| Authors: | Oscar Ospina [aut, cre] (ORCID: <https://orcid.org/0000-0001-5986-4207>), Alex Soupir [aut] (ORCID: <https://orcid.org/0000-0003-1251-9179>), Brooke Fridley [aut] (ORCID: <https://orcid.org/0000-0001-7739-7956>), Satija Lab [cph] (Copyright holder of code fragments from Seurat function FindVariableFeatures) |
| Maintainer: | Oscar Ospina <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.2.2 |
| Built: | 2026-06-01 07:03:34 UTC |
| Source: | https://github.com/cran/spatialGE |
Plots the spatial autocorrelation statistics of genes across samples and colors samples according to sample metadata.
compare_SThet( x = NULL, samplemeta = NULL, genes = NULL, color_by = NULL, categorical = TRUE, color_pal = "muted", ptsize = 1 )
x |
an STlist. |
samplemeta |
a string indicating the name of the variable in the clinical data frame. If NULL, uses sample names |
genes |
the name(s) of the gene(s) to plot. |
color_by |
the variable in |
categorical |
logical indicating whether or not to treat |
color_pal |
a string of a color palette from khroma or RColorBrewer, or a vector with colors with enough elements to plot categories. |
ptsize |
a number specifying the size of the points. Passed to the |
This function takes the names of genes and their Moran's I or Geary's C computed for multiple samples and provides a multi-sample comparison. Samples in the plot can be colored according to sample metadata to explore potential associations between the spatial distribution of gene expression and sample-level data.
a list of plots
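To make the statistic being compared more concrete, the following is a minimal base R sketch of Moran's I for a single gene in a single sample. It is not spatialGE code: the helper `morans_i`, the toy grid, and the choice of inverse-distance weights are assumptions made for illustration, and the package may compute the statistic differently.

```r
# Toy Moran's I with inverse-distance weights (illustrative helper, not part of spatialGE)
morans_i <- function(expr, coords) {
  d <- as.matrix(dist(coords))   # pairwise Euclidean distances between spots
  w <- 1 / d                     # inverse-distance weights (assumption of this sketch)
  diag(w) <- 0                   # a spot is not its own neighbor
  z <- expr - mean(expr)         # centered expression values
  n <- length(expr)
  (n / sum(w)) * sum(w * outer(z, z)) / sum(z^2)
}

# Hypothetical 5 x 5 grid of spots
coords <- expand.grid(x = 1:5, y = 1:5)
set.seed(1)
gradient_gene <- coords$x + coords$y     # smooth spatial gradient
shuffled_gene <- sample(gradient_gene)   # same values, random positions

i_gradient <- morans_i(gradient_gene, coords)
i_shuffled <- morans_i(shuffled_gene, coords)
```

A spatially structured gene should yield a larger value than the same expression values placed at random locations, which is the kind of per-sample difference the comparison plot is meant to expose.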
Returns the number of genes and spots for each array within an STlist object
## S4 method for signature 'STlist'
dim(x)
x |
an STlist object to show summary from. |
This function takes an STlist and prints the number of genes (rows) and spots (columns) of each spatial array within that object.
Generates violin plots, boxplots, or density plots of variables in the spatial meta data or of gene expression
distribution_plots( x = NULL, plot_meta = NULL, genes = NULL, samples = NULL, data_type = "tr", color_pal = "okabeito", plot_type = "violin", ptsize = 0.5, ptalpha = 0.5 )
x |
an STlist |
plot_meta |
vector of variables in |
genes |
vector of genes to plot expression distribution. If used in conjunction
with |
samples |
samples to include in the plot. Default (NULL) includes all samples |
data_type |
one of 'tr' or 'raw', to plot transformed or raw counts |
color_pal |
a string of a color palette from |
plot_type |
one of "violin", "box", or "density" (violin plots, box plots, or
density plots respectively). If |
ptsize |
the size of points in the plots |
ptalpha |
the transparency of points (violin/box plot) or curves (density plots) |
The function visualizes the distribution of spot/cell total counts, total genes, or expression of specific genes across all samples for comparative purposes. It can also group gene expression values by categorical variables (e.g., clusters).
a list containing ggplot2 objects
Filtering of spots/cells, genes or samples, as well as count-based filtering
filter_data( x = NULL, spot_minreads = 0, spot_maxreads = NULL, spot_mingenes = 0, spot_maxgenes = NULL, spot_minpct = 0, spot_maxpct = NULL, gene_minreads = 0, gene_maxreads = NULL, gene_minspots = 0, gene_maxspots = NULL, gene_minpct = 0, gene_maxpct = NULL, samples = NULL, rm_tissue = NULL, rm_spots = NULL, rm_genes = NULL, rm_genes_expr = NULL, spot_pct_expr = "^MT-" )
x |
an STlist |
spot_minreads |
the minimum number of total reads for a spot to be retained |
spot_maxreads |
the maximum number of total reads for a spot to be retained |
spot_mingenes |
the minimum number of non-zero counts for a spot to be retained |
spot_maxgenes |
the maximum number of non-zero counts for a spot to be retained |
spot_minpct |
the minimum percentage of counts for features defined by |
spot_maxpct |
the maximum percentage of counts for features defined by |
gene_minreads |
the minimum number of total reads for a gene to be retained |
gene_maxreads |
the maximum number of total reads for a gene to be retained |
gene_minspots |
the minimum number of spots with non-zero counts for a gene to be retained |
gene_maxspots |
the maximum number of spots with non-zero counts for a gene to be retained |
gene_minpct |
the minimum percentage of spots with non-zero counts for a gene to be retained |
gene_maxpct |
the maximum percentage of spots with non-zero counts for a gene to be retained |
samples |
samples (as in |
rm_tissue |
sample (as in |
rm_spots |
vector of spot/cell IDs to remove. Removes spots/cells in |
rm_genes |
vector of gene names to remove from STlist. Removes genes in |
rm_genes_expr |
a regular expression that matches genes to remove. Removes genes in |
spot_pct_expr |
an expression to use with |
This function provides options to filter elements in an STlist. It can remove
cells/spots or genes based on raw counts (x@counts). Users can input a
regular expression to query gene names and calculate percentages (for example, %
mtDNA genes). The function can also filter entire samples. Note that the function
removes cells/spots, genes, and/or samples from the raw counts, transformed counts,
spatial variables, gene variables, and sample metadata. Also note that the function
filters in the following order:
Samples (rm_tissue)
Spots (rm_spots)
Genes (rm_genes)
Genes matching rm_genes_expr
Min and max counts
an STlist containing the filtered data
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
  download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
  zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
  unzip(zipfile=zip_tmp, exdir=thrane_tmp)
  # Generate the file paths to be passed to the STlist function
  count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'), full.names=TRUE, pattern='counts')
  coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'), full.names=TRUE, pattern='mapping')
  clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'), full.names=TRUE, pattern='clinical')
  # Create STlist
  library('spatialGE')
  melanoma <- STlist(rnacounts=count_files, spotcoords=coord_files, samples=clin_file)
  melanoma <- filter_data(melanoma, spot_minreads=2000)
}, error = function(e) {
  message("Could not run example. Are you connected to the internet?")
  return(NULL)
})
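The per-spot quantities behind the count-based filters can be sketched in base R on a toy matrix. This is only an illustration under stated assumptions: the count matrix and the thresholds are hypothetical, and the code is not how filter_data is implemented internally. The `"^MT-"` pattern matches the default of spot_pct_expr.

```r
# Toy raw count matrix: genes in rows, spots in columns (hypothetical data)
counts <- matrix(
  c(10, 0, 3,
     5, 2, 0,
    85, 1, 7,
     0, 0, 4),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("MT-CO1", "MLANA", "COL1A1", "TP53"),
                  c("spot1", "spot2", "spot3"))
)

# Per-spot summaries analogous to those used by the count-based filters
spot_reads <- colSums(counts)                  # total reads per spot
spot_genes <- colSums(counts > 0)              # genes with non-zero counts per spot
mt_rows    <- grepl("^MT-", rownames(counts))  # genes matching the regular expression
spot_pct   <- 100 * colSums(counts[mt_rows, , drop = FALSE]) / spot_reads

# Keep spots with at least 5 reads and at most 20% of counts from matching genes
keep_spots <- spot_reads >= 5 & spot_pct <= 20
filtered   <- counts[, keep_spots, drop = FALSE]
```

In this toy case only `spot1` survives: `spot2` has too few reads, and `spot3` has more than 20% of its counts in the matching gene.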
Performs spatial interpolation ("kriging") of transformed gene counts
gene_interpolation( x = NULL, genes = "top", top_n = 10, samples = NULL, cores = NULL, verbose = TRUE )
x |
an STlist with transformed RNA counts |
genes |
a vector of gene names or 'top'. If 'top' (default), interpolation of
the 10 genes ( |
top_n |
an integer indicating how many top genes to interpolate. Default is 10. |
samples |
the spatial samples for which interpolations will be performed. If NULL (Default), all samples are interpolated. |
cores |
integer indicating the number of cores to use during parallelization.
If NULL, the function uses half of the available cores at a maximum. The parallelization
uses |
verbose |
either logical or an integer (0, 1, or 2) to increase verbosity. |
This function takes an STlist and a vector of gene names and generates spatial
interpolation of gene expression values via "kriging". If genes='top', then
the 10 genes (default) with the highest standard deviation for each ST sample
are interpolated. The resulting interpolations can be visualized via the
STplot_interpolation function.
an STlist including spatial interpolations.
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
  download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
  zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
  unzip(zipfile=zip_tmp, exdir=thrane_tmp)
  # Generate the file paths to be passed to the STlist function
  count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'), full.names=TRUE, pattern='counts')
  coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'), full.names=TRUE, pattern='mapping')
  clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'), full.names=TRUE, pattern='clinical')
  # Create STlist
  library('spatialGE')
  melanoma <- STlist(rnacounts=count_files, spotcoords=coord_files, samples=clin_file)
  melanoma <- transform_data(melanoma)
  melanoma <- gene_interpolation(melanoma, genes=c('MLANA', 'COL1A1'), samples='ST_mel1_rep2')
  kp = STplot_interpolation(melanoma, genes=c('MLANA', 'COL1A1'))
  ggpubr::ggarrange(plotlist=kp)
}, error = function(e) {
  message("Could not run example. Are you connected to the internet?")
  return(NULL)
})
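The gene selection used when genes='top' (highest standard deviation within a sample) can be sketched in base R. The expression matrix below is simulated and the variable names are hypothetical; this shows the ranking idea only, not the kriging step or the package's internal code.

```r
set.seed(42)
# Hypothetical transformed expression for one sample: 6 genes x 20 spots
expr <- matrix(rnorm(120), nrow = 6,
               dimnames = list(paste0("gene", 1:6), paste0("spot", 1:20)))
expr["gene3", ] <- expr["gene3", ] * 5   # inflate variation for two genes
expr["gene5", ] <- expr["gene5", ] * 3

top_n <- 2
gene_sd   <- apply(expr, 1, sd)          # standard deviation of each gene across spots
top_genes <- names(sort(gene_sd, decreasing = TRUE))[seq_len(top_n)]
```

Because the ranking is computed per sample, the set of "top" genes can differ from one sample to the next.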
Extracts gene-level metadata and spatial statistics (if already computed)
get_gene_meta(x = NULL, sthet_only = FALSE)
x |
an STlist |
sthet_only |
logical, return only genes with spatial statistics |
This function extracts data from the x@gene_meta slot, optionally subsetting
only to those genes for which spatial statistics (Moran's I or Geary's C, see SThet)
have been calculated. The output is a data frame with data from all samples in the
STlist
a data frame with gene-level data
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
  download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
  zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
  unzip(zipfile=zip_tmp, exdir=thrane_tmp)
  # Generate the file paths to be passed to the STlist function
  count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'), full.names=TRUE, pattern='counts')
  coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'), full.names=TRUE, pattern='mapping')
  clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'), full.names=TRUE, pattern='clinical')
  # Create STlist
  library('spatialGE')
  melanoma <- STlist(rnacounts=count_files, spotcoords=coord_files, samples=clin_file)
  melanoma <- transform_data(melanoma)
  melanoma <- SThet(melanoma, genes=c('MLANA', 'TP53'), method='moran')
  get_gene_meta(melanoma, sthet_only=TRUE)
}, error = function(e) {
  message("Could not run example. Are you connected to the internet?")
  return(NULL)
})
Loads the images from tissues to the appropriate STlist slot.
load_images(x = NULL, images = NULL)
x |
an STlist |
images |
a string indicating a folder to load images from |
This function looks for .PNG or .JPG files within a folder matching the
sample names in an existing STlist. It then loads the images into the STlist, where
they can be used for plotting along with other spatialGE plots.
an STlist with images
Generates density plots, violin plots, and/or boxplots for the distribution of count values
plot_counts( x = NULL, samples = NULL, data_type = "tr", plot_type = "density", color_pal = "okabeito", cvalpha = 0.5, distrib_subset = 0.5, subset_seed = 12345 )
x |
an STlist |
samples |
samples to include in the plot. Default (NULL) includes all samples |
data_type |
one of |
plot_type |
one or several of |
color_pal |
a string of a color palette from |
cvalpha |
the transparency of the density plots |
distrib_subset |
the proportion of spots/cells to plot. Generating these plots can be time consuming due to the large number of elements to plot. This argument controls how many randomly selected values are shown, to speed up plotting |
subset_seed |
related to |
The function visualizes the distribution of counts across all genes and spots
in the STlist. The user can select between density plots, violin plots, or box
plots as visualization options. Useful for assessment of the effect of filtering and
data transformations and to assess zero-inflation. To plot counts or genes per
spot/cell, the function distribution_plots should be used instead.
a list of ggplot objects
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
  download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
  zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
  unzip(zipfile=zip_tmp, exdir=thrane_tmp)
  # Generate the file paths to be passed to the STlist function
  count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'), full.names=TRUE, pattern='counts')
  coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'), full.names=TRUE, pattern='mapping')
  clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'), full.names=TRUE, pattern='clinical')
  # Create STlist
  library('spatialGE')
  melanoma <- STlist(rnacounts=count_files, spotcoords=coord_files, samples=clin_file)
  cp <- plot_counts(melanoma, data_type='raw', plot_type=c('violin', 'box'))
  ggpubr::ggarrange(plotlist=cp)
}, error = function(e) {
  message("Could not run example. Are you connected to the internet?")
  return(NULL)
})
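The effect of plotting only a reproducible random subset of values can be sketched in base R. The simulated counts and the variable names are hypothetical, and this is an assumption about the general idea behind distrib_subset and subset_seed rather than the package's actual implementation.

```r
subset_seed    <- 12345
distrib_subset <- 0.5

set.seed(subset_seed)                           # fixing the seed makes the subset reproducible
values <- rpois(1000, lambda = 3)               # hypothetical count values
n_keep <- round(distrib_subset * length(values))
subset_values <- values[sample(length(values), size = n_keep)]

# Density estimate on the subset, as would be drawn in a density plot
dens <- density(subset_values)
```

Plotting half of the values roughly halves the drawing cost while usually preserving the overall shape of the distribution.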
Creates ggplot objects of the tissue images when available within the STlist
plot_image(x = NULL, samples = NULL)
x |
an STlist |
samples |
a vector of numbers indicating the ST samples to plot, or their
sample names. If a vector of numbers, it follows the order of |
If the STlist contains tissue images in the @misc slot, the plot_image function
can be used to generate ggplot objects. These ggplot objects can be plotted next to
quilt plots (STplot function) for comparative analysis.
a list of plots
Generates a PCA plot after computation of "pseudobulk" counts
pseudobulk_dim_plot( x = NULL, color_pal = "muted", plot_meta = NULL, dim = "pca", pcx = 1, pcy = 2, ptsize = 5 )
x |
an STlist with pseudobulk PCA results in the |
color_pal |
a string of a color palette from khroma or RColorBrewer, or a
vector of color names or HEX values. Each color represents a category in the
variable specified in |
plot_meta |
a string indicating the name of the variable in the sample metadata to color points in the PCA plot |
dim |
one of |
pcx |
integer indicating the principal component to plot in the x axis |
pcy |
integer indicating the principal component to plot in the y axis |
ptsize |
the size of the points in the PCA plot. Passed to the |
Generates a Principal Components Analysis plot to help in initial data exploration of
differences among samples. The points in the plot represent "pseudobulk" samples.
This function follows after usage of pseudobulk_samples.
a ggplot object
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
  download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
  zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
  unzip(zipfile=zip_tmp, exdir=thrane_tmp)
  # Generate the file paths to be passed to the STlist function
  count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'), full.names=TRUE, pattern='counts')
  coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'), full.names=TRUE, pattern='mapping')
  clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'), full.names=TRUE, pattern='clinical')
  # Create STlist
  library('spatialGE')
  melanoma <- STlist(rnacounts=count_files, spotcoords=coord_files, samples=clin_file)
  melanoma <- pseudobulk_samples(melanoma)
  pseudobulk_dim_plot(melanoma, plot_meta='patient')
}, error = function(e) {
  message("Could not run example. Are you connected to the internet?")
  return(NULL)
})
Generates a heatmap plot after computation of "pseudobulk" counts
pseudobulk_heatmap( x = NULL, color_pal = "muted", plot_meta = NULL, hm_display_genes = 30 )
x |
an STlist with pseudobulk counts in the |
color_pal |
a string of a color palette from khroma or RColorBrewer, or a
vector of color names or HEX values. Each color represents a category in the
variable specified in |
plot_meta |
a string indicating the name of the variable in the sample metadata to annotate heatmap columns |
hm_display_genes |
number of genes to display in heatmap, selected based on decreasing order of standard deviation across samples |
Generates a heatmap of transformed "pseudobulk" counts to help in initial data
exploration of differences among samples. Each column in the heatmap represents a
"pseudobulk" sample. Rows are genes, with the number of genes displayed controlled by
the hm_display_genes argument. This function follows after usage of pseudobulk_samples.
a ggplot object
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
  download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
  zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
  unzip(zipfile=zip_tmp, exdir=thrane_tmp)
  # Generate the file paths to be passed to the STlist function
  count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'), full.names=TRUE, pattern='counts')
  coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'), full.names=TRUE, pattern='mapping')
  clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'), full.names=TRUE, pattern='clinical')
  # Create STlist
  library('spatialGE')
  melanoma <- STlist(rnacounts=count_files, spotcoords=coord_files, samples=clin_file)
  melanoma <- pseudobulk_samples(melanoma)
  hm <- pseudobulk_heatmap(melanoma, plot_meta='BRAF_status', hm_display_genes=30)
}, error = function(e) {
  message("Could not run example. Are you connected to the internet?")
  return(NULL)
})
Aggregates spot/cell counts into "pseudo bulk" samples for data exploration
pseudobulk_samples(x = NULL, max_var_genes = 5000, calc_umap = FALSE)
x |
an STlist. |
max_var_genes |
number of most variable genes (standard deviation) to use in pseudobulk analysis |
calc_umap |
logical, whether to calculate UMAP embeddings in addition to PCs |
This function takes an STlist and aggregates the spot/cell counts into "pseudo bulk" counts by summing the counts from all cells/spots for each gene. It then performs Principal Component Analysis (PCA) to explore non-spatial sample-to-sample variation.
an STlist with appended pseudobulk counts and PCA coordinates
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
  download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
  zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
  unzip(zipfile=zip_tmp, exdir=thrane_tmp)
  # Generate the file paths to be passed to the STlist function
  count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'), full.names=TRUE, pattern='counts')
  coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'), full.names=TRUE, pattern='mapping')
  clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'), full.names=TRUE, pattern='clinical')
  # Create STlist
  library('spatialGE')
  melanoma <- STlist(rnacounts=count_files, spotcoords=coord_files, samples=clin_file)
  melanoma <- pseudobulk_samples(melanoma)
  pseudobulk_dim_plot(melanoma)
}, error = function(e) {
  message("Could not run example. Are you connected to the internet?")
  return(NULL)
})
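The aggregation described above can be sketched in base R: sum counts over spots for each gene, then run PCA on the resulting samples. The simulated matrices, sample names, and the library-size scaling with a log transform are assumptions of this sketch, not a description of how pseudobulk_samples normalizes data internally.

```r
set.seed(7)
genes <- paste0("gene", 1:50)
# Three hypothetical samples, each a genes x spots count matrix
samples <- list(
  sampleA = matrix(rpois(50 * 30, 5), nrow = 50, dimnames = list(genes, NULL)),
  sampleB = matrix(rpois(50 * 40, 5), nrow = 50, dimnames = list(genes, NULL)),
  sampleC = matrix(rpois(50 * 25, 8), nrow = 50, dimnames = list(genes, NULL))
)

# Sum counts over all spots for each gene: one "pseudobulk" column per sample
pseudobulk <- sapply(samples, rowSums)

# Scale by library size and log-transform (illustrative choice), then PCA with samples as rows
scaled <- log1p(t(t(pseudobulk) / colSums(pseudobulk)) * 1e4)
pca <- prcomp(t(scaled), center = TRUE, scale. = FALSE)
pc_coords <- pca$x[, 1:2]   # coordinates of each sample on PC1 and PC2
```

Each row of `pc_coords` corresponds to one sample, which is the kind of point shown in a pseudobulk PCA plot.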
Prints overview/summary of an STlist object.
## S4 method for signature 'STlist'
show(object)
object |
an STlist object to show summary from. |
This function takes an STlist and prints the number of spatial arrays in that object and other information about the object.
Returns the names of the annotations in the x@spatial_meta slot.
spatial_metadata(x)
x |
an STlist object |
a list of character vectors containing the column names of x@spatial_meta
Perform unsupervised spatially-informed clustering on the spots/cells of a ST sample
STclust( x = NULL, samples = NULL, ws = 0.025, dist_metric = "euclidean", linkage = "ward.D2", ks = "dtc", topgenes = 2000, deepSplit = FALSE, cores = NULL, verbose = TRUE )
x |
an STlist with normalized expression data |
samples |
a vector with strings or a vector with integers indicating the samples to run STclust |
ws |
a double (0-1) indicating the weight to be applied to spatial distances. Defaults to 0.025 |
dist_metric |
the distance metric to be used. Defaults to 'euclidean'. Other
options are the same as in |
linkage |
the linkage method applied to hierarchical clustering. Passed to
|
ks |
the range of k values to assess. Defaults to |
topgenes |
the number of genes with highest spot-to-spot expression variation. The
variance is calculated via |
deepSplit |
a logical or integer (1-4), to be passed to |
cores |
an integer indicating the number of cores to use in parallelization (Unix only) |
verbose |
either logical or an integer (0, 1, or 2) to increase verbosity |
The function takes an STlist and calculates euclidean distances between cells or spots
based on the x,y spatial locations, and the expression of the top variable genes
(Seurat::FindVariableFeatures). The resulting distances are weighted by
applying 1-ws to the gene expression distances and ws to the spatial distances.
Hierarchical clustering is performed on the sum of the weighted distance matrices.
The STclust method allows for identification of tissue niches/domains that are
spatially cohesive.
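The weighting described above can be illustrated with base R alone. The following is a minimal sketch, not the package's implementation: the toy data, the helper name `weighted_clusters`, the rescaling of each distance matrix to [0, 1], and the fixed `k` (instead of a dynamic tree cut) are all assumptions made for illustration.

```r
# Toy data (made up): six spots in two spatial groups, three genes
set.seed(1)
coords <- cbind(x = c(1, 2, 1, 8, 9, 8), y = c(1, 1, 2, 8, 8, 9))
expr <- matrix(rnorm(18), nrow = 6,
               dimnames = list(paste0("spot", 1:6), paste0("gene", 1:3)))
rownames(coords) <- rownames(expr)

# Hypothetical helper: combine expression and spatial distances, then cluster
weighted_clusters <- function(expr, coords, ws = 0.025, k = 2) {
  d_expr <- as.matrix(dist(expr, method = "euclidean"))
  d_spat <- as.matrix(dist(coords, method = "euclidean"))
  # Rescale both matrices to [0, 1] so the weight acts on comparable scales
  # (this rescaling is an assumption of the sketch)
  d_expr <- d_expr / max(d_expr)
  d_spat <- d_spat / max(d_spat)
  # 1 - ws is applied to expression, ws to space; the weighted matrices are summed
  d_comb <- as.dist((1 - ws) * d_expr + ws * d_spat)
  hc <- hclust(d_comb, method = "ward.D2")
  cutree(hc, k = k)
}

cl_default <- weighted_clusters(expr, coords, ws = 0.025, k = 2)
cl_spatial <- weighted_clusters(expr, coords, ws = 1, k = 2)  # space only
print(cl_default)
print(cl_spatial)
```

With `ws = 1` only the spatial distances contribute, so the two spatial groups are recovered; small values such as the default 0.025 let expression dominate while still nudging neighboring spots together.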
an STlist with cluster assignments
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
  download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
  zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
  unzip(zipfile=zip_tmp, exdir=thrane_tmp)
  # Generate the file paths to be passed to the STlist function
  count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
                            full.names=TRUE, pattern='counts')
  coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
                            full.names=TRUE, pattern='mapping')
  clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
                          full.names=TRUE, pattern='clinical')
  # Create STlist
  library('spatialGE')
  melanoma <- STlist(rnacounts=count_files,
                     spotcoords=coord_files,
                     samples=clin_file)
  melanoma <- transform_data(melanoma)
  melanoma <- STclust(melanoma, ws=c(0, 0.025))
  STplot(melanoma, ws=0.025, samples='ST_mel1_rep2', ptsize=1)
}, error = function(e) {
  message("Could not run example. Are you connected to the internet?")
  return(NULL)
})
Tests for differentially expressed genes using linear models with or without spatial covariance structures
STdiff(
  x = NULL,
  samples = NULL,
  annot = NULL,
  w = NULL,
  k = NULL,
  deepSplit = NULL,
  topgenes = 5000,
  pval_thr = 0.05,
  pval_adj = "fdr",
  test_type = "mm",
  sp_topgenes = 0.2,
  clusters = NULL,
  pairwise = FALSE,
  verbose = 1L,
  cores = NULL
)
x |
an STlist |
samples |
an integer indicating the spatial samples to be included in the DE tests.
Numbers follow the order in |
annot |
a column name in |
w |
the spatial weight used in STclust. Required if |
k |
the k value used in STclust, or |
deepSplit |
the deepSplit value if used in STclust. Required if |
topgenes |
an integer indicating the top variable genes to select from each sample based on variance (default=5000). If NULL, all genes are selected. |
pval_thr |
cut-off of adjusted p-values to define differentially expressed genes from
non-spatial linear models. A proportion of genes ( |
pval_adj |
Method to adjust p-values. Defaults to |
test_type |
one of |
sp_topgenes |
Proportion of differentially expressed genes from non-spatial
linear models (and controlled by |
clusters |
cluster name(s) to test DE genes, as opposed to all clusters. |
pairwise |
whether or not to carry out tests in a pairwise manner. The default is
|
verbose |
either logical or an integer (0, 1, or 2) to increase verbosity |
cores |
Number of cores to use in parallelization. If |
The method tests for differentially expressed genes between groups of spots/cells
(e.g., clusters) in a spatial transcriptomics sample. Specifically, the function
tests for genes with significantly higher or lower gene expression in one group of
spots/cells with respect to the rest of spots/cells in the sample. The method first
runs non-spatial linear models on the genes to detect differentially expressed genes.
Then spatial linear models with exponential covariance structure are fit on a
subset of genes detected as differentially expressed by the non-spatial models (sp_topgenes).
If running on clusters detected via STclust, the user can specify the assignments
using the same parameters (w, k, deepSplit). Otherwise, the assignments are
specified by indicating one of the column names in x@spatial_meta. The function
uses spaMM::fitme and is computationally expensive even on HPC environments.
To run the STdiff using the non-spatial approach (faster), set sp_topgenes=0.
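The first, non-spatial stage (one group of spots against the rest) can be sketched in base R. This is only an illustration of the idea under stated assumptions: the simulated data, the helper `one_vs_rest`, and the output column names are hypothetical, plain `lm` is used here, and the spatial stage fit with spaMM is not reproduced.

```r
# Simulated data (made up): 4 genes, 60 spots in 3 clusters
set.seed(42)
cluster <- rep(c("1", "2", "3"), each = 20)
expr <- matrix(rnorm(4 * 60), nrow = 4,
               dimnames = list(paste0("gene", 1:4), NULL))
# Plant a signal: gene1 is higher in cluster "1"
expr["gene1", cluster == "1"] <- expr["gene1", cluster == "1"] + 3

# Hypothetical helper: test each gene in one cluster versus all other spots
one_vs_rest <- function(expr, cluster, target, pval_adj = "fdr") {
  group <- factor(ifelse(cluster == target, "target", "rest"),
                  levels = c("rest", "target"))
  res <- lapply(rownames(expr), function(g) {
    y <- expr[g, ]
    coefs <- summary(lm(y ~ group))$coefficients
    data.frame(gene = g,
               avg_diff = coefs["grouptarget", "Estimate"],
               p_val = coefs["grouptarget", "Pr(>|t|)"])
  })
  res <- do.call(rbind, res)
  # Adjust p-values across genes
  res$adj_p_val <- p.adjust(res$p_val, method = pval_adj)
  res
}

de <- one_vs_rest(expr, cluster, target = "1")
print(de)
```

In the workflow described above, only a proportion of the genes passing the adjusted p-value threshold at this stage would go on to the more expensive spatial models.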
a list with one data frame per sample with results of differential gene expression analysis
Generates volcano plots of differential expression results from STdiff
STdiff_volcano(
  x = NULL,
  samples = NULL,
  clusters = NULL,
  pval_thr = 0.05,
  color_pal = NULL
)
x |
the output of |
samples |
samples to create plots |
clusters |
names of the clusters to generate comparisons |
pval_thr |
the p-value threshold to color genes with differential expression |
color_pal |
the palette to color genes by significance |
The function generates volcano plots (p-value vs. log-fold change) for
genes tested with STdiff. Colors can be customized to show significance from
spatial and non-spatial models.
a list of ggplot objects
Test for spatial enrichment of gene expression sets in ST data sets
STenrich(
  x = NULL,
  samples = NULL,
  gene_sets = NULL,
  score_type = "avg",
  reps = 1000,
  annot = NULL,
  domain = NULL,
  num_sds = 1,
  min_units = 20,
  min_genes = 5,
  pval_adj_method = "BH",
  seed = 12345,
  cores = NULL,
  verbose = TRUE
)
x |
an STlist with transformed gene expression |
samples |
a vector with sample names or indexes to run analysis |
gene_sets |
a named list of gene sets to test. The names of the list should identify the gene sets to be tested |
score_type |
Controls how gene set expression is calculated. The options are the average expression among genes in a set ('avg'), or a GSEA score ('gsva'). The default is 'avg' |
reps |
the number of random samples to be extracted. Default is 1000 replicates |
annot |
name of the annotation within |
domain |
the domain to restrict the analysis. Must exist within the spot/cell
categories included in the selected annotation (i.e., |
num_sds |
the number of standard deviations to set the minimum gene set expression threshold. Default is one (1) standard deviation |
min_units |
Minimum number of spots with high expression of a pathway for that gene set to be considered in the analysis. Defaults to 20 spots or cells |
min_genes |
the minimum number of genes of a gene set present in the data set for that gene set to be included. Default is 5 genes |
pval_adj_method |
the method for multiple comparison adjustment of p-values.
Options are the same as that of |
seed |
the seed number for the selection of random samples. Default is 12345 |
cores |
the number of cores used during parallelization. If NULL (default), the number of cores is defined automatically |
verbose |
either logical or an integer (0, 1, or 2) to increase verbosity |
The function performs a randomization test to assess if the sum of
distances between cells/spots with high expression of a gene set is lower than
the sum of distances among randomly selected cells/spots. The cells/spots are
considered as having high gene set expression if the average expression of genes in a
set is higher than the average expression plus num_sds times the standard deviation.
Control over the size of regions with high expression is provided by setting the
minimum number of cells/spots (min_units). This method is a modification of
the method devised by Hunter et al. 2021 (zebrafish melanoma study).
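The randomization test described above can be sketched in base R. This is a simplified illustration, not the package's code: the grid, the simulated gene set score, and the "+1" form of the empirical p-value are assumptions, and the `min_units` and `min_genes` filters are omitted.

```r
# Simulated sample (made up): a 10 x 10 grid of spots with a gene set score
# that is elevated in the lower-left 3 x 3 corner
set.seed(12345)
coords <- as.matrix(expand.grid(x = 1:10, y = 1:10))
n <- nrow(coords)
in_corner <- coords[, "x"] <= 3 & coords[, "y"] <= 3
score <- ifelse(in_corner, 5, 0) + rnorm(n, sd = 0.3)

num_sds <- 1
reps <- 1000

# Spots with "high" gene set expression: above mean + num_sds * SD
high <- which(score > mean(score) + num_sds * sd(score))

# Sum of pairwise Euclidean distances among a set of spots
sum_dist <- function(idx) sum(dist(coords[idx, , drop = FALSE]))

observed <- sum_dist(high)
# Null distribution: same number of spots, drawn at random
null_dist <- replicate(reps, sum_dist(sample(n, length(high))))

# Empirical p-value: how often random spots are at least as compact
p_value <- (sum(null_dist <= observed) + 1) / (reps + 1)
print(p_value)
```

A small p-value indicates that spots with high gene set expression sit closer together than expected by chance, which is the notion of spatial enrichment tested here.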
a list of data frames with the results of the test
Calculates Spearman's coefficients to detect genes showing expression spatial gradients
STgradient(
  x = NULL,
  samples = NULL,
  topgenes = 2000,
  annot = NULL,
  ref = NULL,
  exclude = NULL,
  out_rm = FALSE,
  limit = NULL,
  distsumm = "min",
  min_nb = 3,
  robust = TRUE,
  nb_dist_thr = NULL,
  log_dist = FALSE,
  cores = NULL,
  verbose = TRUE
)
x |
an STlist with transformed gene expression |
samples |
the samples on which the test should be executed |
topgenes |
the number of high-variance genes to be tested. These genes are selected in descending order of variance as calculated using Seurat's vst method |
annot |
the name of a column in |
ref |
one of the tissue domains in the column specified in |
exclude |
optional, a cluster/domain to exclude from the analysis |
out_rm |
logical (optional), remove gene expression outliers defined by
the interquartile method. This option is only valid when |
limit |
limit the analysis to spots/cells with distances to |
distsumm |
the distance summary metric to use in correlations. One of |
min_nb |
the minimum number of immediate neighbors a spot or cell has to
have in order to be included in the analysis. This parameter seeks to reduce the
effect of isolated |
robust |
logical, whether to use robust regression ( |
nb_dist_thr |
a numeric vector of length two indicating the tolerance interval to assign
spots/cells to neighborhoods. The wider the range of the interval, the more likely
distinct neighbors to be considered. If NULL, |
log_dist |
logical, whether to apply the natural logarithm to the spot/cell distances. A constant (1e-200) is added to all distances to avoid log(0) |
cores |
the number of cores used during parallelization. If NULL (default), the number of cores is defined automatically |
verbose |
logical, whether to print text to console |
The STgradient function fits linear models and calculates Spearman coefficients
between the expression of a gene and the minimum or average distance of spots or
cells to a reference tissue domain. In other words, the STgradient function
can be used to investigate if a gene is expressed higher in spots/cells closer to
a specific reference tissue domain, compared to spots/cells farther from the
reference domain (or vice versa, as indicated by the Spearman's coefficient).
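The distance summary and the correlation can be sketched in base R. This is an illustration under stated assumptions rather than the package's implementation: the grid, the domain labels, the simulated gene, and the helper `dist_to_ref` are hypothetical, and the neighbor filtering and robust regression options are not reproduced.

```r
# Simulated sample (made up): 10 x 10 grid, the two left-most columns form
# the reference domain
set.seed(3)
coords <- as.matrix(expand.grid(x = 1:10, y = 1:10))
domain <- ifelse(coords[, "x"] <= 2, "tumor", "stroma")  # hypothetical labels
ref_xy <- coords[domain == "tumor", , drop = FALSE]
other_xy <- coords[domain != "tumor", , drop = FALSE]

# Hypothetical helper: summarize (min or mean) the distances from each
# non-reference spot to all spots in the reference domain
dist_to_ref <- function(xy, ref_xy, summ = min) {
  apply(xy, 1, function(p) {
    summ(sqrt((ref_xy[, "x"] - p["x"])^2 + (ref_xy[, "y"] - p["y"])^2))
  })
}
min_d <- dist_to_ref(other_xy, ref_xy, summ = min)
avg_d <- dist_to_ref(other_xy, ref_xy, summ = mean)

# A gene whose expression decays with distance from the reference domain
gene_expr <- 10 - min_d + rnorm(length(min_d), sd = 0.5)

res <- cor.test(min_d, gene_expr, method = "spearman", exact = FALSE)
rho <- unname(res$estimate)
print(rho)
```

A negative coefficient indicates higher expression close to the reference domain; a positive one indicates expression increasing with distance.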
a list of data frames with the results of the test
Computes the global spatial autocorrelation statistics Moran's I and/or Geary's C for a set of genes
SThet(
  x = NULL,
  genes = NULL,
  samples = NULL,
  method = "moran",
  k = NULL,
  overwrite = TRUE,
  cores = NULL,
  verbose = TRUE
)
x |
an STlist |
genes |
a vector of gene names to compute statistics |
samples |
the samples to compute statistics |
method |
The spatial statistic(s) to estimate. It can be set to 'moran', 'geary' or both. Default is 'moran' |
k |
the number of neighbors to estimate weights. By default NULL, meaning that spatial weights will be estimated from Euclidean distances. If a positive integer is entered, then the faster k nearest-neighbors approach is used. Please keep in mind that estimates are not as accurate as when using the default distance-based method. |
overwrite |
logical indicating if previous statistics should be overwritten. Defaults to TRUE (overwrite), as shown in the usage above |
cores |
integer indicating the number of cores to use during parallelization.
If NULL, the function uses half of the available cores at a maximum. The parallelization
uses |
verbose |
logical, whether to print text to console |
The function computes global spatial autocorrelation statistics (Moran's I and/or
Geary's C) for the requested genes and samples. The computation uses the
package spdep. The calculated statistics are stored in the STlist, which can
be accessed with the get_gene_meta function. For visual comparative analysis,
the function compare_SThet can be used afterwards.
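For intuition about what the two statistics measure, they can be computed from their textbook definitions in base R. This sketch does not use spdep and is not how the package computes them: the inverse-distance weights, the toy grid, and the helper names `morans_i` and `gearys_c` are assumptions for illustration.

```r
# Global Moran's I: (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2
morans_i <- function(x, w) {
  z <- x - mean(x)
  (length(x) / sum(w)) * sum(w * outer(z, z)) / sum(z^2)
}

# Global Geary's C: ((n - 1) / (2 * S0)) * sum_ij w_ij (x_i - x_j)^2 / sum_i z_i^2
gearys_c <- function(x, w) {
  z <- x - mean(x)
  ((length(x) - 1) / (2 * sum(w))) * sum(w * outer(x, x, "-")^2) / sum(z^2)
}

# Toy sample (made up): 6 x 6 grid with inverse-distance spatial weights
coords <- expand.grid(x = 1:6, y = 1:6)
w <- 1 / as.matrix(dist(coords))
diag(w) <- 0  # a spot is not its own neighbor

# A gene with a smooth left-to-right expression gradient
gradient_expr <- coords$x

i_grad <- morans_i(gradient_expr, w)
c_grad <- gearys_c(gradient_expr, w)
print(c(moran = i_grad, geary = c_grad))
```

Positive Moran's I and Geary's C below 1 both point to positive spatial autocorrelation, that is, neighboring spots with similar expression.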
an STlist containing spatial statistics
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
  download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
  zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
  unzip(zipfile=zip_tmp, exdir=thrane_tmp)
  # Generate the file paths to be passed to the STlist function
  count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
                            full.names=TRUE, pattern='counts')
  coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
                            full.names=TRUE, pattern='mapping')
  clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
                          full.names=TRUE, pattern='clinical')
  # Create STlist
  library('spatialGE')
  melanoma <- STlist(rnacounts=count_files,
                     spotcoords=coord_files,
                     samples=clin_file)
  melanoma <- transform_data(melanoma)
  melanoma <- SThet(melanoma, genes=c('MLANA', 'TP53'), method='moran')
  get_gene_meta(melanoma, sthet_only=TRUE)
}, error = function(e) {
  message("Could not run example. Are you connected to the internet?")
  return(NULL)
})
Creates an STlist object from one or multiple spatial transcriptomic samples.
STlist(
  rnacounts = NULL,
  spotcoords = NULL,
  samples = NULL,
  cores = NULL,
  verbose = TRUE
)
rnacounts |
the count data which can be provided in one of these formats:
|
spotcoords |
the cell/spot coordinates. Not required if inputs are Visium or Xenium (spaceranger or xeniumranger outputs).
|
samples |
the sample names/IDs and (optionally) metadata associated with
each spatial sample.
The following options are available for
|
cores |
integer indicating the number of cores to use during parallelization.
If NULL, the function uses half of the available cores at a maximum. The parallelization
uses |
verbose |
logical, whether to print text to console |
Objects of the S4 class STlist are the starting point of analyses in spatialGE.
The STlist contains data from one or multiple samples (i.e., tissue slices), and
results from most spatialGE functions are stored within the object.
Raw gene counts and spatial coordinates. Gene count data have genes in rows and sampling units (e.g., cells, spots) in columns. Spatial coordinates have sampling units in rows and three columns: sample unit IDs, Y position, and X position.
Visium outputs from Space Ranger. The Visium directory must have the directory
structure resulting from spaceranger count, with either a count matrix represented in
MEX files or a h5 file. The directory should also contain a spatial sub-directory,
with the spatial coordinates (tissue_positions_list.csv), and
optionally the high resolution tissue image and scaling factor file scalefactors_json.json.
Xenium outputs from Xenium Ranger. The Xenium directory must have the directory
structure resulting from the xeniumranger pipeline, with either a cell-feature matrix
represented in MEX files or a h5 file. The directory should also contain a parquet file,
with the spatial coordinates (cells.parquet).
CosMx-SMI outputs. Two files are required to process SMI outputs: The exprMat and
metadata files. Both files must contain the "fov" and "cell_ID" columns. In addition,
the metadata files must contain the "CenterX_local_px" and "CenterY_local_px" columns.
Seurat object (V4). A Seurat V4 object produced via Seurat::Load10X_Spatial.
Optionally, the user can input a path to a file containing a table of sample-level metadata (e.g., clinical outcomes, tissue type, age). This sample metadata file should contain sample IDs in the first column partially matching the file names of the count/coordinate file paths or Visium directories. Note: The sample ID of a given sample cannot be a substring of the sample ID of another sample. For example, instead of using "tissue1" and "tissue12", use "tissue01" and "tissue12".
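Because sample IDs are matched partially against file names, it can help to check a set of IDs for substring clashes before building the STlist. The helper below, `find_substring_ids`, is hypothetical and not part of spatialGE; it is a minimal base R sketch of such a check.

```r
# Hypothetical helper: report every sample ID that is contained in another ID
find_substring_ids <- function(ids) {
  clashes <- character(0)
  for (a in ids) {
    for (b in ids) {
      # fixed = TRUE treats the ID as a literal string, not a regular expression
      if (a != b && grepl(a, b, fixed = TRUE)) {
        clashes <- c(clashes, paste0("'", a, "' is a substring of '", b, "'"))
      }
    }
  }
  clashes
}

bad_ids <- find_substring_ids(c("tissue1", "tissue12"))
good_ids <- find_substring_ids(c("tissue01", "tissue12"))
print(bad_ids)
print(good_ids)
```

An empty result means no ID is contained in another, so partial matching against file names should be unambiguous.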
The function uses parallelization if run in a Unix system. Windows users will experience longer times depending on the number of samples.
an STlist object containing the counts and coordinates, and optionally
the sample metadata, which can be used for downstream analysis with spatialGE
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
  download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
  zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
  unzip(zipfile=zip_tmp, exdir=thrane_tmp)
  # Generate the file paths to be passed to the STlist function
  count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
                            full.names=TRUE, pattern='counts')
  coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
                            full.names=TRUE, pattern='mapping')
  clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
                          full.names=TRUE, pattern='clinical')
  # Create STlist
  library('spatialGE')
  melanoma <- STlist(rnacounts=count_files,
                     spotcoords=coord_files,
                     samples=clin_file)
  melanoma
}, error = function(e) {
  message("Could not run example. Are you connected to the internet?")
  return(NULL)
})
Definition of an STlist object class.
counts |
per-spot RNA counts |
spatial_meta |
per-spot x,y coordinates |
gene_meta |
per-gene statistics (e.g., average expression, variance, Moran's I) |
sample_meta |
data frame with metadata per sample |
tr_counts |
transformed per-spot counts |
gene_krige |
results from kriging on gene expression |
misc |
parameters and images from ST data |
Generates a plot of the location of spots/cells within a spatial sample, and colors them according to gene expression levels or spot/cell-level metadata
STplot(
  x,
  samples = NULL,
  genes = NULL,
  plot_meta = NULL,
  ks = "dtc",
  ws = NULL,
  deepSplit = NULL,
  color_pal = NULL,
  data_type = "tr",
  ptsize = NULL,
  txsize = NULL
)
x |
an STlist |
samples |
a vector of numbers indicating the ST samples to plot, or their
sample names. If a vector of numbers, it follows the order of samples in |
genes |
a vector of gene names or a named list of gene sets. In the latter case, the averaged expression of genes within the sets is plotted |
plot_meta |
a column name in |
ks |
the k values to plot or 'dtc' to plot results from |
ws |
the spatial weights to plot samples if |
deepSplit |
a logical or positive number indicating the |
color_pal |
a string of a color palette from |
data_type |
one of 'tr' or 'raw', to plot transformed or raw counts respectively |
ptsize |
a number specifying the size of the points. Passed to the |
txsize |
a number controlling the size of the text in the plot title and legend title. Passed to the |
The function takes an STlist and plots the cells or spots in their spatial context.
The users can color the spots/cells according to the expression of selected genes,
cluster memberships, or any spot/cell level metadata included in x@spatial_meta.
The function also can average expression of gene sets.
a list of plots
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
  download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
  zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
  unzip(zipfile=zip_tmp, exdir=thrane_tmp)
  # Generate the file paths to be passed to the STlist function
  count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
                            full.names=TRUE, pattern='counts')
  coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
                            full.names=TRUE, pattern='mapping')
  clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
                          full.names=TRUE, pattern='clinical')
  # Create STlist
  library('spatialGE')
  melanoma <- STlist(rnacounts=count_files,
                     spotcoords=coord_files,
                     samples=clin_file)
  melanoma <- transform_data(melanoma)
  # The argument is spelled out in full as 'genes', matching the usage above
  STplot(melanoma, genes='MLANA', samples='ST_mel1_rep2', ptsize=1)
}, error = function(e) {
  message("Could not run example. Are you connected to the internet?")
  return(NULL)
})
Produces a gene expression surface from kriging interpolation of ST data.
STplot_interpolation(
  x = NULL,
  genes = NULL,
  top_n = 10,
  samples = NULL,
  color_pal = "BuRd"
)
x |
an STlist containing results from |
genes |
a vector of gene names (one or several) to plot. If 'top', the 10 genes with highest standard deviation from each spatial sample are plotted. |
top_n |
an integer indicating how many top genes to perform kriging. Default is 10. |
samples |
a vector indicating the spatial samples to plot. If vector of numbers,
it follows the order of |
color_pal |
a color scheme from |
This function produces a gene expression surface plot via kriging for one or several genes and spatial samples
a list of plots
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
  download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
  zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
  unzip(zipfile=zip_tmp, exdir=thrane_tmp)
  # Generate the file paths to be passed to the STlist function
  count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
                            full.names=TRUE, pattern='counts')
  coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
                            full.names=TRUE, pattern='mapping')
  clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
                          full.names=TRUE, pattern='clinical')
  # Create STlist
  library('spatialGE')
  melanoma <- STlist(rnacounts=count_files,
                     spotcoords=coord_files,
                     samples=clin_file)
  melanoma <- transform_data(melanoma)
  melanoma <- gene_interpolation(melanoma, genes=c('MLANA', 'COL1A1'),
                                 samples='ST_mel1_rep2')
  kp = STplot_interpolation(melanoma, genes=c('MLANA', 'COL1A1'),
                            samples='ST_mel1_rep2')
  ggpubr::ggarrange(plotlist=kp)
}, error = function(e) {
  message("Could not run example. Are you connected to the internet?")
  return(NULL)
})
Produces a data frame with counts per gene and counts per ROI/spot/cell
summarize_STlist(x = NULL)
x |
an STlist |
The function creates a table with counts per gene and counts per region of interest (ROI), spot, or cell in the samples stored in the STlist
a data frame
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
  download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
  zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
  unzip(zipfile=zip_tmp, exdir=thrane_tmp)
  # Generate the file paths to be passed to the STlist function
  count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
                            full.names=TRUE, pattern='counts')
  coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
                            full.names=TRUE, pattern='mapping')
  clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'),
                          full.names=TRUE, pattern='clinical')
  # Create STlist
  library('spatialGE')
  melanoma <- STlist(rnacounts=count_files,
                     spotcoords=coord_files,
                     samples=clin_file)
  # Summarize counts per gene and per spot for the samples in the STlist
  summarize_STlist(melanoma)
}, error = function(e) {
  message("Could not run example. Are you connected to the internet?")
  return(NULL)
})
Prints an overview/summary of an STlist object.
## S4 method for signature 'STlist'
summary(object)
object |
an STlist object to summarize. |
This function takes an STlist and prints the number of spatial arrays in that object, along with other information about the object.
Returns a character vector with the names of the tissue samples in the STlist.
tissue_names(x)
x |
an STlist object |
a character vector with the sample names in the STlist object
Applies data transformation methods to spatial transcriptomics samples within an STlist
transform_data( x = NULL, method = "log", scale_f = 10000, sct_n_regr_genes = 3000, sct_min_cells = 5, cores = NULL )
x |
an STlist with raw count matrices. |
method |
one of |
scale_f |
the scale factor used in logarithmic transformation |
sct_n_regr_genes |
the number of genes to be used in the regression model
during SCTransform. The function |
sct_min_cells |
The minimum number of spots/cells to be used in the regression
model fit by |
cores |
integer indicating the number of cores to use during parallelization.
If NULL, the function uses half of the available cores at a maximum. The parallelization
uses |
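The default for `cores` described above (at most half of the available cores) can be sketched as follows. This is an illustrative helper only: `choose_cores` is a hypothetical name, not a spatialGE function, and capping the value at the number of samples and falling back to one core on Windows are assumptions rather than documented behavior.

```r
library(parallel)

# Hypothetical helper (not part of spatialGE): pick a core count when
# `cores` is NULL, using at most half of the detected cores.
choose_cores <- function(n_samples, cores = NULL) {
  # Forking is not available on Windows, so fall back to a single core
  if (.Platform$OS.type == "windows") {
    return(1L)
  }
  avail <- parallel::detectCores()
  if (is.na(avail)) {
    avail <- 1L  # detectCores() may return NA on some systems
  }
  half <- max(1L, avail %/% 2L)
  if (is.null(cores)) {
    # Assumption: there is no benefit in using more cores than samples
    cores <- min(half, n_samples)
  }
  as.integer(max(1L, cores))
}

n_cores <- choose_cores(n_samples = 4)
```

Passing an explicit integer to `cores` in `transform_data` makes a run reproducible across machines with different core counts.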
This function takes an STlist with raw counts and performs data transformation.
The user has the option to select between log transformation after library size
normalization (method='log'), or SCTransform (method='sct'). In the case of
logarithmic transformation, a scaling factor (10^4 by default) is applied. The
function is parallelized via "forking", which is not available on Windows.
Note that method='sct' returns a matrix with fewer genes, because genes with
low expression are filtered out.
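As a rough illustration of what log transformation after library size normalization means, the base R sketch below applies it to a toy count matrix. It is not the package's implementation: the toy data are made up, and the pseudocount of 1 (via `log1p`) is an assumption, since the text above only states that a scaling factor is applied before taking the logarithm.

```r
# Toy raw count matrix (made-up values): genes in rows, spots in columns
counts <- matrix(
  c(0, 5, 10,
    2, 3,  0,
    8, 2, 10),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("spot1", "spot2", "spot3"))
)

scale_f <- 10000  # same default as the scale_f argument

# Library size: total counts per spot
lib_size <- colSums(counts)

# Divide each column by its library size, then multiply by the scale factor
norm_counts <- sweep(counts, 2, lib_size, "/") * scale_f

# Log-transform; the +1 pseudocount (log1p) is an assumption that keeps zeros at zero
log_counts <- log1p(norm_counts)
```

After normalization every spot sums to `scale_f`, so differences between spots reflect relative expression rather than sequencing depth.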
x an updated STlist with transformed counts.
# Using included melanoma example (Thrane et al.)
# Download example data set from spatialGE_Data
thrane_tmp = tempdir()
unlink(thrane_tmp, recursive=TRUE)
dir.create(thrane_tmp)
lk='https://github.com/FridleyLab/spatialGE_Data/raw/refs/heads/main/melanoma_thrane.zip?download='
tryCatch({ # In case data is not available from network
  download.file(lk, destfile=paste0(thrane_tmp, '/', 'melanoma_thrane.zip'), mode='wb')
  zip_tmp = list.files(thrane_tmp, pattern='melanoma_thrane.zip$', full.names=TRUE)
  unzip(zipfile=zip_tmp, exdir=thrane_tmp)
  # Generate the file paths to be passed to the STlist function
  count_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'), full.names=TRUE, pattern='counts')
  coord_files <- list.files(paste0(thrane_tmp, '/melanoma_thrane'), full.names=TRUE, pattern='mapping')
  clin_file <- list.files(paste0(thrane_tmp, '/melanoma_thrane'), full.names=TRUE, pattern='clinical')
  # Create STlist
  library('spatialGE')
  melanoma <- STlist(rnacounts=count_files, spotcoords=coord_files, samples=clin_file)
  # Apply the default log transformation
  melanoma <- transform_data(melanoma)
}, error = function(e) {
  message("Could not run example. Are you connected to the internet?")
  return(NULL)
})