| Title: | Detection and Spatial Analysis of Tertiary Lymphoid Structures |
|---|---|
| Description: | Fast, reproducible detection and quantitative analysis of tertiary lymphoid structures (TLS) in multiplexed tissue imaging. Implements the Independent Component Analysis Trace (ICAT) index, local Ripley's K scanning, automated K-Nearest-Neighbor (KNN)-based TLS detection, and T-cell cluster identification, as described in Amiryousefi et al. (2025) <doi:10.1101/2025.09.21.677465>. |
| Authors: | Ali Amiryousefi [aut, cre] (ORCID: <https://orcid.org/0000-0002-6317-3860>), Jeremiah Wala [aut] (ORCID: <https://orcid.org/0000-0001-6591-1620>), Peter Sorger [ctb] (ORCID: <https://orcid.org/0000-0002-3364-1838>) |
| Maintainer: | Ali Amiryousefi <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.3.0 |
| Built: | 2026-05-24 10:16:58 UTC |
| Source: | https://github.com/cran/tlsR |
Fast, reproducible detection and quantitative analysis of tertiary lymphoid structures (TLS) in multiplexed tissue imaging data.
1. Load or prepare a named list of data frames (ldata), one per tissue sample. Each data frame must contain the columns x and y (spatial coordinates in microns) and phenotype (character: "B cell" / "T cell" / other).
2. Run detect_TLS to label B+T co-localised regions.
3. (Optional) Run scan_clustering to identify windows of significant immune clustering via local Ripley's L.
4. Run calc_icat to score the internal linearity/organisation of each detected TLS.
5. Run detect_tic to identify T-cell clusters outside TLS.
6. Use summarize_TLS to obtain a tidy summary table.
7. Use plot_TLS to produce publication-ready spatial plots.
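The input format described above can be sketched in base R as follows. This is only an illustration: the sample name `SampleA`, the simulated coordinates, and the helper `check_ldata()` are hypothetical and are not part of the package.

```r
# Build a hypothetical one-sample ldata list (simulated coordinates, not real data).
set.seed(1)
n <- 200
ldata_demo <- list(
  SampleA = data.frame(
    x = runif(n, 0, 1000),   # microns
    y = runif(n, 0, 1000),   # microns
    phenotype = sample(c("B cell", "T cell", "Other"), n, replace = TRUE),
    stringsAsFactors = FALSE
  )
)

# Hypothetical helper: check that every element is a data frame with the
# required columns and column types.
check_ldata <- function(ldata) {
  stopifnot(is.list(ldata), !is.null(names(ldata)), all(nzchar(names(ldata))))
  for (nm in names(ldata)) {
    df <- ldata[[nm]]
    stopifnot(
      is.data.frame(df),
      all(c("x", "y", "phenotype") %in% names(df)),
      is.numeric(df$x), is.numeric(df$y),
      is.character(df$phenotype)
    )
  }
  invisible(TRUE)
}

check_ldata(ldata_demo)
```

Running a check of this kind before the detection steps makes missing columns or factor-typed phenotype labels fail early with a clear error.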
Amiryousefi et al. (2025) doi:10.1101/2025.09.21.677465
Quantifies the spatial spread and linear organisation of cells within a detected TLS. FastICA is applied to the (x, y) coordinates of TLS cells to estimate independent components; the mixing matrix is used to reconstruct the data, and the ICAT index is defined as the normalised trace-standard-deviation of the reconstructed coordinates.
The index is always non-negative because it measures the average spatial spread per cell rather than the signed trace of the mixing matrix (which can be negative due to ICA sign ambiguity). Higher values indicate a more spatially extended, structured cluster.
calc_icat(patientID, tlsID, ldata = NULL)
patientID |
Character. Sample name in |
tlsID |
Numeric or integer. TLS identifier (value of |
ldata |
Named list of data frames, or |
The ICAT index is computed as follows:
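1. Centre the (x, y) coordinates of TLS cells.
2. Run fastICA with 2 components.
3. Reconstruct $\hat{X} = S A$, where $S$ is the estimated source matrix and $A$ the estimated mixing matrix.
4. Let $\sigma_x^2, \sigma_y^2$ be the marginal variances of $\hat{X}$.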
If the requested TLS contains fewer than 3 cells, or FastICA does not
converge, the function returns NA_real_ with an informative message
rather than throwing an error.
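The four steps listed above can be sketched with the fastICA package from CRAN. This is a hedged illustration, not the package's implementation: the helper name `icat_steps()` and the simulated coordinates are hypothetical, the reconstruction is assumed to be the product of the source matrix `S` and the mixing matrix `A` returned by `fastICA::fastICA()`, and the final normalisation into the ICAT index is deliberately left out because it is not spelled out here.

```r
# Sketch of the centring, ICA, reconstruction, and variance steps only.
icat_steps <- function(xy, seed = 1) {
  stopifnot(is.matrix(xy), ncol(xy) == 2, nrow(xy) >= 3)
  set.seed(seed)                                  # fastICA starts from a random matrix
  xc    <- scale(xy, center = TRUE, scale = FALSE)  # centre the coordinates
  ica   <- fastICA::fastICA(xc, n.comp = 2)         # two independent components
  recon <- ica$S %*% ica$A                          # assumed reconstruction: sources x mixing
  vars  <- apply(recon, 2, var)                     # marginal variances of the reconstruction
  list(reconstruction = recon, variances = vars)
}

# Only run the demo when fastICA is installed.
if (requireNamespace("fastICA", quietly = TRUE)) {
  set.seed(42)
  xy <- cbind(x = rnorm(100, sd = 30), y = rnorm(100, sd = 10))  # simulated TLS cells
  res <- icat_steps(xy)
  print(res$variances)
}
```

The sketch does not handle non-convergence; a fuller version would need to detect that case and return `NA_real_`, as the text above describes.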
A single non-negative numeric value (the ICAT index), or
NA_real_ if computation is not possible (fewer than 3 cells, or
FastICA did not converge).
data(toy_ldata)
ldata <- detect_TLS("ToySample", k = 30, ldata = toy_ldata)
if (max(ldata[["ToySample"]]$tls_id_knn, na.rm = TRUE) > 0) {
  icat <- calc_icat("ToySample", tlsID = 1, ldata = ldata)
  print(icat)
}
Applies HDBSCAN to T cells that lie outside of previously detected TLS
regions to identify spatially compact T-cell clusters (TIC). Phenotype
labels "T cell" and "T cells" are both accepted.
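The selection of T cells outside TLS regions might look like the following base R sketch. The mock data frame is made up, and treating `tls_id_knn == 0` as "outside TLS" is an assumption based on the column description given for detect_TLS; the package's internal filtering may differ.

```r
# Mock data frame with the columns the text refers to (values are made up).
cells <- data.frame(
  x = c(10, 20, 30, 40, 50),
  y = c(5, 15, 25, 35, 45),
  phenotype = c("T cell", "T cells", "B cell", "T cell", "Other"),
  tls_id_knn = c(0L, 1L, 1L, 0L, 0L),
  stringsAsFactors = FALSE
)

# Accept both the singular and the plural phenotype label.
is_t_cell <- cells$phenotype %in% c("T cell", "T cells")

# Assumption: a TLS ID of 0 marks a cell that lies outside every detected TLS.
outside_tls <- cells$tls_id_knn == 0L

# T cells outside TLS are the candidates passed on to clustering.
tic_candidates <- cells[is_t_cell & outside_tls, ]
tic_candidates
```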
detect_tic(sample, min_pts = 10L, min_cluster_size = 10L, ldata = NULL)
sample |
Character. Sample name in |
min_pts |
Integer. HDBSCAN |
min_cluster_size |
Integer. Minimum number of T cells for a HDBSCAN
cluster to be retained; smaller clusters are merged back into noise
(label |
ldata |
Named list of data frames, or |
The input ldata list with the sample data frame augmented by
one new column:
tcell_cluster_hdbscan: Integer. 0 = noise / not a T-cell cluster; positive integer = TIC cluster ID. Non-T-cell rows receive NA.
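The three kinds of values in this column (NA, 0, positive ID) can be tallied with base R. The vector below is a made-up stand-in for the real column and is used only to show how the encoding is read.

```r
# Mock column values: NA = non-T cell, 0 = noise, positive = TIC cluster ID.
tcell_cluster_hdbscan <- c(NA, 0L, 1L, 1L, 2L, NA, 0L, 2L, 2L)

# Keep only cells assigned to a T-cell cluster.
ids <- tcell_cluster_hdbscan[!is.na(tcell_cluster_hdbscan) & tcell_cluster_hdbscan > 0]

n_tic     <- length(unique(ids))                         # number of T-cell clusters
tic_sizes <- table(ids)                                  # T cells per cluster
n_noise   <- sum(tcell_cluster_hdbscan == 0, na.rm = TRUE)  # T cells labelled as noise
```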
data(toy_ldata)
ldata <- detect_TLS("ToySample", k = 30, ldata = toy_ldata)
ldata <- detect_tic("ToySample", ldata = ldata)
table(ldata[["ToySample"]]$tcell_cluster_hdbscan, useNA = "ifany")
Identifies TLS candidates by finding regions of high local B-cell density
that also contain a sufficient number of nearby T cells (B+T
co-localisation). Phenotype labels "B cell" and "B cells"
(and their T-cell equivalents) are both accepted.
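The idea of a k-nearest-neighbour density estimate can be illustrated with a brute-force base R sketch. This is not the package's implementation: the helper `mean_knn_distance()` and the simulated points are hypothetical, and the package may use a different estimator, distance structure, or thresholding rule.

```r
# Mean distance to the k nearest neighbours of every point (brute force, O(n^2)).
mean_knn_distance <- function(xy, k) {
  stopifnot(is.matrix(xy), nrow(xy) > k)
  d <- as.matrix(dist(xy))
  # After sorting, element 1 is the self-distance (0), so neighbours are 2..(k + 1).
  apply(d, 1, function(row) mean(sort(row)[2:(k + 1)]))
}

set.seed(7)
dense  <- cbind(rnorm(60, mean = 100, sd = 5), rnorm(60, mean = 100, sd = 5))  # tight aggregate
sparse <- cbind(runif(60, 300, 1000), runif(60, 300, 1000))                   # scattered cells
knn <- mean_knn_distance(rbind(dense, sparse), k = 10)

# Points in the aggregate have much smaller k-NN distances than scattered points.
c(dense = median(knn[1:60]), sparse = median(knn[61:120]))
```

A small mean k-NN distance corresponds to a high local density, which is the property a density threshold would act on.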
detect_TLS(
  LSP,
  ldata,
  k = 30L,
  bcell_density_threshold = 10,
  min_B_cells = 50L,
  min_T_cells_nearby = 10L,
  max_distance_T = 50,
  expand_distance = 80
)
LSP |
Character. Sample name in |
k |
Integer. Number of nearest neighbours used for density estimation
(default 30). |
bcell_density_threshold |
Numeric. Minimum average 1/k-distance (in
microns) for a B cell to be considered locally dense (default 10). |
min_B_cells |
Integer. Minimum B cells per candidate TLS cluster
(default 50). |
min_T_cells_nearby |
Integer. Minimum T cells within
|
max_distance_T |
Numeric. Search radius (microns) for T-cell proximity
check (default 50). |
expand_distance |
Integer. Distance beyond the boundary of the detected B-cell clusters within which T cells are integrated (default 80). |
ldata |
Named list of data frames, or |
The similarly formatted ldata list, with the data frame for LSP
augmented by three new columns:
tls_id_knn: Integer. 0 = non-TLS cell; positive integer = TLS cluster ID.
tls_center_x: Numeric. X coordinate of the TLS centre for TLS cells; NA otherwise.
tls_center_y: Numeric. Y coordinate of the TLS centre for TLS cells; NA otherwise.
# Use a 70% sample of the data to keep CRAN check time under 10s.
# TLS detection requires sufficient cell density; 70% preserves
# the spatial structure needed for reliable detection.
# For production use, run on the full dataset (see the full-dataset example below).
data(toy_ldata)
set.seed(42)
idx <- sample(nrow(toy_ldata[["ToySample"]]),
              size = floor(0.7 * nrow(toy_ldata[["ToySample"]])))
sub_ldata <- list(ToySample = toy_ldata[["ToySample"]][idx, ])
ldata <- detect_TLS("ToySample", k = 30, ldata = sub_ldata)
table(ldata[["ToySample"]]$tls_id_knn)
plot(ldata[["ToySample"]]$x, ldata[["ToySample"]]$y,
     col = ifelse(ldata[["ToySample"]]$tls_id_knn > 0, "red", "gray"),
     pch = 19, cex = 0.5, main = "Detected TLS (70% sample)")

# Full dataset with default settings
data(toy_ldata)
ldata <- detect_TLS("ToySample", k = 30, ldata = toy_ldata)
table(ldata[["ToySample"]]$tls_id_knn)
plot(ldata[["ToySample"]]$x, ldata[["ToySample"]]$y,
     col = ifelse(ldata[["ToySample"]]$tls_id_knn > 0, "red", "gray"),
     pch = 19, cex = 0.5, main = "Detected TLS in toy data")
Produces a ggplot2 scatter plot of cell positions, coloured by
TLS membership, T-cell cluster membership, and background phenotype.
Background (non-TLS, non-TIC) cells are rendered with a lower alpha to keep them visually recessive, while TIC cells are drawn slightly larger than TLS cells so they stand out without dominating the plot.
plot_TLS(
  sample,
  ldata = NULL,
  show_tic = TRUE,
  point_size = 0.5,
  alpha = 0.7,
  bg_alpha = 0.25,
  tic_size_mult = 1.8,
  tls_palette = c("#0072B2", "#009E73", "#CC79A7", "#D55E00", "#56B4E9", "#F0E442"),
  tic_colour = "#E69F00",
  bg_colour = "grey80"
)
sample |
Character. Sample name in |
ldata |
Named list of data frames, or |
show_tic |
Logical. Colour T-cell clusters (if |
point_size |
Numeric. Base point size for TLS cells and background
cells (default |
alpha |
Numeric. Point transparency for TLS and TIC cells
(default |
bg_alpha |
Numeric. Point transparency for background (non-TLS,
non-TIC) cells (default |
tic_size_mult |
Numeric. Multiplier applied to |
tls_palette |
Character vector of colours for TLS IDs. Recycled if there are more TLS than colours. Default uses a colourblind-friendly palette. |
tic_colour |
Character. Colour for T-cell cluster cells
(default |
bg_colour |
Character. Colour for non-TLS, non-TIC cells
(default |
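Palette recycling, as mentioned for tls_palette, can be illustrated in base R. The eight TLS IDs are hypothetical, and `rep_len()` is just one way to recycle colours; the package's internal logic may differ.

```r
# Default palette taken from the usage signature (six colours).
tls_palette <- c("#0072B2", "#009E73", "#CC79A7", "#D55E00", "#56B4E9", "#F0E442")

# Hypothetical case: eight detected TLS, i.e. more TLS than colours.
tls_ids <- 1:8

# Recycle the palette so that every TLS ID gets a colour; IDs 7 and 8 reuse
# the first and second colours.
tls_colours <- setNames(rep_len(tls_palette, length(tls_ids)), tls_ids)
tls_colours
```

When many TLS are present, supplying a longer palette avoids two structures sharing a colour.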
A ggplot object (invisibly). The plot is also printed unless
the return value is assigned.
data(toy_ldata)
ldata <- detect_TLS("ToySample", k = 30, ldata = toy_ldata)
p <- plot_TLS("ToySample", ldata = ldata)
Applies a sliding-window Ripley's L analysis across the tissue to produce
a spatial clustering map. For each window a K-integral clustering index (CI)
is computed as the mean positive excess of the observed L function over its
theoretical CSR value. When plot = TRUE a base-graphics spatial
map is drawn with LOESS-smoothed L-excess curves and numeric CI labels
overlaid inside each qualifying window, plus a legend identifying point
and curve colours.
scan_clustering(
  ws = 500,
  sample,
  phenotype = c("T cells", "B cells", "Both"),
  plot = TRUE,
  creep = 1L,
  min_cells = 10L,
  min_phen_cells = 5L,
  label_cex = 1.1,
  ldata = NULL
)
ws |
Numeric. Window side length in microns (default |
sample |
Character. Sample name in |
phenotype |
One of |
plot |
Logical. Draw the spatial clustering map?
(default |
creep |
Integer. Grid density factor; |
min_cells |
Integer. Minimum total cell count required in a window
before it is analysed (default |
min_phen_cells |
Integer. Minimum phenotype-specific cell count per
window (default |
label_cex |
Numeric. Base character expansion for the CI numeric
labels drawn inside each window (default |
ldata |
Named list of data frames, or |
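The K-integral clustering index for window $w$ is:

$$\mathrm{CI}_w = \frac{1}{n_w} \sum_{r \,:\, \hat{L}_w(r) > L_{\mathrm{theo}}(r)} \left( \hat{L}_w(r) - L_{\mathrm{theo}}(r) \right)$$

where $n_w$ is the number of spatial lags $r$ where the observed L, $\hat{L}_w(r)$, exceeds
the theoretical CSR value $L_{\mathrm{theo}}(r)$.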
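The "mean positive excess" definition can be written as a few lines of base R. The helper `k_integral_index()` and the observed L values are hypothetical; returning 0 when no lag shows a positive excess is an assumption of this sketch, not documented package behaviour.

```r
# Mean of the positive differences between observed and theoretical L values.
k_integral_index <- function(L_obs, L_theo) {
  stopifnot(length(L_obs) == length(L_theo))
  excess   <- L_obs - L_theo
  positive <- excess[!is.na(excess) & excess > 0]
  if (length(positive) == 0) return(0)  # assumption: no positive excess gives 0
  mean(positive)
}

r      <- seq(0, 50, by = 10)        # spatial lags in microns
L_theo <- r                          # under CSR the L function is L(r) = r
L_obs  <- c(0, 14, 26, 33, 39, 48)   # made-up observed values

# Excess is 0, 4, 6, 3, -1, -2, so the index is the mean of 4, 6, and 3.
k_integral_index(L_obs, L_theo)
```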
When plot = TRUE the map shows:
All cells as small light-grey points.
Phenotype cells (T cells green, B cells red).
Navy dashed grid lines marking window boundaries.
A LOESS-smoothed L-excess curve inside each qualifying window.
A bold numeric CI label centred in the window.
A legend identifying all point and curve colours.
When phenotype = "Both", two side-by-side panels are produced,
one for B cells and one for T cells, so the two clustering maps can be
compared directly on the same spatial layout.
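The windowing step can be approximated in base R by binning cells into squares of side `ws` and keeping only windows with enough cells. The helper `window_counts()` and the demo coordinates are hypothetical; the sketch uses a fixed, non-overlapping grid and does not model `creep` or the phenotype-specific threshold.

```r
# Assign each cell to a square window of side `ws` and keep windows that
# contain at least `min_cells` cells.
window_counts <- function(df, ws, min_cells = 10L) {
  wx <- floor((df$x - min(df$x)) / ws)   # window column index
  wy <- floor((df$y - min(df$y)) / ws)   # window row index
  counts <- as.data.frame(table(wx = wx, wy = wy),
                          responseName = "n_cells",
                          stringsAsFactors = FALSE)
  counts[counts$n_cells >= min_cells, ]
}

# Twelve cells fall in the first 200-micron window and three in the second.
demo <- data.frame(x = c(seq(0, 110, by = 10), 250, 260, 270), y = rep(50, 15))
kept <- window_counts(demo, ws = 200, min_cells = 10L)
kept
```

Only the first window survives the filter, which mirrors how sparsely populated windows are skipped before any L function is estimated.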
A named list with elements B and/or T (depending on
phenotype), each containing the Lest objects for all
qualifying windows of that phenotype. Returned invisibly when
plot = TRUE.
data(toy_ldata)
L_models <- scan_clustering(
  ws = 200,
  sample = "ToySample",
  phenotype = "B cells",
  plot = TRUE,
  ldata = toy_ldata
)
cat("B-cell windows analysed:", length(L_models$B), "\n")

# Side-by-side B and T cell panels
L_both <- scan_clustering(
  ws = 200,
  sample = "ToySample",
  phenotype = "Both",
  plot = TRUE,
  ldata = toy_ldata
)
Produces a tidy data.frame with one row per sample summarising the
number of detected TLS, their sizes, and (optionally) ICAT scores.
summarize_TLS(ldata, calc_icat_scores = FALSE)
ldata |
Named list of data frames as returned by |
calc_icat_scores |
Logical. Should ICAT scores be computed for each TLS
and appended as a list-column? Default |
A data.frame with columns:
sample: Sample name.
n_TLS: Number of TLS detected.
total_cells: Total cells in the sample.
TLS_cells: Number of cells assigned to any TLS.
TLS_fraction: Fraction of all cells that are TLS cells.
mean_TLS_size: Mean cells per TLS (NA if n_TLS = 0).
n_TIC: Number of T-cell clusters detected by detect_tic (NA if not yet run).
icat_scores: List-column of ICAT scores per TLS (only when calc_icat_scores = TRUE).
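The count-based columns can be reproduced by hand from a `tls_id_knn` column, which helps when checking a summary. The mock data and the sample name `MockSample` are made up, and this sketch is not the package's code.

```r
# Mock TLS assignments: 0 = non-TLS cell, positive = TLS cluster ID.
cells <- data.frame(tls_id_knn = c(0L, 0L, 1L, 1L, 1L, 2L, 2L, 0L, 0L, 0L))

# Number of cells in each TLS (IDs greater than 0 only).
tls_sizes <- as.vector(table(cells$tls_id_knn[cells$tls_id_knn > 0]))

summary_row <- data.frame(
  sample        = "MockSample",
  n_TLS         = length(tls_sizes),
  total_cells   = nrow(cells),
  TLS_cells     = sum(tls_sizes),
  TLS_fraction  = sum(tls_sizes) / nrow(cells),
  mean_TLS_size = if (length(tls_sizes) > 0) mean(tls_sizes) else NA_real_
)
summary_row
```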
data(toy_ldata)
ldata <- detect_TLS("ToySample", k = 30, ldata = toy_ldata)
summarize_TLS(ldata)
A small synthetic dataset mimicking multiplexed tissue imaging data, used in
package examples and tests. The list contains one sample named
"ToySample".
toy_ldata
A named list with one element:
ToySample: A data.frame with the following columns:
  x: Numeric. X coordinate in microns.
  y: Numeric. Y coordinate in microns.
  phenotype: Character. Cell phenotype label. Values are "B cell", "T cell", and "Other".
Synthetically generated for package examples.
Amiryousefi et al. (2025) doi:10.1101/2025.09.21.677465
data(toy_ldata)
str(toy_ldata)
table(toy_ldata[["ToySample"]]$phenotype)