lstar is a lightweight
interchange layer for single-cell and spatial omics. A
dataset is a set of axes (labelled sets you index by —
cells, genes, pca) and
fields (typed data over a tuple of axes — counts,
embeddings, graphs, labels), serialized to a portable
Zarr store that R, Python, and C++ all read and write.
Format conversion is then just write_Y(read_X(obj)) with
the L* store as the universal intermediate, and what a target cannot
hold is recorded in ds$dropped rather than silently
lost.
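The idea of recording what a target cannot hold can be sketched in base R. This is a toy illustration under stated assumptions, not lstar's implementation: the helper `toy_write_target`, its `supported_roles` argument, the `"graph"` role name, and the shape of the `dropped` record are all hypothetical and chosen only for this example.

```r
# Toy sketch (hypothetical helper, not part of lstar): keep the fields a
# target format can represent and record the rest instead of discarding them.
toy_write_target <- function(ds, supported_roles) {
  roles <- vapply(ds$fields, function(f) f$role, character(1))
  keep <- roles %in% supported_roles
  list(
    fields = ds$fields[keep],
    # Names and roles of everything the target could not hold.
    dropped = data.frame(field = names(ds$fields)[!keep],
                         role = unname(roles[!keep]),
                         stringsAsFactors = FALSE)
  )
}

# A made-up dataset skeleton: only roles and spans, no values.
ds_toy <- list(fields = list(
  counts  = list(role = "measure", span = c("cells", "genes")),
  cluster = list(role = "label",   span = "cells"),
  knn     = list(role = "graph",   span = c("cells", "cells"))))

out <- toy_write_target(ds_toy, supported_roles = c("measure", "label"))
names(out$fields)   # fields that crossed
out$dropped         # fields recorded as dropped
```

The point of the design is that a lossy conversion stays inspectable: the caller can check the dropped record after the fact instead of diffing two objects by hand.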
Everything below runs with only the base dependencies
(Matrix); no Seurat/SCE needed.
library(lstar)
cells <- paste0("c", 1:6); genes <- paste0("g", 1:4)
m <- as(matrix(as.numeric(1:24), 6, 4, dimnames = list(cells, genes)), "CsparseMatrix") # cells x genes
ds <- list(
  kind = "sample",
  axes = list(
    cells = list(labels = cells, origin = "observed", role = "observation"),
    genes = list(labels = genes, origin = "observed", role = "feature")),
  fields = list(
    counts = list(role = "measure", span = c("cells", "genes"), state = "raw", values = m),
    cluster = list(role = "label", span = "cells", values = factor(c("a", "a", "b", "b", "a", "b")))))
class(ds) <- "lstar_dataset"
p <- tempfile(fileext = ".lstar.zarr")
lstar_write(ds, p) # -> a portable Zarr store (also readable from Python and C++)
ds2 <- lstar_read(p)
ds2
#> lstar_dataset (sample): 2 axes, 2 fields
#> axis cells 6
#> axis genes 4
#> field counts measure [cells x genes]
#> field cluster label [cells]

A categorical label over cells induces a factor axis whose labels are its categories, so independent per-group results align on one axis.
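The alignment property can be illustrated in base R without lstar. The sketch below is an illustration of the concept only: the variable names (`cluster_axis`, `group_size`, `group_mean`) are made up for this example, and it does not show how lstar represents the induced axis internally.

```r
# Base-R sketch: a categorical label over cells induces an axis whose
# labels are the factor levels; per-group results are indexed by that axis.
cluster <- factor(c("a", "a", "b", "b", "a", "b"))
cluster_axis <- levels(cluster)   # labels of the induced axis

counts <- matrix(as.numeric(1:24), 6, 4,
                 dimnames = list(paste0("c", 1:6), paste0("g", 1:4)))

# Two independently computed per-group results.
group_size <- table(cluster)                                   # cells per cluster
group_mean <- rowsum(counts, cluster) / as.vector(group_size)  # clusters x genes

# Both carry the same labels, so they align on the induced axis.
stopifnot(identical(names(group_size), cluster_axis),
          identical(rownames(group_mean), cluster_axis))
```

Because both results are keyed by the same labels, they can be joined by name rather than by position.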
The profiles map the shared-vocabulary core — counts, normalized/scaled expression, PCA (scores and gene loadings), UMAP/t-SNE, clusterings, cell/gene metadata — between formats. (Not evaluated here, to keep the vignette dependency-free.)
so <- write_seurat(ds) # L* dataset -> Seurat object
ds3 <- read_seurat(so) # Seurat -> L* dataset
sce <- write_sce(read_seurat(so)) # Seurat -> SingleCellExperiment, in one line

Cross-language conversions go through the on-disk store: write it on one side and read it on the other, with no shared memory and no format re-implementation.

lstar convert command line

The Python package ships a one-command CLI that detects formats by path, bridges R and Python through the store automatically, and reports what crossed (and what was dropped):
lstar convert pbmc.h5ad pbmc.rds --report # AnnData -> Seurat, with a fidelity report
lstar convert pbmc.rds pbmc.h5ad --check # + open the result in its native library and smoke-test it

--backend auto|native|direct adds a package-free fallback: .h5ad converts with only h5py (no anndata), and a Seurat .rds reads and writes with base R + this package (no SeuratObject); an SCE .rds reads package-free. See vignette topics and the package website for the full conversion matrix.
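
Detecting a format by path, as the CLI is described as doing, can be pictured with a short base-R sketch. This is an assumption-laden illustration, not the CLI's actual logic: the function `guess_format` and the format names it returns are hypothetical, and the real tool may inspect more than the file extension.

```r
# Hypothetical sketch (not the CLI's actual logic): guess a format from the
# file extension, as a path-based detector might.
guess_format <- function(path) {
  ext <- tolower(tools::file_ext(path))
  switch(ext,
         h5ad = "anndata",
         rds  = "r-object",    # Seurat or SCE; the extension alone cannot tell
         zarr = "lstar-store",
         stop("unrecognised extension: ", ext))
}

guess_format("pbmc.h5ad")
guess_format("pbmc.rds")
```

An extension check like this cannot tell a Seurat .rds from an SCE .rds, so a detector of this kind would still need to open the file to decide which reader applies.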