Package: BigDataStatMeth 2.0.2

Dolors Pelegri-Siso

BigDataStatMeth: Scalable Statistical Computing with HDF5-Backed Matrices

A framework for 'scalable' statistical computing on large on-disk matrices stored in 'HDF5' files. It provides efficient block-wise implementations of core linear-algebra operations (matrix multiplication, SVD, PCA, QR decomposition, and canonical correlation analysis) written in C++ and R. These building blocks are designed not only for direct use, but also as foundational components for developing new statistical methods that must operate on datasets too large to fit in memory. The package supports data provided either as 'HDF5' files or standard R objects, and is intended for high-dimensional applications such as 'omics' and precision-medicine research.
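
As a rough illustration of what "block-wise" means for matrix multiplication, the sketch below uses base R only and does not call or reproduce the package's C++ routines. The helper name `block_matmul` and its `block_size` argument are hypothetical. It accumulates the product one slice of the inner dimension at a time, so only a column block of `A` and the matching row block of `B` are needed at each step. With on-disk data, each slice would be read from the file instead of being subset from an in-memory matrix.

```r
# Hypothetical helper (not part of BigDataStatMeth): compute A %*% B while
# touching only block_size-wide slices of the operands at any one time.
block_matmul <- function(A, B, block_size = 2L) {
  stopifnot(ncol(A) == nrow(B), ncol(A) >= 1L, block_size >= 1L)
  C <- matrix(0, nrow = nrow(A), ncol = ncol(B))
  # Walk over the shared (inner) dimension in chunks.
  starts <- seq(1L, ncol(A), by = block_size)
  for (s in starts) {
    idx <- s:min(s + block_size - 1L, ncol(A))
    # Add the partial product of one column block of A
    # with the matching row block of B.
    C <- C + A[, idx, drop = FALSE] %*% B[idx, , drop = FALSE]
  }
  C
}

# Compare against the ordinary in-memory product on small random matrices.
set.seed(1)
A <- matrix(rnorm(6 * 5), nrow = 6)
B <- matrix(rnorm(5 * 4), nrow = 5)
all.equal(block_matmul(A, B, block_size = 2L), A %*% B)
```

The result matrix `C` is still held in memory here; a fully out-of-core variant would also write `C` in blocks.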

Authors: Dolors Pelegri-Siso [aut, cre], Juan R. Gonzalez [aut]

BigDataStatMeth_2.0.2.tar.gz
BigDataStatMeth_2.0.2.tar.gz (r-4.7-arm64) | BigDataStatMeth_2.0.2.tar.gz (r-4.7-x86_64) | BigDataStatMeth_2.0.2.tar.gz (r-4.6-arm64) | BigDataStatMeth_2.0.2.tar.gz (r-4.6-x86_64)
manual.pdf | manual.html
card.svg | card.png
BigDataStatMeth/json (API)
NEWS

# Install 'BigDataStatMeth' in R:
install.packages('BigDataStatMeth', repos = c('https://cran.r-universe.dev', 'https://cloud.r-project.org'))
Uses libs:
  • openblas – Optimized BLAS
  • curl – Easy-to-use client-side URL transfer library
  • openssl – Secure Sockets Layer toolkit
  • c++ – GNU Standard C++ Library v3
  • openmp – GCC OpenMP (GOMP) support library

This package does not link to any GitHub/GitLab/R-Forge repository. No issue tracker or development information is available.

3.40 score | 9 scripts | 454 downloads | 79 exports | 10 dependencies

Last updated from: 573b90d392. Checks: 4 NOTE, 1 OK, 1 FAIL. Indexed: no.

Target | Result | Time
linux-devel-arm64 | NOTE | 1121
linux-devel-x86_64 | NOTE | 1428
source / vignettes | OK | 1954
linux-release-arm64 | NOTE | 1202
linux-release-x86_64 | NOTE | 1164
wasm-release | FAIL | 662

Exports: %*%, apply_function, bd_wproduct, bdapply_Function_hdf5, bdblockMult, bdblockSubstract, bdblockSum, bdCorr_matrix, bdCreate_hdf5_group, bdCreate_hdf5_matrix, bdCrossprod, bdgetDatasetsList_hdf5, bdImportData_hdf5, bdImportTextFile_hdf5, bdmove_hdf5_dataset, bdpseudoinv, bdpseudoinv_hdf5, bdReduce_hdf5_dataset, bdScalarwproduct, bdtCrossprod, bdWrite_hdf5_dimnames, can_allocate, colMaxs, colMeans, colMins, colnames.HDF5Matrix, colnames<-.HDF5Matrix, colSds, colSums, colVars, cor, crossprod, diag, diag_op, diag_scale, diag<-, eigen, filter_low_coverage, filter_maf, get_available_ram, get_cpu_cores, get_memory_thresholds, get_recommended_threads, get_total_ram, hdf5_apply, hdf5_close_all, hdf5_close_file, hdf5_create_matrix, hdf5_import, hdf5_import_multiple, hdf5_matrix, hdf5_reduce, hdf5matrix_options, impute_snps, is_open, list_datasets, memory_info, multiply_sparse, object_size, pseudoinverse, qr, reduce, rowMaxs, rowMeans, rowMins, rownames.HDF5Matrix, rownames<-.HDF5Matrix, rowSds, rowSums, rowVars, scale, sd, show_hdf5matrix_options, split_dataset, svd, sweep, system_info, tcrossprod, var

Dependencies: biocmake, bitops, data.table, dir.expiry, filelock, R6, Rcpp, RcppEigen, RCurl, Rhdf5lib

Working with HDF5-Backed Matrices in BigDataStatMeth

Rendered from BigDataStatMeth.Rmd using knitr::rmarkdown on Jun 08 2026.

Last update: 2026-06-08
Started: 2025-11-29

Readme and manuals

Help Manual

Help page | Topics
Subset an HDF5Matrix | [.HDF5Matrix
Subsetting assignment for HDF5Matrix objects | [<-.HDF5Matrix
Matrix multiplication for HDF5Matrix | %*%
Apply a statistical or algebraic function to HDF5 datasets (generic) | apply_function
Convert HDF5Matrix to data.frame | as.data.frame.HDF5Matrix
Convert HDF5Matrix to in-memory matrix | as.matrix.HDF5Matrix
Weighted matrix–vector products and cross-products | bd_wproduct
Apply function to different datasets inside a group | bdapply_Function_hdf5
Block-Based Matrix Multiplication | bdblockMult
Block-Based Matrix Subtraction | bdblockSubstract
Block-Based Matrix Addition | bdblockSum
Compute correlation matrix for in-memory matrices (unified function) | bdCorr_matrix
Create Group in an HDF5 File | bdCreate_hdf5_group
Create HDF5 data file and write data to it | bdCreate_hdf5_matrix
Efficient Matrix Cross-Product Computation | bdCrossprod
List Datasets in HDF5 Group | bdgetDatasetsList_hdf5
Import data from URL or file to HDF5 format | bdImportData_hdf5
Import Text File to HDF5 | bdImportTextFile_hdf5
Move HDF5 Dataset | bdmove_hdf5_dataset
Compute Matrix Pseudoinverse (In-Memory) | bdpseudoinv
Compute Matrix Pseudoinverse (HDF5-Stored) | bdpseudoinv_hdf5
Reduce Multiple HDF5 Datasets | bdReduce_hdf5_dataset
Matrix–scalar weighted product | bdScalarwproduct
Efficient Matrix Transposed Cross-Product Computation | bdtCrossprod
Write dimnames to an HDF5 dataset | bdWrite_hdf5_dimnames
BigDataStatMeth: Scalable statistical computing with R, C++, and HDF5 | BigDataStatMeth
Check if memory allocation is safe | can_allocate
Cancer classification | cancer
Column-bind HDF5Matrix objects | cbind.HDF5Matrix
Cholesky decomposition of a symmetric positive-definite HDF5Matrix | chol.HDF5Matrix
Close HDF5Matrix | close.HDF5Matrix
Dataset colesterol | colesterol
Column and row maximums for HDF5Matrix | colMaxs colMaxs.HDF5Matrix rowMaxs rowMaxs.HDF5Matrix
Column and row means for HDF5Matrix | colMeans colMeans.HDF5Matrix rowMeans rowMeans.HDF5Matrix
Column and row minimums for HDF5Matrix | colMins colMins.HDF5Matrix rowMins rowMins.HDF5Matrix
Column and row standard deviations for HDF5Matrix | colSds colSds.HDF5Matrix rowSds rowSds.HDF5Matrix
Column and row sums for HDF5Matrix | colSums colSums.HDF5Matrix rowSums rowSums.HDF5Matrix
Column and row variances for HDF5Matrix | colVars colVars.HDF5Matrix rowVars rowVars.HDF5Matrix
Correlation (generic) | cor
Correlation matrix for HDF5Matrix objects | cor.HDF5Matrix
Cross product of HDF5Matrix objects | crossprod crossprod.HDF5Matrix
Extract or construct a diagonal for HDF5Matrix | diag diag.default diag.HDF5Matrix
Diagonal-vector operation on an HDF5Matrix | diag_op diag_op.HDF5Matrix
Scalar diagonal operation on an HDF5Matrix | diag_scale diag_scale.HDF5Matrix
Set diagonal of an HDF5Matrix (generic) | diag<-
Dimensions of an HDF5Matrix | dim.HDF5Matrix
Get dimension names of an HDF5Matrix | colnames.HDF5Matrix colnames<-.HDF5Matrix dimnames.HDF5Matrix rownames.HDF5Matrix rownames<-.HDF5Matrix
Set dimension names on an HDF5Matrix | dimnames<-.HDF5Matrix
Spectral decomposition | eigen eigen.default eigen.HDF5Matrix
Remove high-missingness features from an HDF5Matrix | filter_low_coverage filter_low_coverage.HDF5Matrix
Remove SNPs by Minor Allele Frequency from an HDF5Matrix | filter_maf filter_maf.HDF5Matrix
Get available (free) system RAM | get_available_ram
Get number of CPU cores | get_cpu_cores
Get dynamic memory thresholds based on system RAM | get_memory_thresholds
Get recommended number of threads for parallel operations | get_recommended_threads
Get total system RAM | get_total_ram
Apply a mathematical operation to multiple HDF5 datasets | hdf5_apply
Close all HDF5Matrix objects | hdf5_close_all
Close all HDF5 handles for a specific file | hdf5_close_file
Create an HDF5 dataset and return an HDF5Matrix object | hdf5_create_matrix
Import data from file or URL into HDF5 format | hdf5_import
Import multiple files into HDF5 | hdf5_import_multiple
Open an HDF5 dataset as an HDF5Matrix object | hdf5_matrix
Reduce all datasets in an HDF5 group by a binary operation | hdf5_reduce
Set or get HDF5Matrix computation options | hdf5matrix_options
S3 methods for HDF5Matrix | HDF5Matrix-S3
Summary statistics for HDF5Matrix | HDF5Matrix-scalar-aggregations mean.HDF5Matrix Summary.HDF5Matrix
Impute missing SNP values in an HDF5Matrix | impute_snps impute_snps.HDF5Matrix
Check if HDF5Matrix is open | is_open
Length of an HDF5Matrix | length.HDF5Matrix
List datasets in an HDF5 file or group | list_datasets
Print system memory information | memory_info
miRNA | miRNA
Sparse-aware matrix multiplication (generic) | multiply_sparse
Sparse-aware matrix multiplication for HDF5Matrix | multiply_sparse.HDF5Matrix
Get memory size of HDF5Matrix without loading | object_size
Elementwise arithmetic operators for HDF5Matrix objects | Ops.HDF5Matrix
Principal Component Analysis of an HDF5Matrix | prcomp.HDF5Matrix
Print an HDF5Matrix object | print.HDF5Matrix
Print method for HDF5PCA objects | print.HDF5PCA
Moore-Penrose pseudoinverse | pseudoinverse pseudoinverse.default pseudoinverse.HDF5Matrix
QR decomposition of an HDF5Matrix | qr
QR decomposition of an HDF5Matrix | qr.HDF5Matrix
Row-bind HDF5Matrix objects | rbind.HDF5Matrix
Reduce a group of HDF5 datasets by accumulation (generic) | reduce
Scale / normalize an HDF5Matrix | scale scale.HDF5Matrix
Standard deviation of all elements of an HDF5Matrix | sd sd.HDF5Matrix
Show current HDF5Matrix performance settings | show_hdf5matrix_options
Matrix inverse of a symmetric positive-definite HDF5Matrix via Cholesky | solve.HDF5Matrix
Split an HDF5Matrix into multiple block datasets | split_dataset split_dataset.HDF5Matrix
Split an HDF5Matrix into a list of blocks | split.HDF5Matrix
Structure of an HDF5Matrix object | str.HDF5Matrix
Singular Value Decomposition (generic) | svd
Singular Value Decomposition of an HDF5Matrix | svd.HDF5Matrix
Sweep out array summaries (generic) | sweep
Broadcast a vector over an HDF5Matrix (sweep) | sweep.HDF5Matrix
Get system information summary | system_info
Transposed cross product of HDF5Matrix objects | tcrossprod tcrossprod.HDF5Matrix
Variance of all elements of an HDF5Matrix | var var.HDF5Matrix