Package: dataprep 0.1.5

Chun-Sheng Liang

dataprep: Efficient and Flexible Data Preprocessing Tools

Efficiently and flexibly preprocess data using a set of filtering, deletion, and interpolation tools. The methods rest on the principles of completeness, accuracy, thresholding, and linear interpolation, and are implemented through constraint conditions, time completion and recovery, and fast, efficient calculation and grouping. The key preprocessing steps are deletion of variables and observations, outlier removal, and interpolation of missing values (NA); how each is applied depends on how incomplete and how dispersed the raw data are. Compared with ordinary methods, these steps clean data more accurately, keep more samples, and add no outliers after interpolation. Consecutive NA values are identified automatically via run-length based grouping, which is used in observation deletion, outlier removal, and NA interpolation, so interpolation does not generate new outliers. A conditional extremum is proposed to realize point-by-point weighted outlier removal, which keeps non-outliers from being removed. In addition, time series interpolation is restricted to short periods that have values to refer to, which further ensures reliable interpolation. These methods are based on, and improve upon, the reference: Liang, C.-S., Wu, H., Li, H.-Y., Zhang, Q., Li, Z. & He, K.-B. (2020) <doi:10.1016/j.scitotenv.2020.140923>.
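The run-length grouping and short-gap linear interpolation described above can be illustrated with a minimal base-R sketch. This is not the package's implementation and does not use any dataprep function: the helper name interpolate_short_gaps and its max_gap parameter are hypothetical, chosen only for illustration. It assumes a regularly spaced numeric series, groups consecutive NA values with rle(), and fills a gap by linear interpolation only when the gap is short and has an observed value on both sides.

```r
# Illustrative base-R sketch; NOT the dataprep implementation.
# interpolate_short_gaps() and max_gap are hypothetical names.
interpolate_short_gaps <- function(x, max_gap = 3) {
  n <- length(x)
  known <- which(!is.na(x))
  if (length(known) < 2) return(x)  # nothing to interpolate between

  # Group consecutive NA values via run-length encoding of the NA mask.
  na_runs <- rle(is.na(x))
  run_end <- cumsum(na_runs$lengths)
  run_start <- run_end - na_runs$lengths + 1

  # Linear interpolation at every position; positions outside the
  # range of known points stay NA (approx() default, rule = 1).
  filled <- approx(known, x[known], xout = seq_len(n))$y

  out <- x
  for (i in seq_along(na_runs$values)) {
    is_gap <- na_runs$values[i]
    interior <- run_start[i] > 1 && run_end[i] < n
    # Fill only short gaps that have a reference value on both sides.
    if (is_gap && interior && na_runs$lengths[i] <= max_gap) {
      idx <- run_start[i]:run_end[i]
      out[idx] <- filled[idx]
    }
  }
  out
}

x <- c(1, NA, 3, NA, NA, NA, NA, 8, NA)
interpolate_short_gaps(x, max_gap = 2)
# 1 2 3 NA NA NA NA 8 NA
```

In this sketch the single missing value between 1 and 3 is filled, while the run of four NA values and the trailing NA are left untouched. Because a linearly interpolated value always lies between its two neighbouring observations, filling gaps this way cannot introduce a value more extreme than the data already present.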

Authors: Chun-Sheng Liang, Hao Wu, Hai-Yan Li, Qiang Zhang, Zhanqing Li, Ke-Bin He, Lanzhou University, Tsinghua University

Source: dataprep_0.1.5.tar.gz
Binaries: dataprep_0.1.5.tar.gz (r-4.5-noble), dataprep_0.1.5.tar.gz (r-4.4-noble), dataprep_0.1.5.tgz (r-4.4-emscripten), dataprep_0.1.5.tgz (r-4.3-emscripten)
Manual: dataprep.pdf | dataprep.html
API: dataprep/json

# Install 'dataprep' in R:
install.packages('dataprep', repos = c('https://cran.r-universe.dev', 'https://cloud.r-project.org'))

This package does not link to any GitHub/GitLab/R-Forge repository. No issue tracker or development information is available.

2.00 score | 2 scripts | 310 downloads | 15 exports | 42 dependencies

Last updated 3 years ago from: 0ac3784947. Checks: OK: 2. Indexed: yes.

Target | Result | Date
Doc / Vignettes | OK | Nov 17 2024
R-4.5-linux | OK | Nov 17 2024

Exports: condextr, data, data1, dataprep, descdata, descplot, melt, obsedele, optisolu, percdata, percoutl, percplot, shorvalu, varidele, zerona

Dependencies: cli, codetools, colorspace, data.table, doParallel, dplyr, fansi, farver, foreach, generics, ggplot2, glue, gtable, isoband, iterators, labeling, lattice, lifecycle, magrittr, MASS, Matrix, mgcv, munsell, nlme, pillar, pkgconfig, plyr, R6, RColorBrewer, Rcpp, reshape2, rlang, scales, stringi, stringr, tibble, tidyselect, utf8, vctrs, viridisLite, withr, zoo

dataprep: data preprocessing and plots

Rendered from vignettes.Rmd using knitr::rmarkdown on Nov 17 2024.

Last update: 2021-01-11
Started: 2021-01-11