Package: datawizard 0.13.0
datawizard: Easy Data Wrangling and Statistical Transformations
A lightweight package to assist in key steps involved in any data analysis workflow: (1) wrangling the raw data to get it in the needed form, (2) applying preprocessing steps and statistical transformations, and (3) computing statistical summaries of data properties and distributions. It is also the data wrangling backend for packages in the 'easystats' ecosystem. References: Patil et al. (2022) <doi:10.21105/joss.04684>.
Authors:
datawizard_0.13.0.tar.gz
datawizard_0.13.0.tar.gz (r-4.5-noble), datawizard_0.13.0.tar.gz (r-4.4-noble)
datawizard_0.13.0.tgz (r-4.4-emscripten), datawizard_0.13.0.tgz (r-4.3-emscripten)
datawizard.pdf | datawizard.html
datawizard/json (API)
NEWS
# Install 'datawizard' in R:
install.packages('datawizard', repos = c('https://cran.r-universe.dev', 'https://cloud.r-project.org'))
Bug tracker: https://github.com/easystats/datawizard/issues
Pkgdown: https://easystats.github.io
- efc - Sample dataset from the EFC Survey
- nhanes_sample - Sample dataset from the National Health and Nutrition Examination Survey
Last updated 3 months ago from: b22acf6313. Checks: OK: 2. Indexed: no.
Target | Result | Date |
---|---|---|
Doc / Vignettes | OK | Dec 05 2024 |
R-4.5-linux | OK | Dec 05 2024 |
Exports: adjust, assign_labels, categorize, center, centre, change_scale, coef_var, coerce_to_numeric, colnames_to_row, column_as_rownames, contr.deviation, convert_na_to, convert_to_na, data_addprefix, data_addsuffix, data_adjust, data_arrange, data_codebook, data_duplicated, data_extract, data_filter, data_group, data_join, data_match, data_merge, data_modify, data_partition, data_peek, data_read, data_relocate, data_remove, data_rename, data_rename_rows, data_reorder, data_replicate, data_restoretype, data_rotate, data_seek, data_select, data_separate, data_summary, data_tabulate, data_to_long, data_to_wide, data_transpose, data_ungroup, data_unique, data_unite, data_write, degroup, demean, describe_distribution, detrend, distribution_coef_var, distribution_mode, empty_columns, empty_rows, extract_column_names, find_columns, kurtosis, labels_to_levels, mean_sd, means_by_group, median_mad, normalize, print_html, print_md, ranktransform, recode_into, recode_values, remove_empty, remove_empty_columns, remove_empty_rows, replace_nan_inf, rescale, rescale_weights, reshape_ci, reshape_longer, reshape_wider, reverse, reverse_scale, row_means, row_to_colnames, rowid_as_column, rownames_as_column, skewness, slide, smoothness, standardise, standardize, text_concatenate, text_format, text_fullstop, text_lastchar, text_paste, text_remove, text_wrap, to_factor, to_numeric, unnormalize, unstandardise, unstandardize, visualisation_recipe, weighted_mad, weighted_mean, weighted_median, weighted_sd, winsorize
Dependencies: insight
Readme and manuals
Help Manual
Help page | Topics |
---|---|
Adjust data for the effect of other variable(s) | adjust data_adjust |
Assign variable and value labels | assign_labels assign_labels.data.frame assign_labels.numeric |
Recode (or "cut" / "bin") data into groups of values. | categorize categorize.data.frame categorize.numeric |
Centering (Grand-Mean Centering) | center center.data.frame center.numeric centre |
Compute the coefficient of variation | coef_var coef_var.numeric distribution_coef_var distribution_cv |
Convert to Numeric (if possible) | coerce_to_numeric |
Deviation Contrast Matrix | contr.deviation |
Replace missing values in a variable or a data frame. | convert_na_to convert_na_to.character convert_na_to.data.frame convert_na_to.numeric |
Convert non-missing values in a variable into missing values. | convert_to_na convert_to_na.data.frame convert_to_na.factor convert_to_na.numeric |
Rename columns and variable names | data_addprefix data_addsuffix data_rename data_rename_rows |
Arrange rows by column values | data_arrange |
Generate a codebook of a data frame. | data_codebook print_html.data_codebook |
Extract all duplicates | data_duplicated |
Extract one or more columns or elements from an object | data_extract data_extract.data.frame |
Create a grouped data frame | data_group data_ungroup |
Return filtered or sliced data frame, or row indices | data_filter data_match |
Merge (join) two data frames, or a list of data frames | data_join data_merge data_merge.data.frame data_merge.list |
Create new variables in a data frame | data_modify data_modify.data.frame |
Partition data | data_partition |
Peek at values and type of variables in a data frame | data_peek data_peek.data.frame |
Read (import) data files from various sources | data_read data_write |
Relocate (reorder) columns of a data frame | data_relocate data_remove data_reorder |
Expand (i.e. replicate rows) a data frame | data_replicate |
Restore the type of columns according to a reference data frame | data_restoretype |
Rotate a data frame | data_rotate data_transpose |
Find variables by their names, variable or value labels | data_seek |
Find or get columns in a data frame based on search patterns | data_select extract_column_names find_columns |
Separate single variable into multiple variables | data_separate |
Summarize data | data_summary data_summary.data.frame |
Create frequency and crosstables of variables | as.data.frame.datawizard_tables data_tabulate data_tabulate.data.frame data_tabulate.default |
Reshape (pivot) data from wide to long | data_to_long reshape_longer |
Reshape (pivot) data from long to wide | data_to_wide reshape_wider |
Keep only one row from all with duplicated IDs | data_unique |
Unite ("merge") multiple variables | data_unite |
Compute group-meaned and de-meaned variables | degroup demean detrend |
Describe a distribution | describe_distribution describe_distribution.data.frame describe_distribution.factor describe_distribution.numeric |
Compute mode for a statistical distribution | distribution_mode |
Sample dataset from the EFC Survey | efc |
Convert value labels into factor levels | labels_to_levels labels_to_levels.data.frame labels_to_levels.factor |
Utility Function for Safe Prediction with 'datawizard' transformers | makepredictcall.dw_transformer |
Summary Helpers | mean_sd median_mad |
Summary of mean values by group | means_by_group means_by_group.data.frame means_by_group.numeric |
Sample dataset from the National Health and Nutrition Examination Survey | nhanes_sample |
Normalize numeric variable to 0-1 range | normalize normalize.data.frame normalize.numeric unnormalize unnormalize.data.frame unnormalize.grouped_df unnormalize.numeric |
(Signed) rank transformation | ranktransform ranktransform.data.frame ranktransform.numeric |
Recode values from one or more variables into a new variable | recode_into |
Recode old values of variables into new values | recode_values recode_values.data.frame recode_values.numeric |
Return or remove variables or observations that are completely missing | empty_columns empty_rows remove_empty remove_empty_columns remove_empty_rows |
Convert infinite or 'NaN' values into 'NA' | replace_nan_inf |
Rescale Variables to a New Range | change_scale rescale rescale.data.frame rescale.numeric |
Rescale design weights for multilevel analysis | rescale_weights |
Reshape CI between wide/long formats | reshape_ci |
Reverse-Score Variables | reverse reverse.data.frame reverse.numeric reverse_scale |
Row means (optionally with minimum amount of valid values) | row_means |
Tools for working with column names | colnames_to_row row_to_colnames |
Tools for working with row names or row ids | column_as_rownames rowid_as_column rownames_as_column |
Compute Skewness and (Excess) Kurtosis | kurtosis kurtosis.numeric print.parameters_kurtosis print.parameters_skewness skewness skewness.numeric summary.parameters_kurtosis summary.parameters_skewness |
Shift numeric value range | slide slide.data.frame slide.numeric |
Quantify the smoothness of a vector | smoothness |
Standardization (Z-scoring) | standardise standardize standardize.data.frame standardize.factor standardize.numeric unstandardise unstandardize unstandardize.data.frame unstandardize.numeric |
Re-fit a model with standardized data | standardize.default standardize_models |
Convenient text formatting functionalities | text_concatenate text_format text_fullstop text_lastchar text_paste text_remove text_wrap |
Convert data to factors | to_factor to_factor.data.frame to_factor.numeric |
Convert data to numeric | to_numeric to_numeric.data.frame |
Prepare objects for visualisation | visualisation_recipe |
Weighted Mean, Median, SD, and MAD | weighted_mad weighted_mean weighted_median weighted_sd |
Winsorize data | winsorize winsorize.numeric |
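
The help topics above cover wrangling, transformation, and summary functions. As a rough illustration of how a few of the transformers listed in the table might be called, here is a minimal sketch in R. It assumes the package is installed and relies on default arguments only; the vectors `x` and `y` are made-up example data, not part of the package.

```r
# Illustrative sketch only: assumes 'datawizard' is installed and uses
# default arguments throughout. `x` and `y` are made-up example data.
library(datawizard)

x <- c(1, 2, 3, 4, 5)
y <- c(0, 5, 10)

# Grand-mean centering: subtract the mean of x from every value.
centered <- center(x)

# Z-scoring: subtract the mean, then divide by the standard deviation.
z <- standardize(x)

# Normalizing: map the observed range of y onto the 0-1 interval.
scaled <- normalize(y)

# The transformers attach attributes to their results;
# as.numeric() returns the bare values.
as.numeric(centered)
as.numeric(z)
as.numeric(scaled)

# Summary statistics for a single numeric vector.
describe_distribution(x)
```

Each help page in the table documents further arguments (for example, selecting columns when the input is a data frame), so the manual remains the authoritative description of behaviour.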