Package: MantaID 1.0.4
Zhengpeng Zeng
MantaID: A Machine-Learning Based Tool to Automate the Identification of Biological Database IDs
The number of biological databases is growing rapidly, but different databases use different IDs to refer to the same biological entity. This inconsistency in IDs impedes the integration of various types of biological data. To resolve the problem, we developed 'MantaID', a data-driven, machine-learning based approach that automates the identification of IDs on a large scale. The 'MantaID' model's prediction accuracy was shown to be 99%, and it correctly and efficiently predicted 100,000 ID entries within two minutes. 'MantaID' supports the discovery and exploitation of ID patterns from a large number of databases (e.g., up to 542 biological databases). An easy-to-use, freely available, open-source R package, a user-friendly web application, and an API were also developed for 'MantaID' to improve applicability. To our knowledge, 'MantaID' is the first tool that enables automatic, quick, accurate, and comprehensive identification of large quantities of IDs, and it can therefore be used as a starting point to facilitate the complex assimilation and aggregation of biological data across diverse databases.
MantaID_1.0.4.tar.gz
MantaID_1.0.4.tar.gz (r-4.5-noble)
MantaID_1.0.4.tar.gz (r-4.4-noble)
MantaID_1.0.4.tgz (r-4.4-emscripten)
MantaID.pdf | MantaID.html
MantaID/json (API)
# Install 'MantaID' in R:
install.packages('MantaID', repos = c('https://cran.r-universe.dev', 'https://cloud.r-project.org'))
Bug tracker: https://github.com/molaison/mantaid/issues
Pkgdown site: https://molaison.github.io
- Example - ID example dataset.
- mi_data_attributes - ID-related datasets in biomart.
- mi_data_procID - Processed ID data.
- mi_data_rawID - ID dataset for testing.
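
The datasets listed above can be inspected from an R session once the package is installed. The following is a minimal sketch using only base R functions (`requireNamespace`, `data`, `str`); it assumes the dataset names are exactly as listed on this page and that they can be loaded through `data()`. It makes no assumption about their structure and simply prints whatever is loaded.

```r
# Dataset names as listed on this page (assumed to be loadable via data()).
dataset_names <- c("Example", "mi_data_attributes", "mi_data_procID", "mi_data_rawID")

if (requireNamespace("MantaID", quietly = TRUE)) {
  # Load the datasets into a private environment instead of the global workspace.
  env <- new.env()
  data(list = dataset_names, package = "MantaID", envir = env)

  # Print a compact structural summary of every object that was loaded.
  for (name in ls(env)) {
    cat("\n==", name, "==\n")
    str(get(name, envir = env), max.level = 1)
  }
} else {
  message("MantaID is not installed; run the install command above first.")
}
```

Loading into a separate environment keeps the workspace clean and makes it easy to see which of the listed names were actually found.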
Last updated 4 months ago from: 41b745f561. Checks: 2 OK. Indexed: no.
Target | Result | Latest binary |
---|---|---|
Doc / Vignettes | OK | Jan 08 2025 |
R-4.5-linux | OK | Jan 08 2025 |
Exports: mi, mi_balance_data, mi_clean_data, mi_filter_feat, mi_get_confusion, mi_get_ID, mi_get_ID_attr, mi_get_importance, mi_get_miss, mi_get_padlen, mi_plot_cor, mi_plot_heatmap, mi_predict_new, mi_run_bmr, mi_split_col, mi_split_str, mi_to_numer, mi_train_BP, mi_train_rg, mi_train_rp, mi_train_xgb, mi_tune_rg, mi_tune_rp, mi_tune_xgb, mi_unify_mod
Dependencies: AnnotationDbi, askpass, backports, base64enc, bbotk, Biobase, BiocFileCache, BiocGenerics, biomaRt, Biostrings, bit, bit64, blob, cachem, caret, checkmate, class, cli, clock, codetools, colorspace, config, cpp11, crayon, curl, data.table, DBI, dbplyr, dbscan, diagram, digest, dplyr, e1071, evaluate, fansi, farver, fastmap, filelock, FNN, foreach, future, future.apply, generics, GenomeInfoDb, GenomeInfoDbData, ggcorrplot, ggplot2, globals, glue, gower, gtable, hardhat, here, hms, httr, httr2, igraph, ipred, IRanges, isoband, iterators, jsonlite, KEGGREST, keras, KernSmooth, labeling, lattice, lava, lgr, lifecycle, listenv, lubridate, magrittr, MASS, Matrix, mclust, memoise, mgcv, mime, mlbench, mlr3, mlr3measures, mlr3misc, mlr3tuning, ModelMetrics, munsell, nlme, nnet, numDeriv, openssl, palmerpenguins, paradox, parallelly, pillar, pkgconfig, plogr, plyr, png, prettyunits, pROC, processx, prodlim, progress, progressr, proxy, PRROC, ps, purrr, R6, rappdirs, RColorBrewer, Rcpp, RcppTOML, recipes, reshape2, reticulate, rlang, rpart, rprojroot, RSQLite, rstudioapi, S4Vectors, scales, scutr, shape, smotefamily, SQUAREM, stringi, stringr, survival, sys, tensorflow, tfautograph, tfruns, tibble, tidyr, tidyselect, timechange, timeDate, tzdb, UCSC.utils, utf8, uuid, vctrs, viridisLite, whisker, withr, xml2, XVector, yaml, zeallot