Package: podcleaner 0.1.2

Olivier Bautheac

podcleaner: Legacy Scottish Post Office Directories Cleaner

Attempts to clean optical character recognition (OCR) errors in legacy Scottish Post Office Directories. Further attempts to match records from trades and general directories.
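
As a rough illustration of the two steps described above (cleaning OCR-style noise, then matching general to trades records), here is a minimal base R sketch. It does not use podcleaner itself: the toy data, the column names, the `standardise_mac()` helper, and the `max_dist` threshold are all assumptions made for this example, and `utils::adist()` stands in for whatever fuzzy-matching routine the package actually applies.

```r
# Toy records; the values and column names are illustrative assumptions,
# not the layout podcleaner expects.
general <- data.frame(
  surname = c("Mc Donald", "Abercromby", "Smith"),
  address = c("12 High st.", "5 Main st.", "8 Mill rd."),
  stringsAsFactors = FALSE
)
trades <- data.frame(
  surname = c("Macdonald", "Abercrombie"),
  occupation = c("Baker", "Grocer"),
  stringsAsFactors = FALSE
)

# Step 1 (cleaning): rewrite "Mc"/"M'" prefixes as "Mac" and lower-case the
# letter that follows. Hypothetical helper, not a podcleaner function.
standardise_mac <- function(x) {
  gsub("^(Mc|M')\\s*([A-Za-z])", "Mac\\L\\2", x, perl = TRUE)
}
clean_general <- standardise_mac(general$surname)

# Step 2 (matching): pick, for each general record, the trades record with
# the smallest edit distance, and keep it only if it is close enough.
d <- utils::adist(tolower(clean_general), tolower(trades$surname))
best <- apply(d, 1, which.min)
best_dist <- d[cbind(seq_along(best), best)]
max_dist <- 2  # assumed tolerance for this sketch
matched <- ifelse(best_dist <= max_dist, trades$occupation[best], NA_character_)

print(data.frame(general = clean_general, occupation = matched))
```

A fixed edit-distance threshold is the simplest possible design choice; on real directories it would need tuning, since a small threshold misses heavily garbled names and a large one produces false matches.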

Authors: Olivier Bautheac [aut, cre], University of Strathclyde [cph, fnd]

podcleaner_0.1.2.tar.gz
podcleaner_0.1.2.tar.gz (r-4.5-noble), podcleaner_0.1.2.tar.gz (r-4.4-noble)
podcleaner_0.1.2.tgz (r-4.4-emscripten), podcleaner_0.1.2.tgz (r-4.3-emscripten)
podcleaner.pdf | podcleaner.html
podcleaner/json (API)
NEWS

# Install 'podcleaner' in R:
install.packages('podcleaner', repos = c('https://cran.r-universe.dev', 'https://cloud.r-project.org'))
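
After installing, a quick check along the following lines can confirm that the package is available and show what it exports. This is only a sketch using base R functions; it makes no assumption about podcleaner beyond its name.

```r
# Check whether the package can be loaded without attaching it.
installed <- requireNamespace("podcleaner", quietly = TRUE)

if (installed) {
  # Report the installed version and the exported function names.
  print(utils::packageVersion("podcleaner"))
  print(sort(getNamespaceExports("podcleaner")))
} else {
  message("podcleaner is not installed; run the install.packages() call above.")
}
```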

This package does not link to any GitHub/GitLab/R-Forge repository. No issue tracker or development information is available.

1.70 score 4 scripts 128 downloads 7 exports 37 dependencies

Last updated 3 years ago from: 4f05b165f1. Checks: OK: 2. Indexed: yes.

Target | Result | Date
Doc / Vignettes | OK | Nov 07 2024
R-4.5-linux | OK | Nov 07 2024

Exports: combine_match_general_to_trades, general_clean_directory, trades_clean_directory, utils_IO_write, utils_load_directories_csv, utils_make_file, utils_make_path

Dependencies: bit, bit64, cli, clipr, cpp11, crayon, dplyr, fansi, fuzzyjoin, generics, geosphere, glue, hms, lattice, lifecycle, magrittr, pillar, pkgconfig, prettyunits, progress, purrr, R6, Rcpp, readr, rlang, sp, stringdist, stringi, stringr, tibble, tidyr, tidyselect, tzdb, utf8, vctrs, vroom, withr

Readme and manuals

Help Manual

Help page | Topics
Clean attached words in address entry(/ies) | clean_address_attached_words
Clean address entry(/ies) body | clean_address_body
Clean ends in address entry(/ies) | clean_address_ends
Standardise "Mac" prefix in address entry(/ies) | clean_address_mac
Clean place name(s) in address entry(/ies) | clean_address_names
Clean address entry numbers | clean_address_number
Miscellaneous cleaning operations in address entry(/ies) | clean_address_others
Clean places in address entry(/ies) | clean_address_places
Standardise possessives in address entry(/ies) | clean_address_possessives
Post-cleaning operation for address entry(/ies) | clean_address_post_clean
Pre-cleaning operation for address entry(/ies) | clean_address_pre_clean
Clean "Saint" prefix in address entry(/ies) | clean_address_saints
Clean unwanted suffixes in address entry(/ies) | clean_address_suffixes
Clean worksites in address entry(/ies) | clean_address_worksites
Clean entry(/ies) forename | clean_forename
Standardise punctuation in forename(s) | clean_forename_punctuation
Separate double-barrelled forename(s) | clean_forename_separate_words
Clean forename(s) spelling | clean_forename_spelling
Standardise "Mac" prefix in people's name | clean_mac
Clean ends in entry(/ies) names | clean_name_ends
Clean entry(/ies) occupation | clean_occupation
Clean entry(/ies) of in brackets information | clean_parentheses
Clean entry(/ies) special characters | clean_specials
Clean string ends | clean_string_ends
Clean entry(/ies) surname | clean_surname
Standardise punctuation in surname(s) | clean_surname_punctuation
Clean surname(s) spelling | clean_surname_spelling
Clean entry(/ies) name title | clean_title
Get house address column type | combine_get_address_house_type
Check for failed matches | combine_has_match_failed
Label failed matches | combine_label_failed_matches
Label failed matches | combine_label_if_match_failed
Mutate operation(s) in directory data.frame trade address column | combine_make_match_string
Match general to trades directory records | combine_match_general_to_trades
Match general to trades directory records | combine_match_general_to_trades_plain
Match general to trades directory records | combine_match_general_to_trades_progress
Mutate operation(s) in directory data.frame address.trade column. | combine_no_trade_address_to_random_string
Conditionally return a random string | combine_random_string_if_no_address
Conditionally return a random string | combine_random_string_if_pattern
Mutate operation(s) in Scottish post office general directory data.frame column(s) | general_clean_directory
Mutate operation(s) in Scottish post office general directory data.frame column(s) | general_clean_directory_plain
Mutate operation(s) in Scottish post office general directory data.frame column(s) | general_clean_directory_progress
Mutate operation(s) in Scottish post office general directory data.frame column(s) | general_clean_entries
Mutate operation(s) in Scottish post office general directory data.frame column(s) | general_fix_structure
Mutate operation(s) in Scottish post office general directory data.frame column(s) | general_move_house_to_address
Mutate operation(s) in Scottish post office general directory data.frame column(s) | general_repatriate_occupation_from_address
Mutate operation(s) in Scottish post office general directory data.frame column(s) | general_split_address_numbers_bodies
Mutate operation(s) in Scottish post office general directory data.frame column(s) | general_split_trade_addresses
Mutate operation(s) in Scottish post office general directory data.frame column(s) | general_split_trade_house_addresses
Place names in address entries | globals_address_names
Ampersand in directory entries | globals_ampersand
Ampersand in directory entries | globals_ampersand_vector
Ampersand in directory entries | globals_and_double_quote
Ampersand in directory entries | globals_and_single_quote
Forenames in directory records | globals_forenames
General directory column names | globals_general_colnames
"Mac" pre-fixes in name entries | globals_macs
Numbers in address entries | globals_numbers
Occupations in directory records | globals_occupations
Place types in address entries | globals_places_raw
Place types in address entries | globals_places_regex
Regular expression for mutate operations in directory datasets | globals_regex_address_house_body_number
Regular expression for mutate operations in directory datasets | globals_regex_address_prefix
Regular expression for mutate operations in directory datasets | globals_regex_and_filter
Regular expression for mutate operations in directory datasets | globals_regex_and_match
Regular expression for mutate operations in directory datasets | globals_regex_get_address_house_type
Regular expression for mutate operations in directory datasets | globals_regex_house_split_trade
Regular expression for mutate operations in directory datasets | globals_regex_house_to_address
Regular expression for mutate operations in directory datasets | globals_regex_irrelevants
Regular expression for mutate operations in directory datasets | globals_regex_occupation_from_address
Regular expression for mutate operations in directory datasets | globals_regex_split_address_body
Regular expression for mutate operations in directory datasets | globals_regex_split_address_empty
Regular expression for mutate operations in directory datasets | globals_regex_split_address_numbers
Regular expression for mutate operations in directory datasets | globals_regex_split_trade_addresses
Regular expression for mutate operations in directory datasets | globals_regex_titles
Saints in address names | globals_saints
Address suffixes | globals_suffixes
Surnames in directory records | globals_surnames
Titles in directory name records | globals_titles
Trades directory column names | globals_trades_colnames
Combined directories column names | globals_union_colnames
Worksites in address entries | globals_worksites
Mutate operation(s) in Scottish post office trades directory data.frame column(s) | trades_clean_directory
Mutate operation(s) in Scottish post office trades directory data.frame column(s) | trades_clean_directory_plain
Mutate operation(s) in Scottish post office trades directory data.frame column(s) | trades_clean_directory_progress
Mutate operation(s) in Scottish post office trades directory data.frame column(s) | trades_clean_entries
Clean directory address entries | utils_clean_address
Clean address(es) body | utils_clean_address_body
Clean address entry ends | utils_clean_address_ends
Clean address(es) number | utils_clean_address_number
Clean directory addresses | utils_clean_addresses
Clean entry ends | utils_clean_ends
Clean entries name records | utils_clean_names
Clean entries occupation record | utils_clean_occupations
Clear string of matched content | utils_clear_content
Mutate operation(s) in directory dataframe column(s) | utils_clear_irrelevants
Execute function | utils_execute
Format raw directory for further processing | utils_format_directory_raw
Conditionally amend character string vector. | utils_gsub_if_found
Load object into memory | utils_IO_load
Make path for input/output operations | utils_IO_path
Write object to long term memory | utils_IO_write
Check is address entry not missing | utils_is_address_missing
Label addresses if missing | utils_label_address_if_missing
Label empty addresses as missing | utils_label_missing_addresses
Load directory "csv" file(s) into memory | utils_load_directories_csv
Make file name | utils_make_file
Make destination path | utils_make_path
Mutate operation(s) in dataframe column(s) | utils_mutate_across
Mute a function call execution | utils_mute
Conditionally amend character string vector. | utils_paste_if_found
Conditionally amend character string vector. | utils_regmatches_if_found
Conditionally amend character string vector. | utils_regmatches_if_not_empty
Clear undesired address prefixes | utils_remove_address_prefix
Split string into tibble | utils_split_and_name
Clear extra white spaces in dataframe | utils_squish_all_columns
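
Several of the help topics listed here describe conditionally amending a character vector. The idea can be sketched in base R as follows. The helper `gsub_if_found_sketch()` and its arguments are hypothetical, written for this illustration only; they are not the signature of `utils_gsub_if_found` or any other podcleaner function.

```r
# Hypothetical helper: apply a substitution only to the elements of `x`
# that match a separate filter pattern; leave all other elements untouched.
gsub_if_found_sketch <- function(x, filter, pattern, replacement,
                                 ignore_case = TRUE) {
  hit <- grepl(filter, x, ignore.case = ignore_case, perl = TRUE)
  x[hit] <- gsub(pattern, replacement, x[hit],
                 ignore.case = ignore_case, perl = TRUE)
  x
}

# Illustrative addresses (made up for this example).
addresses <- c("12 High st.", "St. Andrews sq.", "5 Mill rd.")

# Expand a trailing "st." to "Street", but only in entries that start
# with a house number.
amended <- gsub_if_found_sketch(
  addresses,
  filter = "^\\d+",
  pattern = "\\bst\\.$",
  replacement = "Street"
)
print(amended)
```

Separating the filter from the search pattern keeps a rule from firing on entries where the same abbreviation means something else, such as "St." for "Saint".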