Package 'rethnicity'

Title: Predicting Ethnic Group from Names
Description: Implementation of the race/ethnicity prediction method, described in "rethnicity: An R package for predicting ethnicity from names" by Fangzhou Xie (2022) <doi:10.1016/j.softx.2021.100965> and "Rethnicity: Predicting Ethnicity from Names" by Fangzhou Xie (2021) <doi:10.48550/arXiv.2109.09228>.
Authors: Fangzhou Xie [aut, cre]
Maintainer: Fangzhou Xie <[email protected]>
License: CC BY-NC-SA 4.0
Version: 0.2.6
Built: 2024-10-19 03:28:48 UTC
Source: CRAN

Predict ethnicity from names.

Description

Predict ethnicity from last names alone, or from both first and last names. This is the default and recommended function for prediction.

Usage

predict_ethnicity(
  firstnames = NULL,
  lastnames = NULL,
  method = "fullname",
  threads = 0,
  na.rm = FALSE
)

Arguments

firstnames

A character vector of first names. Defaults to NULL. Supply this only when 'method' = 'fullname'.

lastnames

A character vector of last names. Defaults to NULL. Required by both the 'fullname' and 'lastname' methods.

method

"fullname" or "lastname". Inference method to choose from.

threads

single integer. Number of threads to use for multi-threading.

na.rm

Logical (TRUE or FALSE). If TRUE, NAs are removed; if FALSE, an error is returned when any argument contains NA.
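The NA handling described for na.rm can be sketched as follows. This is an illustration only: it assumes the rethnicity package is installed, and the input vector is made up for the example. Filtering NAs yourself beforehand keeps the link between input rows and output rows explicit.

```r
# Illustrative input containing a missing value (made-up data).
last <- c("Turing", NA, "Lovelace")

# Option 1: drop NAs yourself and remember which positions were kept.
keep <- !is.na(last)
clean_last <- last[keep]

if (requireNamespace("rethnicity", quietly = TRUE)) {
  # Option 2: let the function drop NAs.
  res_removed <- rethnicity::predict_ethnicity(
    lastnames = last, method = "lastname", na.rm = TRUE
  )

  # With na.rm = FALSE the documentation says an error is returned,
  # so the call is wrapped in tryCatch() to capture the message.
  res_strict <- tryCatch(
    rethnicity::predict_ethnicity(
      lastnames = last, method = "lastname", na.rm = FALSE
    ),
    error = function(e) conditionMessage(e)
  )
}
```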

Value

A data.frame containing the predicted probability of each ethnic group and the predicted group (the one with the highest probability).

Examples

predict_ethnicity(firstnames = "Alan", lastnames = "Turing")
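
A slightly longer sketch, assuming the rethnicity package is installed, shows both methods applied to character vectors. The names are illustrative, and the result is inspected with str() rather than by assuming particular column names.

```r
# Illustrative inputs; the two vectors are paired by position.
first <- c("Alan", "Ada")
last  <- c("Turing", "Lovelace")

if (requireNamespace("rethnicity", quietly = TRUE)) {
  # Default method ("fullname"): supply both first and last names.
  full_pred <- rethnicity::predict_ethnicity(
    firstnames = first,
    lastnames  = last
  )

  # "lastname" method: supply only the last names.
  last_pred <- rethnicity::predict_ethnicity(
    lastnames = last,
    method    = "lastname"
  )

  # Inspect the returned data.frames.
  str(full_pred)
  str(last_pred)
}
```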

Predict ethnicity from full name

Description

Predicts ethnicity from first and last names, using a self-trained model with custom labels. This function is intended for advanced users who wish to use their own models. For most use cases, use predict_ethnicity() instead.

Usage

predict_fullname(
  firstnames,
  lastnames,
  na.rm = FALSE,
  threads = 0L,
  labels = NULL,
  model_path = NULL
)

Arguments

firstnames

character vector, first names

lastnames

character vector, last names

na.rm

bool, defaults to FALSE; whether to remove NA values in the lastnames

threads

int, number of threads for multi-threading

labels

character vector of labels for the classification model; must be in the same order as in the trained model

model_path

character file path, the path to the trained model in .json format (converted from Keras by frugally-deep)

Value

data.frame with predicted probability and predicted ethnicity
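
Examples

A minimal sketch of calling predict_fullname() with a custom model. The file name and labels below are hypothetical placeholders, not files or labels shipped with the package; the call runs only if the package is installed and the model file exists.

```r
# Hypothetical path to a model exported to .json by frugally-deep.
model_file <- "my_fullname_model.json"

# Hypothetical labels; they must follow the order used by the trained model.
my_labels <- c("group_a", "group_b", "group_c")

if (requireNamespace("rethnicity", quietly = TRUE) && file.exists(model_file)) {
  pred <- rethnicity::predict_fullname(
    firstnames = c("Alan", "Ada"),
    lastnames  = c("Turing", "Lovelace"),
    na.rm      = FALSE,
    threads    = 0L,
    labels     = my_labels,
    model_path = model_file
  )
  str(pred)
}
```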


Predict ethnicity from last name

Description

Predicts ethnicity from last names, using a self-trained model with custom labels. This function is intended for advanced users who wish to use their own models. For most use cases, use predict_ethnicity() instead.

Usage

predict_lastname(
  lastnames,
  na.rm = FALSE,
  threads = 0L,
  labels = NULL,
  model_path = NULL
)

Arguments

lastnames

character vector, last names

na.rm

bool, defaults to FALSE; whether to remove NA values in the lastnames

threads

int, number of threads for multi-threading

labels

character vector of labels for the classification model; must be in the same order as in the trained model

model_path

character file path, the path to the trained model in .json format (converted from Keras by frugally-deep)

Value

data.frame with predicted probability and predicted ethnicity
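
Examples

A minimal sketch of calling predict_lastname() with a custom model. As above, the file name and labels are hypothetical placeholders; the call runs only if the package is installed and the model file exists.

```r
# Hypothetical path to a last-name model exported to .json by frugally-deep.
lastname_model_file <- "my_lastname_model.json"

# Hypothetical labels; they must follow the order used by the trained model.
lastname_labels <- c("group_a", "group_b")

if (requireNamespace("rethnicity", quietly = TRUE) &&
    file.exists(lastname_model_file)) {
  lastname_pred <- rethnicity::predict_lastname(
    lastnames  = c("Turing", "Lovelace"),
    na.rm      = FALSE,
    threads    = 0L,
    labels     = lastname_labels,
    model_path = lastname_model_file
  )
  str(lastname_pred)
}
```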