Title: | Implementation of Case-Control Optimal Matching |
---|---|
Description: | Cases are matched to controls in an efficient, optimal and computationally flexible way. It uses the idea of sub-sampling at the level of the case, by creating pseudo-observations of controls. The user can choose between matching with and without replacement, the number of controls, and several covariates to match upon. See Mamouris (2021) <doi:10.1186/s12874-021-01256-3> for an overview. |
Authors: | Pavlos Mamouris [aut, cre], Vahid Nassiri [aut, ctb] |
Maintainer: | Pavlos Mamouris <[email protected]> |
License: | GPL-2 |
Version: | 0.1.0 |
Built: | 2024-11-20 06:40:26 UTC |
Source: | CRAN |
A dataset containing cases and controls using the Intego registry data. The variables are as follows:
data(being_processed)
A data frame with 77110 rows and 11 variables
cluster_case: each case forms a cluster with all possible controls to be matched
Patient_Id: Unique identifier for each patient
case_control: binary; "case" if the patient has colorectal cancer, otherwise "control"
case_ind: binary indicator; 1 if the patient is a case, otherwise a control
JCG: Year of Contact
entry_year: the year that the patient first entered the database
CI: Comorbidity Index. Count of chronic diseases before the index date
age_diff: difference of age between cases and controls
fup_diff: difference of follow-up between cases and controls
total_control_per_case: total controls that are available to be pooled per case
freq_of_controls: how many times the control is available to be matched for different cases
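The two count columns can be illustrated on a toy data frame. The sketch below uses invented rows (not Intego data) and base R only. It assumes that `total_control_per_case` counts the controls within each `cluster_case` and that `freq_of_controls` counts how many clusters a control's `Patient_Id` appears in, which is how the descriptions above read; the packaged dataset may have been built differently.

```r
# Toy data: two cases (clusters 1 and 2) with their candidate controls.
# All values are invented for illustration.
toy <- data.frame(
  cluster_case = c(1, 1, 1, 2, 2, 2, 2),
  Patient_Id   = c("A", "x", "y", "B", "x", "y", "z"),
  case_control = c("case", "control", "control",
                   "case", "control", "control", "control"),
  stringsAsFactors = FALSE
)

is_control <- as.integer(toy$case_control == "control")

# Controls available in each cluster, repeated on every row of that cluster.
toy$total_control_per_case <- ave(is_control, toy$cluster_case, FUN = sum)

# Number of clusters in which each patient appears as a candidate control
# (cases get 0 in this toy version; that is an assumption of the sketch).
toy$freq_of_controls <- ave(is_control, toy$Patient_Id, FUN = sum)

print(toy)
```

Controls "x" and "y" are candidates for both cases, so their frequency is higher than that of "z", which is available to one case only.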
Fast and optimal matching for cases and controls
Maintainer: Pavlos Mamouris [email protected]
Authors:
Vahid Nassiri [email protected] [contributor]
A dataset containing cases and controls using the Intego registry data. This is the raw data, not the final processed dataset. The variables are as follows:
data(not_processed)
A data frame with 656506 rows and 9 variables
Patient_Id: Unique identifier for each patient
JCG: Year of Contact
Birth_Year: Patient's year of birth
Gender: Patient's Gender
Practice_Id: Patient's general practice
case_control: binary; "case" if the patient has colorectal cancer, otherwise "control"
entry_year: the year that the patient first entered the database
fup_diff: difference of follow-up between cases and controls
CI: Comorbidity Index. Count of chronic diseases before the index date
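One way raw rows of this shape could be turned into per-case pools of candidate controls is sketched below. This is an illustration under stated assumptions, not the package authors' preprocessing: the rows are invented, the choice to pair on `Gender` and `Practice_Id` is hypothetical, and the difference in `Birth_Year` is used only as a stand-in for an age difference.

```r
# Invented rows with a subset of the not_processed columns.
raw <- data.frame(
  Patient_Id   = c("A", "B", "x", "y", "z"),
  Birth_Year   = c(1950, 1960, 1952, 1949, 1961),
  Gender       = c("F", "M", "F", "F", "M"),
  Practice_Id  = c(10, 20, 10, 10, 20),
  case_control = c("case", "case", "control", "control", "control"),
  stringsAsFactors = FALSE
)

cases    <- raw[raw$case_control == "case", ]
controls <- raw[raw$case_control == "control", ]

# Pair every case with the controls that share its Gender and Practice_Id
# (hypothetical pairing variables chosen for this sketch).
pairs <- merge(cases, controls, by = c("Gender", "Practice_Id"),
               suffixes = c("_case", "_control"))

# Absolute difference in birth year, used here as a proxy for age difference.
pairs$age_diff <- abs(pairs$Birth_Year_case - pairs$Birth_Year_control)

# Closest candidates first within each case.
pairs <- pairs[order(pairs$Patient_Id_case, pairs$age_diff), ]
print(pairs[, c("Patient_Id_case", "Patient_Id_control", "age_diff")])
```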
optimal_matching performs the optimal match between cases and controls in an iterative and computationally efficient way
optimal_matching( total_database, n_con, cluster_var, Id_Patient, total_cont_per_case, case_control, with_replacement = FALSE )
total_database | a data frame that contains the cases and controls |
n_con | number of controls to be matched to each case |
cluster_var | a variable that groups one case with all available controls to be pooled |
Id_Patient | Id of the patient |
total_cont_per_case | total number of controls that are available for each case |
case_control | a variable containing "case" and "control" |
with_replacement | logical; if TRUE, controls are matched with replacement. Defaults to FALSE |
a data frame containing the cases and the corresponding number of controls
optimal_matching(being_processed, n_con=2, cluster_var=cluster_case, Id_Patient=Patient_Id, total_cont_per_case=total_control_per_case, case_control = case_control)
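A quick sanity check on a matched result might look like the sketch below. The helper `check_matched` is hypothetical and not part of the package. It assumes that the returned data frame keeps the `cluster_case`, `Patient_Id` and `case_control` columns, and that matching without replacement means no control is used for more than one case. The example runs on an invented data frame so that it does not depend on the package being installed.

```r
# Hypothetical helper: TRUE when every cluster has exactly one case and at
# most n_con controls, and (without replacement) no control is reused.
check_matched <- function(matched, n_con, with_replacement = FALSE) {
  is_case    <- matched$case_control == "case"
  is_control <- matched$case_control == "control"

  n_cases    <- tapply(is_case, matched$cluster_case, sum)
  n_controls <- tapply(is_control, matched$cluster_case, sum)

  # Without replacement, each control Id should appear only once overall.
  no_reuse <- with_replacement ||
    anyDuplicated(matched$Patient_Id[is_control]) == 0

  all(n_cases == 1) && all(n_controls <= n_con) && no_reuse
}

# Invented result in the assumed shape of the returned data frame.
matched_toy <- data.frame(
  cluster_case = c(1, 1, 1, 2, 2),
  Patient_Id   = c("A", "x", "y", "B", "z"),
  case_control = c("case", "control", "control", "case", "control"),
  stringsAsFactors = FALSE
)

check_matched(matched_toy, n_con = 2)
```

The same call could be applied to the output of `optimal_matching`, provided the column names in the result match the assumptions above.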