LARisk: An R package for Lifetime Attributable Risk Calculation

Introduction

The R package LARisk computes the lifetime attributable risk (LAR) of radiation-induced cancer and adds flexibility to research on projected risks of radiation-associated cancers. LARisk produces LAR estimates under various options or arguments. It also handles large data sets easily and computes LAR values by group, such as occupation, sex, or age group, which can support research on radiation-associated cancer risk.


This document provides a detailed description of the LARisk package with some examples. If the package is installed, then we can load it into an R session by

library(LARisk)



Arguments of the LAR function

The LARisk package has three main functions for estimating lifetime attributable risk: LAR, LAR_batch, and LAR_group. LAR is the basic function that computes LAR values for an individual. The other two are extensions that handle large batches of data and calculate LAR estimates by group. Each function is described in Functions for estimating LAR.

LAR(data, basedata, sim=300, seed=99, current=as.numeric(substr(Sys.Date(),1,4)),
    ci=0.9, weight=NULL, DDREF=TRUE, basepy=1e+05)

The following table shows the arguments of the LAR function.

Arguments Description
data A data frame containing demographic and exposure information
basedata A list of data of lifetime and incidence rate tables
sim A scalar for the number of iterations
seed A scalar for a random seed number
current A scalar for a current year
ci A scalar for confidence level to compute confidence intervals for LAR estimates
weight A list containing values on [0,1] to compute LAR values based on ERR and EAR models for each cancer site
DDREF Logical. Whether to apply the dose and dose-rate effectiveness factor for chronic exposure
basepy A scalar for the number of base person-years

data

The data should contain the prerequisite information: sex (sex), birth year (birth), exposure year (exposure), the distribution of the exposed dose (dosedist), the fixed exposed dose or the parameters of the dose distribution (dose1, dose2, dose3), the exposed site (site), and the exposure rate (exposure_rate). The variables in data must be named exactly as written here.

The following table lists the essential variables of the argument data.

Variables Format
sex one of the character strings ‘male’ or ‘female’
birth numeric
exposure numeric
site one of the character strings ‘stomach’, ‘colon’, ‘liver’, ‘lung’, ‘breast’, ‘ovary’, ‘uterus’, ‘prostate’, ‘bladder’, ‘brain/cns’, ‘thyroid’, ‘remainder’, ‘oral’, ‘oesophagus’, ‘rectum’, ‘gallbladder’, ‘pancreas’, ‘kidney’, ‘leukemia’
exposure_rate one of the character strings ‘chronic’ or ‘acute’
dosedist one of the character strings ‘fixedvalue’, ‘lognormal’, ‘normal’, ‘triangular’, ‘logtriangular’, ‘uniform’, ‘loguniform’
dose1 numeric
dose2 numeric
dose3 numeric

Because LAR computes the risk for one person, all rows of data must share the same sex and birth values. Also, because exposure must occur after birth, exposure should be larger than birth.


ex_data <- data.frame(sex = 'male', birth = 1900, exposure = 1980,
                  site = 'stomach', exposure_rate = "chronic",
                  dosedist = 'fixedvalue', dose1 = 10, dose2=NA, dose3=NA)

LAR(ex_data, basedata=list(life2010, incid2010)) ## error
#> Error in check_data(data, current): Age is not allowed to be greater than 100 years.

The maximum age in the function is set to 100 years. If the data contain a birth year that makes the attained age exceed 100, an error occurs.
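The rules stated in this section can be checked before calling LAR. The following is a minimal sketch in base R, assuming a hypothetical helper named check_person_data. It is not part of LARisk, which runs its own checks internally (as the error message above shows), and it covers only the rules described here.

```r
# Hypothetical pre-check for one person's data; not a LARisk function.
check_person_data <- function(data,
                              current = as.numeric(format(Sys.Date(), "%Y"))) {
  required <- c("sex", "birth", "exposure", "site", "exposure_rate",
                "dosedist", "dose1", "dose2", "dose3")
  missing_cols <- setdiff(required, names(data))
  if (length(missing_cols) > 0) {
    # Without the required columns the remaining checks cannot run.
    return(paste("missing columns:", paste(missing_cols, collapse = ", ")))
  }
  problems <- character(0)
  if (length(unique(data$sex)) != 1 || length(unique(data$birth)) != 1)
    problems <- c(problems, "sex and birth must be identical in all rows")
  if (any(data$exposure <= data$birth))
    problems <- c(problems, "exposure must be larger than birth")
  if (any(current - data$birth > 100))
    problems <- c(problems, "attained age must not exceed 100")
  problems  # character(0) means that no problem was found
}

ex_data <- data.frame(sex = "male", birth = 1900, exposure = 1980,
                      site = "stomach", exposure_rate = "chronic",
                      dosedist = "fixedvalue", dose1 = 10, dose2 = NA, dose3 = NA)
check_person_data(ex_data, current = 2021)
```

The helper returns the list of violated rules instead of stopping, so that all problems in a data set can be reviewed at once.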


For site, we put the irradiated organ site or cancer site. LAR estimates excess cases for the sites ‘stomach’, ‘colon’, ‘liver’, ‘lung’, ‘breast’, ‘ovary’, ‘uterus’, ‘prostate’, ‘bladder’, ‘brain/cns’, ‘thyroid’, ‘remainder’, ‘oral’, ‘oesophagus’, ‘rectum’, ‘gallbladder’, ‘pancreas’, ‘kidney’, and ‘leukemia’. The sites that are applicable in LAR differ by sex: for male, ‘breast’, ‘ovary’, and ‘uterus’ are not allowed, and for female, ‘prostate’ is not allowed.


In dosedist, we specify the distribution of the exposed dose. It can be ‘fixedvalue’, ‘lognormal’, ‘normal’, ‘triangular’, ‘logtriangular’, ‘uniform’, or ‘loguniform’. Each distribution requires its own parameters. For instance, if the exposed dose has a normal distribution with mean 2.3 and standard deviation 0.8, we input dose1=2.3, dose2=0.8, and dose3=NA. If the dose has the fixed value 3.2, we set dose1=3.2, dose2=NA, and dose3=NA.

dose distribution   dose1     dose2                          dose3
fixedvalue          value     NA                             NA
lognormal           median    geometric standard deviation   NA
normal              mean      standard deviation             NA
triangular          minimum   mode                           maximum
logtriangular       minimum   mode                           maximum
uniform             minimum   maximum                        NA
loguniform          minimum   maximum                        NA
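To make the meaning of dose1, dose2, and dose3 concrete, the sketch below draws doses from each distribution in base R. The function sample_dose is hypothetical and uses the usual textbook parameterizations (for example, a lognormal with median m and geometric standard deviation g has meanlog = log(m) and sdlog = log(g)); whether LARisk samples doses in exactly this way is an assumption.

```r
# Illustrative sampler for the parameterizations in the table above.
# Hypothetical helper; the internal sampling of LARisk may differ.
sample_dose <- function(n, dosedist, dose1, dose2 = NA, dose3 = NA) {
  # Inverse-CDF sampler for a triangular distribution on [a, b] with mode m.
  rtri <- function(n, a, m, b) {
    u <- runif(n)
    p_mode <- (m - a) / (b - a)
    ifelse(u < p_mode,
           a + sqrt(u * (b - a) * (m - a)),
           b - sqrt((1 - u) * (b - a) * (b - m)))
  }
  switch(dosedist,
    fixedvalue    = rep(dose1, n),
    lognormal     = rlnorm(n, meanlog = log(dose1), sdlog = log(dose2)),
    normal        = rnorm(n, mean = dose1, sd = dose2),
    triangular    = rtri(n, dose1, dose2, dose3),
    logtriangular = exp(rtri(n, log(dose1), log(dose2), log(dose3))),
    uniform       = runif(n, min = dose1, max = dose2),
    loguniform    = exp(runif(n, min = log(dose1), max = log(dose2))),
    stop("unknown dosedist: ", dosedist)
  )
}

set.seed(99)
sample_dose(5, "normal", dose1 = 2.3, dose2 = 0.8)  # mean 2.3, sd 0.8
sample_dose(5, "fixedvalue", dose1 = 3.2)           # always 3.2
```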


basedata

LAR and the other extended functions need a lifetime table and a cancer incidence rate table. We pass these tables to the argument basedata as a list whose first element is the lifetime table and whose second element is the cancer incidence rate table.

LAR(data,
    basedata = list("the first is lifetime table", "the second is cancer incidence rate table"))

LARisk includes these tables which were made in 2010 and 2018 in Korea: life2010, incid2010, life2018 and incid2018. Thus we can estimate the risk for the Korean population in 2010 or 2018 using these tables.

If we want to estimate the risks of the other population, we’ll need the lifetime and cancer incidence rate tables of the population. Similar to data, lifetime and cancer incidence rate tables must follow the specified format.

head(life2010)      ## lifetime table of the Korean in 2010.
#>   Age Prob_d_m Prob_d_f
#> 1   0  0.00369  0.00275
#> 2   1  0.00032  0.00030
#> 3   2  0.00025  0.00022
#> 4   3  0.00018  0.00015
#> 5   4  0.00015  0.00011
#> 6   5  0.00013  0.00009

The columns of a lifetime table are ‘Age’, ‘Prob_d_m’, and ‘Prob_d_f’. Prob_d_m and Prob_d_f are the probabilities of death for males and females, respectively.

head(incid2010)     ## cancer incidence rate table of the Korean in 2010.
#>   Site Age Rate_m Rate_f
#> 1 oral   0    0.2    0.1
#> 2 oral   1    0.2    0.1
#> 3 oral   2    0.2    0.1
#> 4 oral   3    0.2    0.1
#> 5 oral   4    0.2    0.1
#> 6 oral   5    0.2    0.2

Similarly, the columns of a cancer incidence rate table are ‘Site’, ‘Age’, ‘Rate_m’, and ‘Rate_f’. Rate_m and Rate_f are the incidence rates of each cancer site for males and females, respectively. Both tables should cover ages 0 to 100 in one-year steps.
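When preparing tables for another population, a quick format check can catch layout mistakes early. The sketch below uses a hypothetical helper, check_base_tables, which is not part of LARisk and checks only the column names and the age range described above. The numbers in my_life and my_incid are placeholders that show the layout; they are not real rates.

```r
# Hypothetical format check for user-supplied base tables; not part of LARisk.
check_base_tables <- function(life, incid) {
  stopifnot(
    all(c("Age", "Prob_d_m", "Prob_d_f") %in% names(life)),
    all(c("Site", "Age", "Rate_m", "Rate_f") %in% names(incid)),
    identical(as.integer(life$Age), 0:100)
  )
  # Every site must cover ages 0 to 100, with one row per year of age.
  ages_ok <- tapply(incid$Age, incid$Site,
                    function(a) identical(sort(as.integer(a)), 0:100))
  stopifnot(all(ages_ok))
  invisible(TRUE)
}

# Placeholder values only, to show the layout; they are not real rates.
my_life  <- data.frame(Age = 0:100, Prob_d_m = 0.01, Prob_d_f = 0.01)
my_incid <- data.frame(Site = rep(c("stomach", "colon"), each = 101),
                       Age = rep(0:100, times = 2),
                       Rate_m = 1, Rate_f = 1)
check_base_tables(my_life, my_incid)
```

Tables that pass the check would then be supplied as basedata = list(my_life, my_incid).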


weight

weight is used to estimate LAR as a weighted average of the LAR estimates based on the ERR and EAR models. It is a list whose element names are cancer sites and whose values are the weights for those sites. For example, to set the weight for stomach cancer to 0.5, run the code below.

LAR(data, basedata, weight=list(stomach = 0.5))

The weight is the share given to the ERR-based estimate. By default, LAR sets the weight to 0.7 for most cancers. For lung cancer the weight is 0.3. Breast cancer uses only the EAR model (weight 0), while thyroid, gallbladder, and brain/CNS cancers use only the ERR model (weight 1); see the table below.

Cancer site LAR_ERR LAR_EAR weight
Most cancer 70% 30% 0.7
Lung 30% 70% 0.3
Breast 0% 100% 0.0
Thyroid 100% 0% 1.0
Gallbladder 100% 0% 1.0
Brain/CNS 100% 0% 1.0
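The effect of the weight can be illustrated with a small calculation. The sketch below assumes a plain arithmetic weighted average, as the percentages in the table suggest; the function combine_lar is hypothetical and the input values are placeholders chosen to make the arithmetic easy to follow.

```r
# Arithmetic weighted average of the two model-based estimates (assumed form).
# `weight` is the share given to the ERR-based estimate.
combine_lar <- function(lar_err, lar_ear, weight) {
  stopifnot(weight >= 0, weight <= 1)
  weight * lar_err + (1 - weight) * lar_ear
}

# Placeholder inputs, not results of any real calculation.
combine_lar(lar_err = 10, lar_ear = 20, weight = 0.7)  # most cancers
combine_lar(lar_err = 10, lar_ear = 20, weight = 0.0)  # breast: EAR only
combine_lar(lar_err = 10, lar_ear = 20, weight = 1.0)  # thyroid: ERR only
```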

DDREF

DDREF (dose and dose-rate effectiveness factor) is a logical option that selects whether DDREF is applied in the LAR calculation. DDREF modifies the effect of exposure, especially for low-dose exposure, and it is applied differently depending on the exposure rate. However, if the site is leukemia, DDREF does not apply even if DDREF = TRUE.

ex_data <- data.frame(sex = 'male', birth = 1990, exposure = 2015,
                  site = 'leukemia', exposure_rate = "chronic",
                  dosedist = 'fixedvalue', dose1 = 10, dose2=NA, dose3=NA)

LAR(ex_data, basedata=list(life2010, incid2010), DDREF=TRUE)
#> LAR: 
#>   Lower    Mean   Upper 
#>  2.6587  5.9803 13.4514 
#> 
#> Future LAR: 
#>          Lower     Mean    Upper
#> F.LAR   2.4456   5.4806  12.2823
#> BFR   530.4766 530.4766 530.4766
#> TFR   532.9221 535.9572 542.7589
#> ---
LAR(ex_data, basedata=list(life2010, incid2010), DDREF=FALSE) ## the results are the same
#> LAR: 
#>   Lower    Mean   Upper 
#>  2.6587  5.9803 13.4514 
#> 
#> Future LAR: 
#>          Lower     Mean    Upper
#> F.LAR   2.4456   5.4806  12.2823
#> BFR   530.4766 530.4766 530.4766
#> TFR   532.9221 535.9572 542.7589
#> ---

other arguments

seed is the random seed number. As long as the same seed is provided, we obtain the same result every time. sim is the number of simulation runs. As sim grows, the computation takes longer, but the simulation variation becomes smaller; that is, with a large sim, different seeds yield similar outcomes. In LARisk, the default is sim=300. basepy is the number of baseline person-years, such as 10,000 or 100,000 person-years.

LAR(data, basedata, seed=1111)    ## changing the seed number also changes the result
LAR(data, basedata, sim=1000)     ## a large 'sim' gives a more stable simulation result
LAR(data, basedata, basepy=1e+03) ## setting the baseline person-years to 1000
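The roles of seed and sim can be seen in a toy Monte Carlo experiment. The sketch below does not use the LAR model; toy_estimate is a hypothetical stand-in that averages sim random draws, which is enough to show that a fixed seed reproduces the result and that a larger sim reduces the variation between seeds.

```r
# Toy Monte Carlo estimate: the mean of `sim` lognormal draws.
# This stands in for a simulation-based estimate; it is not the LAR model.
toy_estimate <- function(sim, seed) {
  set.seed(seed)
  mean(rlnorm(sim, meanlog = 0, sdlog = 1))
}

# Same seed and same sim: the result is reproduced exactly.
identical(toy_estimate(300, seed = 99), toy_estimate(300, seed = 99))

# Spread of the estimate across 50 different seeds, for two values of sim.
spread <- function(sim) {
  sd(sapply(1:50, function(s) toy_estimate(sim, seed = s)))
}
spread(300)
spread(30000)  # much smaller spread, at the cost of more computation
```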

current is the year used as the moment of estimation. The default is the current year according to the system time of the computer. To estimate risks as of another year, change this option. We recommend giving the value as a four-digit year.

LAR(data, basedata, current=2019) ## setting the current year to 2019

Changing the current time affects the estimation of future lifetime attributable risk and future baseline risk.


ci is the confidence level of the confidence intervals for the LAR estimates, expressed as a number between 0 and 1. The default value is 0.9; in other words, the LAR function provides 90% confidence intervals by default.

LAR(data, basedata, ci=0.8) ## setting the confidence level to 0.8
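One common way to obtain such limits from simulated values is a percentile interval. The sketch below assumes this construction for illustration; the function percentile_interval is hypothetical, the simulated values are placeholders, and the way LARisk actually computes its limits is not described in this document.

```r
# Percentile interval at confidence level `ci` from simulated values.
# Assumed construction, shown for illustration only.
percentile_interval <- function(x, ci = 0.9) {
  alpha <- 1 - ci
  q <- quantile(x, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)
  c(Lower = q[1], Mean = mean(x), Upper = q[2])
}

set.seed(99)
sims <- rlnorm(300)                  # placeholder simulated values
percentile_interval(sims, ci = 0.9)  # 5th and 95th percentiles
percentile_interval(sims, ci = 0.8)  # 10th and 90th percentiles, narrower
```

With skewed simulated values, the mean need not lie midway between the two limits.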



Functions for estimating LAR

As mentioned above, the package LARisk includes three main functions, LAR, LAR_batch, and LAR_group, which estimate LAR values for various cases. Each function gives three kinds of estimates: lifetime risk, future risk, and lifetime baseline risk. LAR and F_LAR hold the LAR and future LAR estimates, with lower and upper confidence limits, for each cancer site, for solid cancers, and in total.

We will use the toy example data ‘nuclear’ in this section, which is simulated with the assumption that all people are exposed to radiation at the same time (Details on this data are in “APPENDIX: Datasets in LARisk”).

LAR: the function of estimating LAR for one person

LAR is the function to estimate LAR for one person. It returns an object of class LAR. LAR class contains the risks of the person, information of the person (gender and birth-year), and some options for calculating risks. The following is the table of components in the LAR object.

Values Description
LAR Lifetime attributable risk (LAR) from the time of exposure to the end of the expected lifetime
F_LAR Future attributable risk from current to the expected lifetime
LBR Lifetime baseline risk
BFR Baseline future risk
LFR Lifetime fractional risk
TFR Total future risk
current Current year
ci Confidence level
pinfo Information of the person

nuclear1 <- nuclear[nuclear$ID=="ID01",]

print(nuclear1)
#>     ID    sex birth exposure       site exposure_rate   dosedist    dose1 dose2
#> 1 ID01 female  1973     2011      ovary         acute fixedvalue 50.06989    NA
#> 2 ID01 female  1973     2011 oesophagus         acute fixedvalue 50.37462    NA
#> 3 ID01 female  1973     2011    bladder         acute fixedvalue 52.46040    NA
#> 4 ID01 female  1973     2011       lung         acute fixedvalue 55.69177    NA
#> 5 ID01 female  1973     2011  remainder         acute fixedvalue 51.64678    NA
#> 6 ID01 female  1973     2011     rectum         acute fixedvalue 49.37011    NA
#> 7 ID01 female  1973     2011    thyroid         acute fixedvalue 54.14875    NA
#>   dose3 distance
#> 1    NA        1
#> 2    NA        1
#> 3    NA        1
#> 4    NA        1
#> 5    NA        1
#> 6    NA        1
#> 7    NA        1

LAR(nuclear1, basedata = list(life2010, incid2010))
#> LAR: 
#>     Lower      Mean     Upper 
#>  359.9479  671.3751 1203.8111 
#> 
#> Future LAR: 
#>            Lower       Mean      Upper
#> F.LAR   321.1756   547.7289   926.4876
#> BFR   15038.6045 15038.6045 15038.6045
#> TFR   15359.7802 15586.3335 15965.0921
#> ---

Printing a LAR object shows the total LAR, total future LAR, total baseline future risk, and total future risk. For more detailed results, use the summary function.

summary(LAR(nuclear1, basedata = list(life2010, incid2010)))
#> Information: 
#>     sex birth
#>  female  1973
#> 
#> LAR: 
#>               Lower     Mean     Upper        LBR    LFR
#> lung        74.5065 156.5247  241.1717  3630.3464 0.0431
#> ovary        4.4880  13.3756   27.6491   697.6095 0.0192
#> bladder     13.0799  30.4405   58.7444   452.0590 0.0673
#> thyroid     83.9986 368.7603  877.5446  7318.8901 0.0504
#> remainder   33.5177  89.1794  169.3327  4237.3935 0.0210
#> oesophagus   0.2000   4.0450   10.4666   116.7920 0.0346
#> rectum       0.2670   9.0496   21.7567  2157.6294 0.0042
#> leukemia     0.0000   0.0000    0.0000     0.0000    NaN
#> solid      359.9479 671.3751 1203.8111 18610.7199 0.0361
#> total      359.9479 671.3751 1203.8111 18610.7199 0.0361
#> 
#> Future LAR: 
#>               Lower     Mean    Upper        BFR        TFR
#> lung        72.5943 152.5667 235.0454  3529.8402  3682.4069
#> ovary        3.8388  11.2291  24.1188   557.0741   568.3032
#> bladder     12.9023  30.0419  57.9653   450.8342   480.8761
#> thyroid     58.8575 259.8668 638.9150  4383.3358  4643.2027
#> remainder   31.1815  81.8748 154.7518  3987.9538  4069.8286
#> oesophagus   0.1963   3.8641  10.0477   115.1383   119.0024
#> rectum       0.2484   8.2855  19.8152  2014.4280  2022.7136
#> leukemia     0.0000   0.0000   0.0000     0.0000     0.0000
#> solid      321.1756 547.7289 926.4876 15038.6045 15586.3335
#> total      321.1756 547.7289 926.4876 15038.6045 15586.3335
#> 
#> Confidence Level: 0.9
#> Current Year: 2025
#> ---

The summary function provides the person’s gender and year of birth, the risks by cancer type, the confidence level, and the current year. In the summary results, the LAR tab includes the site-specific LAR, lifetime baseline risk (LBR), and lifetime fractional risk (LFR). The Future LAR tab contains the site-specific future LAR, baseline future risk (BFR), and total future risk (TFR).
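The printed values are consistent with LFR being the ratio of LAR to LBR and TFR being the sum of the future LAR and BFR. This reading is inferred from the output above rather than from the package source. The sketch below repeats the arithmetic for the lung row, using the mean estimates copied from the summary.

```r
# Values copied from the summary output above (lung, mean estimates).
lar   <- 156.5247   # LAR
lbr   <- 3630.3464  # lifetime baseline risk (LBR)
f_lar <- 152.5667   # future LAR
bfr   <- 3529.8402  # baseline future risk (BFR)

round(lar / lbr, 4)    # compare with the LFR column for lung
round(f_lar + bfr, 4)  # compare with the TFR column for lung

# When both LAR and LBR are zero, the ratio is NaN, as in the leukemia row.
0 / 0
```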


LAR_batch: the function of estimating LAR for several people

If you want to consider more than one person, you can call LAR for each of them. For a large number of people, however, the LAR_batch function is more useful. Unlike LAR, it reads the data of multiple people at once and then calculates the risks of each person.

Since data contains more than one person, the function requires an argument to distinguish each person. pid is the argument, which is a vector to distinguish each person in the dataset. For example, suppose that we want to calculate LAR estimates of several people in the nuclear dataset. Since the variable “ID” is the person ID for this data, we can estimate the LAR values as follows.

ex_batch <- LAR_batch(nuclear, pid=nuclear$ID, basedata = list(life2010, incid2010))

class(ex_batch)
#> [1] "LAR_batch" "LAR"

class(ex_batch[[1]])
#> [1] "LAR"

LAR_batch returns an object of class LAR_batch. It is a list of LAR class objects whose element names are the IDs of the people; that is, each element of a LAR_batch object is a LAR object. Thus, printing the result of LAR_batch is similar to printing the result of LAR.

print(ex_batch, max.id=3)
#> LAR result of ID01 
#> 
#> LAR: 
#>     Lower      Mean     Upper 
#>  359.9479  671.3751 1203.8111 
#> 
#> Future LAR: 
#>            Lower       Mean      Upper
#> F.LAR   321.1756   547.7289   926.4876
#> BFR   15038.6045 15038.6045 15038.6045
#> TFR   15359.7802 15586.3335 15965.0921
#> ---
#> 
#> LAR result of ID02 
#> 
#> LAR: 
#>    Lower     Mean    Upper 
#> 308.6326 532.0084 758.9928 
#> 
#> Future LAR: 
#>            Lower       Mean      Upper
#> F.LAR   290.4604   508.7879   714.8052
#> BFR   24619.7282 24619.7282 24619.7282
#> TFR   24910.1886 25128.5161 25334.5334
#> ---
#> 
#> LAR result of ID03 
#> 
#> LAR: 
#>     Lower      Mean     Upper 
#>  791.5664 1241.2319 1734.6870 
#> 
#> Future LAR: 
#>            Lower      Mean     Upper
#> F.LAR   762.0682  1201.938  1672.642
#> BFR   20323.9982 20323.998 20323.998
#> TFR   21086.0664 21525.937 21996.640
#> ---
#> 
#> The results for 17 people are omitted.

For the minimal results, use print, which also runs by default when the LAR_batch object is simply called. The max.id option controls the maximum number of people whose results are printed (default is 50).


Similarly, using the summary, you can get more detailed results. The result of the function is the same as listing the summary of each person.

summary(ex_batch, max.id=3)
#> summaries of LAR result :  ID01 
#> 
#> Information: 
#>     sex birth
#>  female  1973
#> 
#> LAR: 
#>               Lower     Mean     Upper        LBR    LFR
#> lung        74.5065 156.5247  241.1717  3630.3464 0.0431
#> ovary        4.4880  13.3756   27.6491   697.6095 0.0192
#> bladder     13.0799  30.4405   58.7444   452.0590 0.0673
#> thyroid     83.9986 368.7603  877.5446  7318.8901 0.0504
#> remainder   33.5177  89.1794  169.3327  4237.3935 0.0210
#> oesophagus   0.2000   4.0450   10.4666   116.7920 0.0346
#> rectum       0.2670   9.0496   21.7567  2157.6294 0.0042
#> leukemia     0.0000   0.0000    0.0000     0.0000    NaN
#> solid      359.9479 671.3751 1203.8111 18610.7199 0.0361
#> total      359.9479 671.3751 1203.8111 18610.7199 0.0361
#> 
#> Future LAR: 
#>               Lower     Mean    Upper        BFR        TFR
#> lung        72.5943 152.5667 235.0454  3529.8402  3682.4069
#> ovary        3.8388  11.2291  24.1188   557.0741   568.3032
#> bladder     12.9023  30.0419  57.9653   450.8342   480.8761
#> thyroid     58.8575 259.8668 638.9150  4383.3358  4643.2027
#> remainder   31.1815  81.8748 154.7518  3987.9538  4069.8286
#> oesophagus   0.1963   3.8641  10.0477   115.1383   119.0024
#> rectum       0.2484   8.2855  19.8152  2014.4280  2022.7136
#> leukemia     0.0000   0.0000   0.0000     0.0000     0.0000
#> solid      321.1756 547.7289 926.4876 15038.6045 15586.3335
#> total      321.1756 547.7289 926.4876 15038.6045 15586.3335
#> 
#> Confidence Level: 0.9
#> Current Year: 2025
#> ---
#> 
#> summaries of LAR result :  ID02 
#> 
#> Information: 
#>   sex birth
#>  male  1981
#> 
#> LAR: 
#>                Lower     Mean    Upper        LBR     LFR
#> colon        91.2117 191.6816 305.6734  4478.9565  0.0428
#> lung         74.4763 168.8631 279.8595  9283.1503  0.0182
#> prostate    -86.4482  25.0355 145.8765  5267.9372  0.0048
#> thyroid      22.3790  96.6625 222.4750  1752.7121  0.0552
#> oral          3.5526  17.3279  37.6939   910.2072  0.0190
#> gallbladder -47.0215  -3.5972  35.1159  1660.5321 -0.0022
#> pancreas      7.0824  36.0351  75.1721  1444.6747  0.0249
#> leukemia      0.0000   0.0000   0.0000     0.0000     NaN
#> solid       308.6326 532.0084 758.9928 24798.1701  0.0215
#> total       308.6326 532.0084 758.9928 24798.1701  0.0215
#> 
#> Future LAR: 
#>                Lower     Mean    Upper        BFR        TFR
#> colon        89.5358 187.5921 299.5779  4486.1536  4673.7458
#> lung         74.5921 169.2339 280.5659  9421.9911  9591.2250
#> prostate    -87.7125  25.3151 147.5182  5375.1836  5400.4987
#> thyroid      17.7564  78.4418 176.4609  1298.3212  1376.7630
#> oral          3.2549  16.1630  35.3841   895.4369   911.5999
#> gallbladder -47.1338  -3.6042  35.1967  1683.8350  1680.2308
#> pancreas      6.9058  35.6461  74.5706  1458.8067  1494.4529
#> leukemia      0.0000   0.0000   0.0000     0.0000     0.0000
#> solid       290.4604 508.7879 714.8052 24619.7282 25128.5161
#> total       290.4604 508.7879 714.8052 24619.7282 25128.5161
#> 
#> Confidence Level: 0.9
#> Current Year: 2025
#> ---
#> 
#> summaries of LAR result :  ID03 
#> 
#> Information: 
#>   sex birth
#>  male  1988
#> 
#> LAR: 
#>               Lower      Mean     Upper       LBR    LFR
#> stomach    282.2981  484.3981  719.4266 10050.485 0.0482
#> prostate  -209.3797   60.6094  306.9072  5240.937 0.0116
#> remainder  408.0930  696.2245 1095.6493  5028.838 0.1384
#> leukemia     0.0000    0.0000    0.0000     0.000    NaN
#> solid      791.5664 1241.2319 1734.6870 20320.261 0.0611
#> total      791.5664 1241.2319 1734.6870 20320.261 0.0611
#> 
#> Future LAR: 
#>               Lower      Mean     Upper       BFR       TFR
#> stomach    277.4963  477.3002  708.9757 10096.193 10573.494
#> prostate  -211.9092   60.9461  309.3497  5305.135  5366.081
#> remainder  387.4366  663.6922 1042.7133  4922.670  5586.362
#> leukemia     0.0000    0.0000    0.0000     0.000     0.000
#> solid      762.0682 1201.9384 1672.6416 20323.998 21525.937
#> total      762.0682 1201.9384 1672.6416 20323.998 21525.937
#> 
#> Confidence Level: 0.9
#> Current Year: 2025
#> ---
#> 
#> The results for 17 people are omitted.
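The structure of a LAR_batch result, a list with one element per person named by ID, matches the familiar split-and-apply pattern in base R. The sketch below shows that pattern on toy data with placeholder doses. The function per_person is a hypothetical stand-in for the per-person calculation, and LAR_batch itself may be implemented differently.

```r
# Toy data: two people, several exposure records each (placeholder doses).
toy <- data.frame(ID    = c("ID01", "ID01", "ID02", "ID02", "ID02"),
                  site  = c("lung", "colon", "lung", "oral", "rectum"),
                  dose1 = c(1.0, 2.0, 0.5, 0.5, 1.5))

# Stand-in for a per-person calculation; a real run would call LAR() here.
per_person <- function(d) c(records = nrow(d), total_dose = sum(d$dose1))

# Split the rows by person ID, then apply the calculation to each piece.
by_id  <- split(toy, toy$ID)
result <- lapply(by_id, per_person)

names(result)     # one element per person, named by ID
result[["ID02"]]  # access the result of one person
```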


LAR_group: the function of averaging estimated LAR by group

The function LAR_group averages the calculated risks by group. It offers grouped LAR, grouped future LAR, and grouped baseline risk values based on the simulated values for each person. The simulated LAR values of the people in a group are combined into new LAR values for that group, and these new values are then summarized to give the LAR estimates for the group.

This function requires not only the value distinguishing the person but also the value for the group. group is the vector or list that groups the data. The function returns the LAR_group class object which is the form of a list of LAR class objects.


Suppose that we want to estimate the average LAR of the people in the nuclear dataset by distance. Then we can put group=nuclear$distance in LAR_group.

ex_group1 <- LAR_group(nuclear, pid = nuclear$ID, group = nuclear$distance,
                       basedata = list(life2010, incid2010))
summary(ex_group1)
#> summaries of LAR result : Group 1 
#> 
#> Group Information: 
#>     sex count    birth
#>  female    35 1962.600
#>    male    45 1962.222
#> 
#> LAR: 
#>                Lower     Mean    Upper        LBR     LFR
#> stomach      50.9850  66.2373  87.8936  2926.1714  0.0226
#> colon        43.4301  56.1274  72.3353  2160.0832  0.0260
#> liver         7.0657  10.4933  15.2987   808.9772  0.0130
#> lung         39.1538  53.4791  70.6998  3097.9930  0.0173
#> breast        2.8636   4.2821   6.0646   331.2576  0.0129
#> ovary         0.3206   0.9554   1.9749    49.8292  0.0192
#> uterus        0.0216   0.1491   0.3288    68.4665  0.0022
#> prostate    -15.4917   6.2743  27.6037   960.8121  0.0065
#> bladder      10.5122  16.1598  23.8353   430.7553  0.0375
#> brain/cns     0.7414   1.3918   2.2998    46.3592  0.0300
#> thyroid      67.7618 185.2396 350.2187  1648.0018  0.1124
#> remainder    34.5468  56.8793  88.6948   912.3992  0.0623
#> oral          0.3038   1.3365   2.7663   105.2675  0.0127
#> oesophagus    2.7255   5.3313   8.1798   208.5186  0.0256
#> rectum        0.3667   1.3777   2.5946   618.5623  0.0022
#> gallbladder  -6.2708  -0.9336   4.2742   420.9955 -0.0022
#> pancreas      3.4377   7.4064  11.9408   493.6795  0.0150
#> kidney        1.0530   3.2226   6.5984   203.1165  0.0159
#> leukemia      0.0713   0.2012   0.5677    19.9927  0.0101
#> solid       339.2210 475.4092 659.3305 15491.2456  0.0307
#> total       339.4012 475.6104 659.5318 15511.2384  0.0307
#> 
#> Future LAR: 
#>                Lower     Mean    Upper        BFR        TFR
#> stomach      46.4503  61.4748  82.4859  2369.1288  2430.6036
#> colon        39.1060  51.8882  68.4563  1827.8028  1879.6909
#> liver         5.4983   8.2791  12.1746   509.4548   517.7339
#> lung         37.7782  51.8490  69.3206  2815.4823  2867.3312
#> breast        2.0931   3.1704   4.5034   124.3365   127.5069
#> ovary         0.2742   0.8021   1.7228    39.7910    40.5931
#> uterus        0.0185   0.1124   0.2367    35.1988    35.3112
#> prostate    -15.6731   6.2053  27.6449   864.8368   871.0422
#> bladder      10.2254  15.9165  23.6161   378.5707   394.4872
#> brain/cns     0.5385   0.9752   1.5952    32.5980    33.5732
#> thyroid      54.0296 159.2511 307.1434  1101.5100  1260.7611
#> remainder    32.8132  53.7157  83.0856   807.5121   861.2278
#> oral          0.2438   1.2110   2.5980    83.9342    85.1452
#> oesophagus    2.2481   4.0510   5.9571   151.9243   155.9753
#> rectum        0.3149   1.2371   2.3627   498.7383   499.9754
#> gallbladder  -5.8033  -0.8520   4.0685   348.0588   347.2068
#> pancreas      3.1216   6.9651  11.3113   437.1109   444.0761
#> kidney        0.9314   3.0793   6.4746   178.0941   181.1734
#> leukemia      0.0702   0.1217   0.2107    11.6943    11.8160
#> solid       305.7734 429.3313 595.1314 12604.0832 13033.4144
#> total       305.8951 429.4529 595.2530 12615.7775 13045.2304
#> 
#> Confidence Level: 0.9
#> Current Year: 2025
#> ---
#> 
#> summaries of LAR result : Group 2 
#> 
#> Group Information: 
#>     sex count    birth
#>  female    12 1987.417
#>    male     5 1956.000
#> 
#> LAR: 
#>                 Lower      Mean     Upper       LBR     LFR
#> colon        372.7596  534.3113  750.7850 2319.0697  0.2304
#> liver         33.3218   57.2681   87.5048 1184.1939  0.0484
#> lung         846.3382 1072.5057 1355.1944  903.6865  1.1868
#> breast       222.3765  302.6915  391.8901 1117.4866  0.2709
#> ovary          9.4594   24.1729   47.0888  183.7846  0.1315
#> bladder      142.6668  242.1335  363.4800  559.0715  0.4331
#> brain/cns     21.8762   44.8910   77.9991  154.8744  0.2899
#> thyroid        6.3675   20.9331   49.7293  208.9864  0.1002
#> remainder     70.0662  148.2741  249.0938  996.9884  0.1487
#> oral           4.7167   12.1716   21.1804   85.6799  0.1421
#> rectum        -0.0033   13.9800   30.7953  822.0275  0.0170
#> gallbladder  -30.2879   -3.1930   21.6224  392.5105 -0.0081
#> kidney        13.8521   63.7456  115.0254  108.8816  0.5855
#> leukemia       0.0000    0.0000    0.0000    0.0000     NaN
#> solid       2151.0610 2533.8854 2926.4130 9037.2418  0.2804
#> total       2151.0610 2533.8854 2926.4130 9037.2418  0.2804
#> 
#> Future LAR: 
#>                 Lower      Mean     Upper       BFR        TFR
#> colon        358.5421  518.9878  736.8814 2168.8941  2687.8819
#> liver         26.5901   44.4973   66.4874  803.4050   847.9023
#> lung         846.5952 1072.7409 1355.7720  904.7881  1977.5290
#> breast       212.8543  288.8203  372.4440  958.3232  1247.1435
#> ovary          8.8224   22.1431   43.7064  167.8291   189.9721
#> bladder      138.4590  236.1825  359.2219  515.6799   751.8624
#> brain/cns     17.5738   33.5939   58.7773  131.8720   165.4659
#> thyroid        3.2561   10.8821   26.1895   77.4828    88.3649
#> remainder     59.8172  125.6464  208.2837  847.6342   973.2805
#> oral           4.2359   10.9473   18.9997   80.5077    91.4550
#> rectum         0.0194   10.4333   22.5134  583.8140   594.2473
#> gallbladder  -26.8679   -2.8248   19.0988  355.1633   352.3385
#> kidney        13.8423   63.2177  114.2202  108.0992   171.3170
#> leukemia       0.0000    0.0000    0.0000    0.0000     0.0000
#> solid       2065.0362 2435.2677 2829.0276 7703.4925 10138.7602
#> total       2065.0362 2435.2677 2829.0276 7703.4925 10138.7602
#> 
#> Confidence Level: 0.9
#> Current Year: 2025
#> ---
#> 
#> summaries of LAR result : Group 3 
#> 
#> Group Information: 
#>     sex count birth
#>  female     1  1933
#>    male     2  2004
#> 
#> LAR: 
#>             Lower      Mean     Upper       LBR    LFR
#> colon    1209.822 1769.6169 2469.1052 2222.3599 0.7963
#> oral        6.602   17.1635   29.4681   62.4035 0.2750
#> rectum     13.638  298.1917  594.8688 1749.5705 0.1704
#> leukemia    0.000    0.0000    0.0000    0.0000    NaN
#> solid    1372.524 2084.9721 2843.2003 4034.3339 0.5168
#> total    1372.524 2084.9721 2843.2003 4034.3339 0.5168
#> 
#> Future LAR: 
#>              Lower      Mean     Upper       BFR       TFR
#> colon    1208.5828 1768.1856 2467.4248 2229.7148 3997.9004
#> oral        3.8786    9.8786   16.6260   26.5337   36.4123
#> rectum     13.6128  298.3636  595.3324 1755.7242 2054.0878
#> leukemia    0.0000    0.0000    0.0000    0.0000    0.0000
#> solid    1366.3716 2076.4278 2835.7201 4011.9727 6088.4005
#> total    1366.3716 2076.4278 2835.7201 4011.9727 6088.4005
#> 
#> Confidence Level: 0.9
#> Current Year: 2025
#> ---

The result of LAR_group is similar to that of LAR_batch. The difference is the Group Information tab, which provides the frequency of each gender and the average birth year within the group, instead of the gender and birth year of each individual. The risks are estimates of the average LAR in each group.
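The idea of averaging by group, together with the Group Information tab, can be sketched in base R. The example below is simplified to one placeholder LAR value per person, whereas the package works with the simulated values of each person; all numbers are made up for illustration.

```r
# Placeholder per-person results: one LAR value per person, plus group labels.
persons <- data.frame(ID    = c("ID01", "ID02", "ID03", "ID04", "ID05"),
                      sex   = c("female", "male", "male", "male", "female"),
                      birth = c(1973, 1981, 1985, 1988, 1960),
                      group = c(1, 1, 1, 2, 2),
                      lar   = c(100, 300, 200, 50, 150))

# Average LAR within each group.
group_mean <- tapply(persons$lar, persons$group, mean)
group_mean

# Group information: mean birth year and head count by sex within each group.
info <- aggregate(birth ~ sex + group, data = persons, FUN = mean)
info$count <- aggregate(birth ~ sex + group, data = persons, FUN = length)$birth
info
```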



Write the result in a file

LARisk includes a function that writes the result of LAR, LAR_batch, or LAR_group to a file. write_LAR saves an object of the LAR class family to a CSV file.

write_LAR(x, filename)

In this function, x is the object to save to a CSV file, and filename is the file name or connection to write to. Note that if a CSV file with the same name as filename already exists, it will be overwritten. Therefore, check whether the name is already in use before choosing a file name. In the same way, the result of the LAR_batch function can be saved as a CSV file.
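A small guard can make the check for an existing file explicit. The sketch below uses a hypothetical helper, new_csv_path, built on the base R function file.exists; it is not part of LARisk, and the call to write_LAR is left as a comment because it needs a result object x.

```r
# Hypothetical guard: refuse to reuse a file name that already exists.
new_csv_path <- function(filename) {
  if (file.exists(filename)) {
    stop("file already exists: ", filename)
  }
  filename
}

path <- file.path(tempdir(), "lar_result.csv")  # example location
unlink(path)                                    # make sure the demo starts clean
new_csv_path(path)                              # returns the path unchanged
# write_LAR(x, new_csv_path(path))              # x: a LAR, LAR_batch or LAR_group object
```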


If the object is a LAR class object, the format of the saved file is as follows:

Lower Mean Upper F.Lower F.Mean F.Upper LBR BFR LFR TFR
site-name
solid
total

The function exports a table whose rows are the site names, solid, and total, and whose columns are the risks.


Since the LAR_batch class object is a list of LAR objects, it is difficult to export files in the same form as above. Thus, if the object’s class is LAR_batch, the function saves a file whose values are represented in a horizontal way for each organ, solid, and total.

The LAR case is fairly intuitive, but the LAR_batch case is less simple. The file reserves a place for every organ, and each value from the function is written into its own place. The first column is the person ID (PID), and the number of rows depends on the number of IDs in the data. The remaining columns are generally ordered as (LAR)-(Future LAR)-(Baseline Risk)-(Total Future Risk). LAR and Future LAR are each made up of the lower limit, the upper limit, and the mean. Baseline Risk is made up of the baseline risk at the exposed age, the baseline risk at the attained age, and LFR. The last part is the total future risk for each site. Each of these ten values is given for all organs combined, for all solid cancers, and for each organ, i.e. 21 elements, so the file has a wide shape with 210 columns of values.

If the class of the object is LAR_group, the format of the saved file is the same. In this case, the first column is GROUP instead of PID.



Examples

Now, consider the toy example data organ. This data set has 20 people who were exposed to radiation several times.

head(organ)
#>     ID  sex birth exposure       site exposure_rate   dosedist       dose1
#> 1 ID01 male  1985     2011 oesophagus       chronic fixedvalue 0.001954895
#> 2 ID01 male  1985     2011     kidney       chronic fixedvalue 0.003855487
#> 3 ID01 male  1985     2011     rectum       chronic fixedvalue 0.003855487
#> 4 ID01 male  1985     2011    thyroid       chronic fixedvalue 0.005104447
#> 5 ID01 male  1985     2013 oesophagus       chronic fixedvalue 0.089358392
#> 6 ID01 male  1985     2013     kidney       chronic fixedvalue 0.176234606
#>   dose2 dose3 occup
#> 1    NA    NA     1
#> 2    NA    NA     1
#> 3    NA    NA     1
#> 4    NA    NA     1
#> 5    NA    NA     1
#> 6    NA    NA     1

Assume that we want to calculate the risks with 2021 as the current year. In this example, we calculate the risks using the life table and cancer incidence rates of the Korean population in 2018 (life2018 and incid2018).

First, the estimated risks for the person ID01 are as follows:

organ1 <- organ[organ$ID=='ID01',]
ex_organ1 <- LAR(organ1, basedata=list(life2018, incid2018), current=2021)

ex_organ1
#> LAR: 
#>  Lower   Mean  Upper 
#> 1.1149 1.6981 2.5132 
#> 
#> Future LAR: 
#>           Lower      Mean     Upper
#> F.LAR    1.1132    1.6759    2.4744
#> BFR   6694.6423 6694.6423 6694.6423
#> TFR   6695.7555 6696.3182 6697.1167
#> ---

The estimated LAR of the person ID01 is 1.6981 with a 90% confidence interval of (1.1149, 2.5132). The future LAR is 1.6759 with a 90% confidence interval of (1.1132, 2.4744).
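In the printed output, each TFR value appears to be the sum of BFR and the corresponding F.LAR value. The sketch below checks this using only the numbers printed for ID01; the relationship is inferred from the output shown here, not taken from the package source.

```r
# Values copied from the "Future LAR" block printed for ID01.
f_lar <- c(Lower = 1.1132, Mean = 1.6759, Upper = 2.4744)
bfr   <- 6694.6423
tfr_printed <- c(Lower = 6695.7555, Mean = 6696.3182, Upper = 6697.1167)

# Baseline future risk plus the radiation-attributable future risk.
tfr_check <- bfr + f_lar

# Compare with the printed TFR row, allowing for rounding in the output.
isTRUE(all.equal(tfr_check, tfr_printed, tolerance = 1e-6))
```

Because BFR has no interval of its own, the width of the TFR interval equals the width of the F.LAR interval.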

summary(ex_organ1)
#> Information: 
#>   sex birth
#>  male  1985
#> 
#> LAR: 
#>             Lower   Mean  Upper      LBR   LFR
#> thyroid    0.4673 0.9709 1.8205 1771.543 5e-04
#> oesophagus 0.1025 0.1824 0.2729 1048.947 2e-04
#> rectum     0.0978 0.2385 0.4111 2893.126 1e-04
#> kidney     0.1416 0.3064 0.5160 1338.149 2e-04
#> leukemia   0.0000 0.0000 0.0000    0.000   NaN
#> solid      1.1149 1.6981 2.5132 7051.764 2e-04
#> total      1.1149 1.6981 2.5132 7051.764 2e-04
#> 
#> Future LAR: 
#>             Lower   Mean  Upper      BFR      TFR
#> thyroid    0.4507 0.9512 1.7587 1453.492 1454.443
#> oesophagus 0.1025 0.1823 0.2728 1055.583 1055.766
#> rectum     0.0977 0.2379 0.4108 2877.117 2877.355
#> kidney     0.1406 0.3045 0.5153 1308.450 1308.755
#> leukemia   0.0000 0.0000 0.0000    0.000    0.000
#> solid      1.1132 1.6759 2.4744 6694.642 6696.318
#> total      1.1132 1.6759 2.4744 6694.642 6696.318
#> 
#> Confidence Level: 0.9
#> Current Year: 2021
#> ---

With summary, we get a more detailed report of the result. According to the result, the person ID01 is a man born in 1985 who was exposed to radiation at the thyroid, oesophagus, rectum, and kidney. Since leukemia is not included in this person's data, the results for leukemia are zero.
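The LFR column in this summary is consistent with the ratio of the mean LAR to LBR, rounded to four decimal places. The sketch below reproduces the column from the printed values; the formula is inferred from the output above and is an assumption, not a statement about the package internals. It also shows why leukemia is reported as NaN: both the LAR and the LBR are zero, and 0/0 is NaN in R.

```r
# Mean LAR and LBR copied from the summary printed for ID01.
lar_mean <- c(thyroid = 0.9709, oesophagus = 0.1824, rectum = 0.2385,
              kidney = 0.3064, leukemia = 0, solid = 1.6981, total = 1.6981)
lbr <- c(thyroid = 1771.543, oesophagus = 1048.947, rectum = 2893.126,
         kidney = 1338.149, leukemia = 0, solid = 7051.764, total = 7051.764)

# Assumed relationship: LFR = LAR / LBR, shown with four decimal places.
lfr_check <- round(lar_mean / lbr, 4)
lfr_check

# LFR column as printed by summary(ex_organ1).
lfr_printed <- c(thyroid = 5e-04, oesophagus = 2e-04, rectum = 1e-04,
                 kidney = 2e-04, leukemia = NaN, solid = 2e-04, total = 2e-04)
```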



Next, consider the risks for the female and male groups in organ.

ex_organ2 <- LAR_group(organ, pid=organ$ID, group=organ$sex,
                       basedata=list(life2018, incid2018), current=2021)

summary(ex_organ2)
#> summaries of LAR result : Group female 
#> 
#> Group Information: 
#>     sex count    birth
#>  female   166 1976.313
#> 
#> LAR: 
#>               Lower    Mean   Upper       LBR     LFR
#> colon        0.6832  1.1392  1.7064 1080.1190  0.0011
#> lung         2.3739  2.8990  3.5941 1353.2750  0.0021
#> uterus       0.1447  0.2874  0.4696  338.1363  0.0008
#> bladder      0.9560  1.3026  1.7539  155.9481  0.0084
#> remainder    2.9905  4.5565  6.7692 1875.6871  0.0024
#> oral         0.1509  0.2311  0.3413  130.7238  0.0018
#> oesophagus   0.0203  0.0768  0.1583   18.0691  0.0043
#> rectum       0.2375  0.4315  0.6480  873.1753  0.0005
#> gallbladder -0.1736 -0.0274  0.1067  290.7825 -0.0001
#> pancreas     0.0661  0.1089  0.1608  283.0072  0.0004
#> leukemia     0.0792  0.1800  0.4092   81.1517  0.0022
#> solid        9.3445 11.0056 13.3345 6398.9234  0.0017
#> total        9.5265 11.1856 13.5145 6480.0751  0.0017
#> 
#> Future LAR: 
#>               Lower    Mean   Upper       BFR       TFR
#> colon        0.6658  1.0964  1.6314 1051.6541 1052.7505
#> lung         2.3673  2.8906  3.5837 1352.5231 1355.4137
#> uterus       0.1423  0.2825  0.4686  296.3270  296.6095
#> bladder      0.9503  1.2916  1.7359  154.4180  155.7096
#> remainder    2.8942  4.4153  6.5409 1792.3971 1796.8124
#> oral         0.1444  0.2181  0.3177  119.1196  119.3377
#> oesophagus   0.0201  0.0762  0.1568   18.0923   18.1685
#> rectum       0.2346  0.4186  0.6327  835.6204  836.0390
#> gallbladder -0.1735 -0.0274  0.1066  292.4045  292.3771
#> pancreas     0.0653  0.1079  0.1591  281.1836  281.2915
#> leukemia     0.0944  0.1773  0.3603   77.5662   77.7435
#> solid        9.1382 10.7697 12.9982 6193.7399 6204.5095
#> total        9.3172 10.9470 13.1770 6271.3060 6282.2530
#> 
#> Confidence Level: 0.9
#> Current Year: 2021
#> ---
#> 
#> summaries of LAR result : Group male 
#> 
#> Group Information: 
#>   sex count    birth
#>  male   805 1966.561
#> 
#> LAR: 
#>               Lower    Mean   Upper        LBR     LFR
#> stomach      3.3791  3.8814  4.5159  2861.1179  0.0014
#> colon        2.9728  3.5787  4.2951   891.9982  0.0040
#> liver        0.7400  0.9784  1.2916   328.3202  0.0030
#> lung         1.7903  2.1896  2.7012  1379.3554  0.0016
#> prostate    -0.1064  0.8221  1.7976  1553.0173  0.0005
#> bladder      0.9763  1.2158  1.5097   530.2854  0.0023
#> brain/cns    0.1014  0.1315  0.1718    75.0266  0.0018
#> thyroid      1.1497  1.6947  2.4809   508.2328  0.0033
#> remainder    7.3442  8.7682 10.4821  1396.0668  0.0063
#> oral         0.4086  0.5480  0.7018   355.2983  0.0015
#> oesophagus   0.4106  0.4973  0.5973   299.5552  0.0017
#> rectum       0.3106  0.4829  0.6906   619.8016  0.0008
#> gallbladder -0.2683 -0.0629  0.1323   281.3131 -0.0002
#> pancreas     0.4762  0.6373  0.8672   262.5764  0.0024
#> kidney       0.6250  0.8268  1.0208   477.9452  0.0017
#> leukemia     0.4202  0.9776  2.2777    98.4367  0.0099
#> solid       23.8700 26.1898 28.7939 11819.9106  0.0022
#> total       24.8542 27.1674 29.7754 11918.3472  0.0023
#> 
#> Future LAR: 
#>               Lower    Mean   Upper        BFR        TFR
#> stomach      2.9529  3.3947  3.9262  2755.3275  2758.7222
#> colon        2.7211  3.2932  3.9434   870.3818   873.6751
#> liver        0.6131  0.8033  1.0509   274.2192   275.0226
#> lung         1.7390  2.1254  2.6232  1389.8859  1392.0113
#> prostate    -0.1170  0.8140  1.7755  1582.8508  1583.6647
#> bladder      0.9560  1.1886  1.4847   533.9888   535.1774
#> brain/cns    0.0800  0.1066  0.1424    64.5289    64.6355
#> thyroid      0.6714  0.9920  1.3906   331.6146   332.6066
#> remainder    6.5417  7.7960  9.4301  1277.7412  1285.5372
#> oral         0.3365  0.4629  0.6085   326.9197   327.3826
#> oesophagus   0.3778  0.4561  0.5447   296.6060   297.0622
#> rectum       0.2574  0.4175  0.6129   582.9204   583.3379
#> gallbladder -0.2587 -0.0614  0.1289   285.0148   284.9534
#> pancreas     0.4322  0.5781  0.7887   256.1738   256.7519
#> kidney       0.4992  0.6462  0.7967   409.5781   410.2243
#> leukemia     0.3114  0.7470  1.7946    83.8317    84.5786
#> solid       20.8728 23.0132 25.3536 11237.7516 11260.7648
#> total       21.6190 23.7602 26.1058 11321.5832 11345.3434
#> 
#> Confidence Level: 0.9
#> Current Year: 2021
#> ---

According to the result, the estimated average lifetime risk of the female group is 11.1856 (9.5265, 13.5145). Similarly, the estimated average lifetime risk of the male group is 27.1674 (24.8542, 29.7754).


We can also group by several variables at once. For example, suppose we want the average risks for each combination of sex and occup, such as females whose occup is 1:

ex_organ3 <- LAR_group(organ, pid=organ$ID, group=list(organ$sex, organ$occup),
                       basedata=list(life2018, incid2018), current=2021)

print(ex_organ3, max.id=3)
#> LAR result of female.1 
#> 
#> LAR: 
#>  Lower   Mean  Upper 
#> 4.7547 6.2051 8.0758 
#> 
#> Future LAR: 
#>           Lower      Mean     Upper
#> F.LAR    4.7348    6.1773    8.0388
#> BFR   3746.8142 3746.8142 3746.8142
#> TFR   3751.5490 3752.9915 3754.8530
#> ---
#> 
#> LAR result of male.1 
#> 
#> LAR: 
#>   Lower    Mean   Upper 
#> 37.9314 41.3941 45.0670 
#> 
#> Future LAR: 
#>            Lower       Mean      Upper
#> F.LAR    32.3847    35.4936    38.7843
#> BFR   12300.1490 12300.1490 12300.1490
#> TFR   12332.5337 12335.6426 12338.9333
#> ---
#> 
#> LAR result of female.4 
#> 
#> LAR: 
#>  Lower   Mean  Upper 
#> 0.1310 0.4694 0.8903 
#> 
#> Future LAR: 
#>          Lower     Mean    Upper
#> F.LAR   0.1299   0.4655   0.8830
#> BFR   108.5540 108.5540 108.5540
#> TFR   108.6839 109.0195 109.4369
#> ---
#> 
#> The results for 4 groups are omitted.
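The group labels above (female.1, male.1, female.4) and the number of omitted groups match what base R's `interaction()` gives for the sex and occup values listed in the appendix table of people in organ. The sketch below rebuilds that table and counts the combinations; whether LAR_group builds its labels this way internally is an assumption.

```r
# sex and occup for ID01..ID20, copied from the table of people in organ.
people <- data.frame(
  ID    = sprintf("ID%02d", 1:20),
  sex   = c("male", "male", "male", "male", "male",
            "male", "female", "female", "male", "female",
            "male", "male", "female", "male", "male",
            "female", "male", "female", "male", "male"),
  occup = c("1", "1", "6", "1", "6", "6", "1", "1", "1", "4",
            "6", "1", "5", "1", "1", "6", "6", "5", "1", "5"),
  stringsAsFactors = FALSE
)

# Cross the two grouping variables and drop combinations that never occur.
groups <- interaction(people$sex, people$occup, drop = TRUE)

levels(groups)   # one label per observed sex/occup combination
table(groups)    # number of people in each group
nlevels(groups)  # total number of groups
```

With `max.id=3`, the first three groups are printed and the remaining ones are summarised in the final line of the output.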



APPENDIX: Datasets in LARisk

The LARisk package includes two toy example datasets, nuclear and organ. These datasets were simulated under two scenarios: in one, all people were exposed to radiation at the same time; in the other, each person was exposed to radiation over a long period of time. Each dataset has 11 variables, including the 9 variables required to calculate the LAR.

nuclear: a simulated dataset assuming radioactive explosion

nuclear was simulated under the scenario in which everyone is exposed to radiation at the same time. The data include 20 people, all of whom were exposed in 2011. Ages at exposure range from 3 to 81 years, and there are 10 males and 10 females. All values of exposure_rate are acute and all values of dosedist are fixedvalue.

str(nuclear)
#> 'data.frame':    100 obs. of  11 variables:
#>  $ ID           : chr  "ID01" "ID01" "ID01" "ID01" ...
#>  $ sex          : chr  "female" "female" "female" "female" ...
#>  $ birth        : int  1973 1973 1973 1973 1973 1973 1973 1981 1981 1981 ...
#>  $ exposure     : num  2011 2011 2011 2011 2011 ...
#>  $ site         : chr  "ovary" "oesophagus" "bladder" "lung" ...
#>  $ exposure_rate: chr  "acute" "acute" "acute" "acute" ...
#>  $ dosedist     : chr  "fixedvalue" "fixedvalue" "fixedvalue" "fixedvalue" ...
#>  $ dose1        : num  50.1 50.4 52.5 55.7 51.6 ...
#>  $ dose2        : logi  NA NA NA NA NA NA ...
#>  $ dose3        : logi  NA NA NA NA NA NA ...
#>  $ distance     : chr  "1" "1" "1" "1" ...

ID is the variable used to identify each individual. The values of sex, birth, and site were generated completely at random. The exposure dose (dose1) was generated from a log-normal distribution, and a variable called distance was created by dividing it into three groups.


organ: a simulated dataset assuming the workers at interventional radiology departments

Unlike nuclear, organ assumes that people were exposed to radiation on several occasions. There are 20 people in this dataset, 14 of whom are male and 6 of whom are female. The data also include job information for each person (occup).

People in the organ dataset

ID    sex     birth  occup    ID    sex     birth  occup
ID01  male    1985   1        ID11  male    1965   6
ID02  male    1960   1        ID12  male    1976   1
ID03  male    1979   6        ID13  female  1986   5
ID04  male    1982   1        ID14  male    1983   1
ID05  male    1981   6        ID15  male    1980   1
ID06  male    1966   6        ID16  female  1980   6
ID07  female  1980   1        ID17  male    1982   6
ID08  female  1980   1        ID18  female  1968   5
ID09  male    1992   1        ID19  male    1965   1
ID10  female  1984   4        ID20  male    1983   5

str(organ)
#> 'data.frame':    971 obs. of  11 variables:
#>  $ ID           : chr  "ID01" "ID01" "ID01" "ID01" ...
#>  $ sex          : chr  "male" "male" "male" "male" ...
#>  $ birth        : num  1985 1985 1985 1985 1985 ...
#>  $ exposure     : num  2011 2011 2011 2011 2013 ...
#>  $ site         : chr  "oesophagus" "kidney" "rectum" "thyroid" ...
#>  $ exposure_rate: chr  "chronic" "chronic" "chronic" "chronic" ...
#>  $ dosedist     : chr  "fixedvalue" "fixedvalue" "fixedvalue" "fixedvalue" ...
#>  $ dose1        : num  0.00195 0.00386 0.00386 0.0051 0.08936 ...
#>  $ dose2        : num  NA NA NA NA NA NA NA NA NA NA ...
#>  $ dose3        : num  NA NA NA NA NA NA NA NA NA NA ...
#>  $ occup        : chr  "1" "1" "1" "1" ...

All values of exposure_rate are chronic and all values of dosedist are fixedvalue. Birth years range from 1960 to 1992, and ages at exposure range from 23 to 60 years.
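The age at exposure can be derived from the exposure and birth columns. The sketch below does this for the first six rows of organ shown earlier by head(organ); taking the age as the plain difference of calendar years is an assumption made for illustration.

```r
# birth and exposure copied from the first six rows of organ (person ID01).
first_rows <- data.frame(
  ID       = rep("ID01", 6),
  birth    = rep(1985, 6),
  exposure = c(2011, 2011, 2011, 2011, 2013, 2013)
)

# Age at exposure as a difference of calendar years (assumed convention).
first_rows$exposed_age <- first_rows$exposure - first_rows$birth
first_rows

# Range of ages at exposure across these rows.
range(first_rows$exposed_age)
```

Applied to the full dataset, the same subtraction on organ$exposure and organ$birth would give the age at each exposure event.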

The values of sex, birth, site, and occup were randomly selected, and exposure was generated to fall before 2021 (that is, the data assume that the current year is 2021). The exposure dose (dose1) was generated from a Gaussian mixture distribution that mimics data on workers at interventional radiology departments in Korea (Lee et al., 2021).



Reference

  1. De Gonzalez, A. B., et al. (2012). RadRAT: a radiation risk assessment tool for lifetime cancer risk projection. Journal of Radiological Protection, 32(3), 205.

  2. Lee, W. J., Bang, Y. J., Cha, E. S., Kim, Y. M., & Cho, S. B. (2021). Lifetime cancer risks from occupational radiation exposure among workers at interventional radiology departments. International Archives of Occupational and Environmental Health, 94(1), 139-145.