Title: | Processing of Accelerometry Data with 'GGIR' in mMARCH |
---|---|
Description: | Mobile Motor Activity Research Consortium for Health (mMARCH) is a collaborative network of studies of clinical and community samples that employ common clinical, biological, and digital mobile measures across involved studies. One of the main scientific goals of mMARCH sites is developing a better understanding of the inter-relationships between accelerometry-measured physical activity (PA), sleep (SL), and circadian rhythmicity (CR) and mental and physical health in children, adolescents, and adults. Currently, there is no consensus on a standard procedure for a data processing pipeline of raw accelerometry data, and there are few open-source tools to facilitate their development. The R package 'GGIR' is the most prominent open-source software package that offers great functionality and tremendous user flexibility to process raw accelerometry data. However, even with 'GGIR', processing done in a harmonized and reproducible fashion requires a non-trivial amount of expertise combined with a careful implementation. In addition, novel accelerometry-derived features of PA/SL/CR capturing multiscale, time-series, functional, distributional and other complementary aspects of accelerometry data are constantly being proposed and becoming available via non-GGIR R implementations. To address these issues, mMARCH developed a streamlined, harmonized and reproducible pipeline for loading and cleaning raw accelerometry data, extracting features available through 'GGIR' as well as through non-GGIR R packages, implementing several data and feature quality checks, merging all features of PA/SL/CR together, and performing multiple analyses including Joint and Individual Variation Explained (JIVE), an unsupervised machine learning dimension reduction technique that identifies latent factors capturing variation that is joint across, and individual to, each of the three domains of PA/SL/CR.
In detail, the pipeline generates all necessary R/Rmd/shell files for data processing after running 'GGIR' for accelerometer data. In module 1, all csv files in the 'GGIR' output directory were read, transformed and then merged. In module 2, the 'GGIR' output files were checked and summarized in one Excel sheet. In module 3, the merged data was cleaned according to the number of valid hours on each night and the number of valid days for each subject. In module 4, the cleaned activity data was imputed by the average Euclidean norm minus one (ENMO) over all the valid days for each subject. Finally, a comprehensive report of data processing was created using R Markdown, and the report includes a few exploratory plots and multiple commonly used features extracted from minute-level actigraphy data. Reference: Guo W, Leroux A, Shou S, Cui L, Kang S, Strippoli MP, Preisig M, Zipunnikov V, Merikangas K (2022). Processing of accelerometry data with GGIR in Motor Activity Research Consortium for Health (mMARCH). Journal for the Measurement of Physical Behaviour, 6(1): 37-44. |
Authors: | Wei Guo [aut, cre], Andrew Leroux [aut], Vadim Zipunnikov [aut], Kathleen Merikangas [aut] |
Maintainer: | Wei Guo <[email protected]> |
License: | GPL-3 |
Version: | 2.9.4.0 |
Built: | 2024-12-22 06:53:26 UTC |
Source: | CRAN |
A parametric approach to study circadian rhythmicity assuming cosinor shape. This function is a whole dataset wrapper for ActCosinor.
ActCosinor_long2(count.data, window = 1, daylevel = FALSE)
count.data |
|
window |
|
daylevel |
|
A data.frame with the following columns
ID |
ID |
ndays |
number of days |
mes |
MESOR, short for midline estimating statistic of rhythm, which is a rhythm-adjusted mean. This represents the mean activity level. |
amp |
amplitude, a measure of half the extent of predictable variation within a cycle. This represents the highest activity one can achieve. |
acro |
acrophase, a measure of the time of the overall high values recurring in each cycle. Here it has a unit of radian. This represents the time to reach the peak. |
acrotime |
acrophase in the unit of the time (hours) |
ndays |
Number of days modeled |
A parametric approach to study circadian rhythmicity assuming cosinor shape.
ActCosinor2(x, window = 1, n1440 = 1440)
x |
|
window |
The calculation needs the window size of the data. E.g., window = 1 means each epoch is a one-minute window. |
n1440 |
the number of points of a day. Default is 1440 for the minute-level data. |
A list with elements
mes |
MESOR, short for midline estimating statistic of rhythm, which is a rhythm-adjusted mean. This represents the mean activity level. |
amp |
amplitude, a measure of half the extent of predictable variation within a cycle. This represents the highest activity one can achieve. |
acro |
acrophase, a measure of the time of the overall high values recurring in each cycle. Here it has a unit of radian. This represents the time to reach the peak. |
acrotime |
acrophase in the unit of the time (hours) |
ndays |
Number of days modeled |
Cornelissen, G. Cosinor-based rhythmometry. Theor Biol Med Model 11, 16 (2014). https://doi.org/10.1186/1742-4682-11-16
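The quantities returned by the cosinor functions can be illustrated with a minimal base-R sketch. It assumes the single-component model y = M + A*cos(2*pi*t/24 + phi) fitted by ordinary least squares; `cosinor_sketch` is a hypothetical helper written for this illustration and is not the package's implementation, whose internals may differ.

```r
# Hypothetical helper (not part of mMARCH.AC): least-squares cosinor fit.
# Assumed model: y = M + A * cos(omega * t + phi), with a 24-hour period.
cosinor_sketch <- function(x, window = 1) {
  n_per_day <- 1440 / window                      # epochs per day
  t_hours <- (seq_along(x) - 1) * window / 60     # time of each epoch in hours
  omega <- 2 * pi / 24
  # Linearised model: y = M + b2 * cos(omega t) + b3 * sin(omega t),
  # where b2 = A * cos(phi) and b3 = -A * sin(phi).
  fit <- lm(x ~ cos(omega * t_hours) + sin(omega * t_hours))
  b <- unname(coef(fit))
  amp <- sqrt(b[2]^2 + b[3]^2)
  acro <- atan2(-b[3], b[2])                      # acrophase in radians
  list(mes = b[1],
       amp = amp,
       acro = acro,
       acrotime = (-acro / omega) %% 24,          # peak time in hours
       ndays = length(x) / n_per_day)
}

# Two simulated days of minute-level data peaking at 15:00.
t_sim <- (0:(2 * 1440 - 1)) / 60
x_sim <- 10 + 3 * cos(2 * pi / 24 * (t_sim - 15))
res <- cosinor_sketch(x_sim)
```

With noise-free input the fit recovers the simulated MESOR, amplitude, and peak time; real activity data will of course fit less cleanly.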
Extended cosinor model based on sigmoidally transformed cosine curve using anti-logistic transformation. This function is a whole dataset wrapper for ActExtendCosinor.
ActExtendCosinor_long2( count.data, window = 1, lower = c(0, 0, -1, 0, -3), upper = c(Inf, Inf, 1, Inf, 27), daylevel = FALSE )
count.data |
|
window |
|
lower |
|
upper |
|
daylevel |
|
A data.frame with the following columns
ID |
ID |
ndays |
number of days |
minimum |
Minimum value of the function. |
amp |
amplitude, a measure of half the extent of predictable variation within a cycle. This represents the highest activity one can achieve. |
alpha |
It determines whether the peaks of the curve are wider than the troughs: when alpha is small, the troughs are narrow and the peaks are wide; when alpha is large, the troughs are wide and the peaks are narrow. |
beta |
It determines whether the transformed function rises and falls more steeply than the cosine curve: large values of beta produce curves that are nearly square waves. |
acrotime |
acrophase is the time of day of the peak in the unit of the time (hours) |
F_pseudo |
Measure the improvement of the fit obtained by the non-linear estimation of the transformed cosine model |
UpMesor |
Time of day of switch from low to high activity. Represents the timing of the rest-activity rhythm. Lower (earlier) values indicate increase in activity earlier in the day and suggest a more advanced circadian phase. |
DownMesor |
Time of day of switch from high to low activity. Represents the timing of the rest-activity rhythm. Lower (earlier) values indicate decline in activity earlier in the day, suggesting a more advanced circadian phase. |
MESOR |
A measure analogous to the MESOR of the cosine model (or half the deflection of the curve) can be obtained from mes=min+amp/2. However, it goes through the middle of the peak, and is therefore not equal to the MESOR of the cosine model, which is the mean of the data. |
Extended cosinor model based on sigmoidally transformed cosine curve using anti-logistic transformation
ActExtendCosinor2( x, window = 1, lower = c(0, 0, -1, 0, -3), upper = c(Inf, Inf, 1, Inf, 27), n1440 = 1440 )
x |
|
window |
The calculation needs the window size of the data. E.g., window = 1 means each epoch is a one-minute window. |
lower |
A numeric vector of lower bounds on each of the five parameters (in the order of minimum, amplitude, alpha, beta, acrophase) for the NLS. If not given, the default lower bound for each parameter is set to |
upper |
A numeric vector of upper bounds on each of the five parameters (in the order of minimum, amplitude, alpha, beta, acrophase) for the NLS. If not given, the default upper bound for each parameter is set to |
n1440 |
the number of points of a day. Default is 1440 for the minute-level data. |
A list with elements
minimum |
Minimum value of the function. |
amp |
amplitude, a measure of half the extent of predictable variation within a cycle. This represents the highest activity one can achieve. |
alpha |
It determines whether the peaks of the curve are wider than the troughs: when alpha is small, the troughs are narrow and the peaks are wide; when alpha is large, the troughs are wide and the peaks are narrow. |
beta |
It determines whether the transformed function rises and falls more steeply than the cosine curve: large values of beta produce curves that are nearly square waves. |
acrotime |
acrophase is the time of day of the peak in the unit of the time (hours) |
F_pseudo |
Measure the improvement of the fit obtained by the non-linear estimation of the transformed cosine model |
UpMesor |
Time of day of switch from low to high activity. Represents the timing of the rest-activity rhythm. Lower (earlier) values indicate increase in activity earlier in the day and suggest a more advanced circadian phase. |
DownMesor |
Time of day of switch from high to low activity. Represents the timing of the rest-activity rhythm. Lower (earlier) values indicate decline in activity earlier in the day, suggesting a more advanced circadian phase. |
MESOR |
A measure analogous to the MESOR of the cosine model (or half the deflection of the curve) can be obtained from mes=min+amp/2. However, it goes through the middle of the peak, and is therefore not equal to the MESOR of the cosine model, which is the mean of the data. |
ndays |
Number of days modeled. |
Marler MR, Gehrman P, Martin JL, Ancoli-Israel S. The sigmoidally transformed cosine curve: a mathematical model for circadian rhythms with symmetric non-sinusoidal shapes. Stat Med.
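To make the roles of the five parameters concrete, the sketch below evaluates a sigmoidally transformed cosine curve and derives UpMesor, DownMesor, and the MESOR analogue from it. The functional form (anti-logistic transform of a shifted cosine) is an assumption based on the description above and on Marler et al.; `ext_cosinor_curve` and the parameter values are hypothetical and not taken from the package.

```r
# Hypothetical helper (not part of mMARCH.AC): assumed curve
#   r(t) = minimum + amp * expit(beta * (cos((t - acrotime) * 2*pi/24) - alpha))
ext_cosinor_curve <- function(t_hours, minimum, amp, alpha, beta, acrotime) {
  expit <- function(z) 1 / (1 + exp(-z))          # anti-logistic transform
  minimum + amp * expit(beta * (cos((t_hours - acrotime) * 2 * pi / 24) - alpha))
}

# Illustrative parameter values only.
par_ex <- list(minimum = 2, amp = 8, alpha = 0.2, beta = 5, acrotime = 14)

# The curve crosses its half-height where the cosine term equals alpha.
half_width <- acos(par_ex$alpha) / (2 * pi / 24)
up_mesor   <- par_ex$acrotime - half_width        # switch from low to high activity
down_mesor <- par_ex$acrotime + half_width        # switch from high to low activity
mesor      <- par_ex$minimum + par_ex$amp / 2     # mes = min + amp / 2

value_at_up <- do.call(ext_cosinor_curve, c(list(t_hours = up_mesor), par_ex))
value_at_down <- do.call(ext_cosinor_curve, c(list(t_hours = down_mesor), par_ex))
```

Under this assumed form, the curve value at both UpMesor and DownMesor equals min + amp/2, which is why that quantity is described as passing through the middle of the peak.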
Bin minute level data into different time resolutions
bin_data2(x = x, window = 1, method = c("average", "sum"))
x |
|
window |
window size used to bin the original 1440-dimensional data. The window size should be an integer factor of 1440 |
method |
|
a vector of binned data
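Binning can be sketched in a few lines of base R. `bin_sketch` is a hypothetical stand-in written for illustration; it assumes consecutive, non-overlapping windows and does not reproduce any edge-case handling the package may have.

```r
# Hypothetical helper (not part of mMARCH.AC): bin a vector into
# consecutive windows of `window` epochs, by mean or by sum.
bin_sketch <- function(x, window = 1, method = c("average", "sum")) {
  method <- match.arg(method)
  stopifnot(length(x) %% window == 0)   # window must divide the data length
  m <- matrix(x, nrow = window)         # each column holds one window
  if (method == "average") colMeans(m) else colSums(m)
}

minute_data <- 1:1440
hourly_avg <- bin_sketch(minute_data, window = 60, method = "average")
hourly_sum <- bin_sketch(minute_data, window = 60, method = "sum")
```

Here 1440 minute-level values become 24 hourly values.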
Create a template shell script of mMARCH.AC, named as STUDYNAME_part0.maincall.R.
create.shell()
The function will create a template shell script of mMARCH.AC in the current directory, named STUDYNAME_part0.maincall.R
Data imputation for the merged ENMO data with annotation. The missing values were imputed by the average ENMO over all the valid days for each subject.
data.imputation(workdir, csvInput = NULL)
workdir |
|
csvInput |
|
Files were written to the specified sub-directory, named impu.flag_All_studyname_ENMO.data.Xs.csv, where Xs is the epoch size (in seconds) to which acceleration was averaged in the GGIR output. This file includes the following columns:
filename |
accelerometer file name |
Date |
date recorded from the GGIR part2.summary file |
id |
IDs recorded from the GGIR part2.summary file |
calender_date |
date in the format of yyyy-mm-dd |
N.valid.hours |
number of hours with valid data recorded from the part2_daysummary.csv file in the GGIR output |
N.hours |
number of hours of measurement recorded from the part2_daysummary.csv file in the GGIR output |
weekday |
day of the week |
measurementday |
day of measurement, i.e. the day number relative to the start of the measurement |
newID |
new IDs defined by the user-defined function filename2id(), e.g. substrings of the filename |
Nmiss_c9_c31 |
number of NAs from the 9th to the 31st column in the part2_daysummary.csv file in the GGIR output |
missing |
"M" indicates missing for an invalid day, and "C" indicates completeness for a valid day |
Ndays |
number of days of measurement |
ith_day |
rank of the measurementday, for example, the value is 1,2,3,4,-3,-2,-1 for measurementday = 1,...,7 |
Nmiss |
number of missing (invalid) days |
Nnonmiss |
number of non-missing (valid) days |
misspattern |
indicators of missing/nonmissing for all measurement days at the subject level |
RowNonWear |
number of columns in the non-wearing matrix |
NonWearMin |
number of minutes of non-wearing |
daysleeper |
If 0, the person is a nightsleeper (sleep period did not overlap with noon); if 1, the person is a daysleeper (sleep period did overlap with noon). |
remove16h7day |
indicator of a key quality control output. If remove16h7day=1, the day needs to be removed. If remove16h7day=0, the day needs to be kept. |
duplicate |
If duplicate="remove", the accelerometer files will not be used in the data analysis of module5. |
ImpuMiss.b |
number of missing values on the ENMO data before imputation |
ImpuMiss.a |
number of missing values on the ENMO data after imputation |
KEEP |
The value is "keep"/"remove", e.g. KEEP="remove" if remove16h7day=1 or duplicate="remove" or ImpuMiss.a>0 |
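The imputation rule can be sketched as follows. This is a minimal illustration that assumes the average is taken per time point across a subject's valid days; the source does not spell out that detail, and `impute_sketch` is a hypothetical helper rather than the package's code.

```r
# Hypothetical helper (not part of mMARCH.AC).
# enmo:      days x timepoints matrix for ONE subject (NA = missing)
# valid_day: logical vector, TRUE for rows that passed quality control
impute_sketch <- function(enmo, valid_day) {
  # Assumption: average each time point over the subject's valid days.
  time_mean <- colMeans(enmo[valid_day, , drop = FALSE], na.rm = TRUE)
  miss <- which(is.na(enmo), arr.ind = TRUE)     # (row, col) of each NA
  enmo[miss] <- time_mean[miss[, "col"]]
  enmo
}

enmo_ex <- rbind(c(1, 2, 3),
                 c(3, NA, 5),
                 c(NA, 4, 7))
imputed <- impute_sketch(enmo_ex, valid_day = c(TRUE, TRUE, TRUE))
n_missing_before <- sum(is.na(enmo_ex))   # analogous to ImpuMiss.b
n_missing_after  <- sum(is.na(imputed))   # analogous to ImpuMiss.a
```

If a time point is missing on every valid day, the mean is undefined and the value stays missing, which is one way a non-zero count after imputation could arise.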
Annotating the merged ENMO/ANGLEZ data by adding some descriptive variables such as number of valid days and missing pattern.
DataShrink( studyname, outputdir, workdir, QCdays.alpha = 7, QChours.alpha = 16, summaryFN = "../summary/part24daysummary.info.csv", epochIn = 5, epochOut = 60, useIDs.FN = NULL, RemoveDaySleeper = FALSE, trace = FALSE )
studyname |
|
outputdir |
|
workdir |
|
QCdays.alpha |
|
QChours.alpha |
|
summaryFN |
|
epochIn |
|
epochOut |
|
useIDs.FN |
|
RemoveDaySleeper |
|
trace |
|
Files were written to the specified sub-directory, named flag_ALL_studyname_ENMO.data.Xs.csv and flag_ALL_studyname_ANGLEZ.data.Xs.csv, where Xs is the epoch size (in seconds) to which acceleration was averaged in the GGIR output. Each file includes the following columns:
filename |
accelerometer file name |
Date |
date recorded from the GGIR part2.summary file |
id |
IDs recorded from the GGIR part2.summary file |
calender_date |
date in the format of yyyy-mm-dd |
N.valid.hours |
number of hours with valid data recorded from the part2_daysummary.csv file in the GGIR output |
N.hours |
number of hours of measurement recorded from the part2_daysummary.csv file in the GGIR output |
weekday |
day of the week |
measurementday |
day of measurement, i.e. the day number relative to the start of the measurement |
newID |
new IDs defined by the user-defined function filename2id(), e.g. substrings of the filename |
Nmiss_c9_c31 |
number of NAs from the 9th to the 31st column in the part2_daysummary.csv file in the GGIR output |
missing |
"M" indicates missing for an invalid day, and "C" indicates completeness for a valid day |
Ndays |
number of days of measurement |
ith_day |
rank of the measurementday, for example, the value is 1,2,3,4,-3,-2,-1 for measurementday = 1,...,7 |
Nmissday |
number of missing (invalid) days |
Nnonmiss |
number of non-missing (valid) days |
misspattern |
indicators of missing/nonmissing for all measurement days at the subject level |
RowNonWear |
number of columns in the non-wearing matrix |
NonWearMin |
number of minutes of non-wearing |
Nvalid.day |
number of valid days with/without removing daysleeper nights; It is equal to Nnonmiss when RemoveDaySleeper=FALSE. |
daysleeper |
If 0, the person is a nightsleeper (sleep period did not overlap with noon); if 1, the person is a daysleeper (sleep period did overlap with noon) on that night. This is a night-level variable. |
remove16h7day |
indicator of a key quality control output. If remove16h7day=1, the day needs to be removed. If remove16h7day=0, the day needs to be kept. |
duplicate |
If duplicate="remove", the accelerometer files will not be used in the data analysis of module5-7. |
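The quality-control flag can be illustrated with a small base-R sketch. The exact rule is an assumption inferred from the argument names: a day is treated as valid when N.valid.hours is at least QChours.alpha, and a subject is kept when the number of valid days is at least QCdays.alpha. `qc_flag_sketch` is hypothetical, and the package may apply additional criteria (for example daysleeper removal).

```r
# Hypothetical helper (not part of mMARCH.AC); the rule is an assumption.
qc_flag_sketch <- function(d, QChours.alpha = 16, QCdays.alpha = 7) {
  valid_day <- d$N.valid.hours >= QChours.alpha
  # Number of valid days per subject, repeated on every row of that subject.
  n_valid <- ave(as.numeric(valid_day), d$newID, FUN = sum)
  d$Nvalid.day <- n_valid
  # Remove a day if it is invalid or if its subject has too few valid days.
  d$remove16h7day <- as.integer(!valid_day | n_valid < QCdays.alpha)
  d
}

days <- data.frame(
  newID = c(rep("A", 8), rep("B", 3)),
  N.valid.hours = c(24, 24, 20, 16, 18, 24, 17, 5, 24, 10, 24)
)
flagged <- qc_flag_sketch(days)
```

In this toy input, subject A keeps seven days and loses one short day, while subject B is dropped entirely because only two days are valid.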
Fragmentation methods to study the transition between two states, e.g. sedentary vs. active. This function is a whole dataset wrapper for fragmentation.
fragmentation_long2( count.data, weartime, thresh, bout.length = 1, metrics = c("mean_bout", "TP", "Gini", "power", "hazard", "all"), by = c("day", "subject") )
count.data |
|
weartime |
|
thresh |
threshold to define the two states. |
bout.length |
minimum duration of defining an active bout; defaults to 1. |
metrics |
Which fragmentation metrics to extract. Can be "mean_bout", "TP", "Gini", "power", "hazard", or "all" for all of the above metrics. |
by |
Determines whether fragmentation is calculated by day or by subject (i.e. aggregating bouts across days). Calculation by subject is recommended to gain more power. |
Metrics include mean_bout (mean bout duration), TP (between-states transition probability), Gini (Gini index), power (alpha parameter of the power law distribution), and hazard (average hazard function).
A dataframe with some of the following columns
ID |
identifier of the person |
Day |
|
mean_r |
mean sedentary bout duration |
mean_a |
mean active bout duration |
SATP |
sedentary to active transition probability |
ASTP |
active to sedentary transition probability |
Gini_r |
Gini index for sedentary bout |
Gini_a |
Gini index for active bout |
h_r |
hazard function for sedentary bout |
h_a |
hazard function for active bout |
alpha_r |
power law parameter for sedentary bout |
alpha_a |
power law parameter for active bout |
Fragmentation methods to study the transition between two states, e.g. sedentary v.s. active.
fragmentation2( x, w, thresh, bout.length = 1, metrics = c("mean_bout", "TP", "Gini", "power", "hazard", "all") )
x |
|
w |
|
thresh |
threshold to binarize the data. |
bout.length |
minimum duration of defining an active bout; defaults to 1. |
metrics |
Which fragmentation metrics to extract. Can be "mean_bout", "TP", "Gini", "power", "hazard", or "all" for all of the above metrics. |
Metrics include mean_bout (mean bout duration), TP (between-states transition probability), Gini (Gini index), power (alpha parameter of the power law distribution), and hazard (average hazard function).
A list with elements
mean_r |
mean sedentary bout duration |
mean_a |
mean active bout duration |
SATP |
sedentary to active transition probability |
ASTP |
active to sedentary transition probability |
Gini_r |
Gini index for sedentary bout |
Gini_a |
Gini index for active bout |
h_r |
hazard function for sedentary bout |
h_a |
hazard function for active bout |
alpha_r |
power law parameter for sedentary bout |
alpha_a |
power law parameter for active bout |
Junrui Di, Andrew Leroux, Jacek Urbanek, Ravi Varadhan, Adam P. Spira, Jennifer Schrack, Vadim Zipunnikov. Patterns of sedentary and active time accumulation are associated with mortality in US adults: The NHANES study. bioRxiv 182337; doi: https://doi.org/10.1101/182337
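Two of the fragmentation metrics, mean bout duration and transition probability, can be sketched with run-length encoding. The sketch assumes that the transition probability is the reciprocal of the mean bout duration, as in Di et al., and it simply concatenates wear-time epochs; `frag_sketch` is hypothetical and ignores bout.length and the other metrics.

```r
# Hypothetical helper (not part of mMARCH.AC).
# x: activity values, w: wear flag (assumed 1 = wear), thresh: state cut-off
frag_sketch <- function(x, w, thresh) {
  # "a" = active (at or above threshold), "r" = rest/sedentary (below it)
  state <- ifelse(x[w == 1] >= thresh, "a", "r")
  runs <- rle(state)                              # consecutive bouts
  mean_r <- mean(runs$lengths[runs$values == "r"])
  mean_a <- mean(runs$lengths[runs$values == "a"])
  list(mean_r = mean_r, mean_a = mean_a,
       SATP = 1 / mean_r,                         # sedentary -> active
       ASTP = 1 / mean_a)                         # active -> sedentary
}

x_ex <- c(0, 0, 0, 5, 5, 0, 5, 5, 5, 5, 0, 0)
frag <- frag_sketch(x_ex, w = rep(1, length(x_ex)), thresh = 1)
```

The toy series has sedentary bouts of length 3, 1, and 2 and active bouts of length 2 and 4, so shorter average bouts translate directly into higher transition probabilities.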
A function for calculating the average timing of variables (in this case the M10 and L5). It finds the average timing mu that minimizes sum( min( (tind_i - mu)^2, (1440 + mu - tind_i)^2 ) ).
get_mean_sd_hour(tind, unit2minute = 60, out = c("mean", "sd"))
tind |
|
unit2minute |
|
out |
|
mean and sd of the input timing
x=c(1,1,1,23,23,23)
get_mean_sd_hour(tind=x, unit2minute=60)
x=12+c(1,1,1,23,23,23)
get_mean_sd_hour(tind=x, unit2minute=60)
x=c(1:100/5, 20+4:50/200)
get_mean_sd_hour(tind=x, unit2minute=60)
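The reason an ordinary mean is unsuitable for clock times is that 01:00 and 23:00 average to noon rather than midnight. The sketch below shows one way to compute a circular mean by grid search over a 24-hour clock at one-minute resolution. It is an assumption-laden illustration: `circ_mean_sketch` is hypothetical, uses the symmetric circular distance, and does not compute the standard deviation.

```r
# Hypothetical helper (not part of mMARCH.AC): circular mean of clock times.
circ_mean_sketch <- function(tind, unit2minute = 60) {
  t_min <- (tind * unit2minute) %% 1440           # minutes since midnight
  candidates <- 0:1439                            # candidate means, in minutes
  cost <- vapply(candidates, function(mu) {
    d <- abs(t_min - mu)
    sum(pmin(d, 1440 - d)^2)                      # squared distance on the clock
  }, numeric(1))
  candidates[which.min(cost)] / unit2minute       # back to the input unit
}

around_midnight <- circ_mean_sketch(c(1, 1, 1, 23, 23, 23))
around_noon <- circ_mean_sketch(12 + c(1, 1, 1, 23, 23, 23))
```

For times clustered on either side of midnight, the circular mean falls at midnight, whereas the arithmetic mean of the same values would be 12.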
Each accelerometer file was transformed into a wide data matrix, in which the rows represent available days and the columns cover all timestamps over 24 hours. The wide data matrices were then merged together.
ggir.datatransform( outputdir, subdir, studyname, numericID = FALSE, sortByid = "newID", f0 = 1, f1 = 1e+06, epochIn = 5, epochOut = 5, DoubleHour = c("average", "earlier", "later"), mergeVar = 1 )
outputdir |
|
subdir |
|
studyname |
|
numericID |
|
sortByid |
|
f0 |
|
f1 |
|
epochIn |
|
epochOut |
|
DoubleHour |
|
mergeVar |
|
mergeVar = 1 |
Six files were written to the specified sub-directory as follows, |
nonwearscore_studyname_f0_f1_Xs.xlsx |
Data matrix of nonwearscore, where f0 and f1 are the file index to start and finish with and Xs is the epoch size to which acceleration was averaged (seconds) in GGIR output. |
clippingscore_studyname_f0_f1_Xs.xlsx |
Data matrix of clippingscore |
lightmean_studyname_f0_f1_Xs.xlsx |
Data matrix of lightmean |
lightpeak_studyname_f0_f1_Xs.xlsx |
Data matrix of lightpeak |
temperaturemean_studyname_f0_f1_Xs.xlsx |
Data matrix of temperaturemean |
EN_studyname_f0_f1_Xs.xlsx |
Data matrix of EN |
mergeVar = 2 |
Two files were written to the specified sub-directory as follows, |
studyname_ENMO.dataf0_f1_Xs.xlsx |
Data matrix of ENMO, where f0 and f1 are the file index to start and finish with and Xs is the epoch size to which acceleration was averaged (seconds) in GGIR output. |
studyname_ANGLEZ.dataf0_f1_Xs.xlsx |
Data matrix of ANGLEZ |
Description of all accelerometer files in the GGIR output. This script is executed when mode=2 in the main call.
ggir.summary( bindir = NULL, outputdir, studyname, numericID = FALSE, sortByid = "filename", subdir = "summary", part5FN = "WW_L50M125V500_T5A5", QChours.alpha = 16, filename2id = NULL, desiredtz = "US/Eastern", trace = FALSE )
bindir |
|
outputdir |
|
studyname |
|
numericID |
|
sortByid |
|
subdir |
|
part5FN |
|
QChours.alpha |
|
filename2id |
|
desiredtz |
|
trace |
|
Four files were written to the specified sub-directory
studyname_ggir_output_summary.xlsx |
This Excel file includes 10 pages as follows, |
page 1 |
List of files in the GGIR output |
page 2 |
Summary of files |
page 3 |
List of duplicate IDs |
page 4 |
ID errors |
page 5 |
Number of valid days |
page 6 |
Table of number of valid/missing days |
page 7 |
Missing pattern |
page 8 |
Frequency of the missing pattern |
page 9 |
Description of all accelerometer files |
page 10 |
Inspects accelerometer file for key information, including: monitor brand, sample frequency and file header |
studyname_ggir_output_summary_plot.pdf |
Some plots such as the number of valid days, which were included in the module5_studyname_Data_process_report.html file as well. |
part24daysummary.info.csv |
Intermediate results for description of each accelerometer file. |
studyname_samples_remove_temp.csv |
Create studyname_samples_remove.csv file by filling "remove" in the "duplicate" column in this template. If duplicate="remove", the accelerometer files will not be used in the data analysis of module 5-7. |
This function calculates interdaily stability, a nonparametric metric of circadian rhythmicity. This function is a whole dataset wrapper for IS.
IS_long2(count.data, window = 1, method = c("average", "sum"))
count.data |
|
window |
an |
method |
|
A data.frame with the following 2 columns
ID |
ID |
IS |
IS |
Junrui Di et al. Joint and individual representation of domains of physical activity, sleep, and circadian rhythmicity. Statistics in Biosciences.
This function calculates interdaily stability, a nonparametric metric of circadian rhythmicity.
IS2(x)
x |
|
IS
Junrui Di et al. Joint and individual representation of domains of physical activity, sleep, and circadian rhythmicity. Statistics in Biosciences.
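Interdaily stability compares the variance of the average daily profile with the overall variance. The sketch below uses the commonly cited definition and assumes the input is a days-by-timepoints matrix (typically hourly bins); `is_sketch` is a hypothetical helper and may differ in detail from IS2.

```r
# Hypothetical helper (not part of mMARCH.AC).
# x: matrix with one row per day and one column per time point
is_sketch <- function(x) {
  p <- ncol(x)                       # time points per day
  v <- as.vector(t(x))               # all observations in time order
  n <- length(v)
  profile <- colMeans(x)             # average daily profile
  (sum((profile - mean(v))^2) / p) / (sum((v - mean(v))^2) / n)
}

one_day <- c(rep(1, 8), rep(10, 12), rep(2, 4))   # 24 hourly values
identical_days <- rbind(one_day, one_day, one_day)
is_identical <- is_sketch(identical_days)

set.seed(1)
noisy_days <- identical_days + matrix(rnorm(72, sd = 3), nrow = 3)
is_noisy <- is_sketch(noisy_days)
```

Perfectly repeated days give a value of 1, and day-to-day variation pulls the value toward 0.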
This function calculates intradaily variability, a nonparametric metric representing fragmentation of circadian rhythmicity. This function is a whole dataset wrapper for IV.
IV_long2(count.data, window = 1, method = c("average", "sum"))
count.data |
|
window |
an |
method |
|
A data.frame with the following 3 columns
ID |
ID |
Day |
Day |
IV |
IV |
Junrui Di et al. Joint and individual representation of domains of physical activity, sleep, and circadian rhythmicity. Statistics in Biosciences.
This function calculates intradaily variability, a nonparametric metric representing fragmentation of circadian rhythmicity.
IV2(x)
x |
|
IV
Junrui Di et al. Joint and individual representation of domains of physical activity, sleep, and circadian rhythmicity. Statistics in Biosciences.
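Intradaily variability relates the size of epoch-to-epoch changes to the overall variance. The sketch below uses the commonly cited definition on a single vector; `iv_sketch` is a hypothetical helper, and any binning that IV2 applies beforehand is not reproduced.

```r
# Hypothetical helper (not part of mMARCH.AC).
# x: activity values for one day, in time order
iv_sketch <- function(x) {
  n <- length(x)
  (n * sum(diff(x)^2)) / ((n - 1) * sum((x - mean(x))^2))
}

iv_alternating <- iv_sketch(c(0, 1, 0, 1))   # switches state at every epoch
iv_one_block <- iv_sketch(c(0, 0, 1, 1))     # a single transition
```

A series that switches state at every epoch scores much higher than one with a single consolidated active block.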
A modified version of jive.predict in which SVDmiss is replaced by SVDmiss2.
jive.predict2(data.new, jive.output)
data.new |
|
jive.output |
|
See jive.predict(package:r.jive) for details.
See r.jive::jive.predict for details.
Make a sleep matrix (sleep=1 and wake=0) based on the sleep onset and wake-up times, for the purpose of calculating physical activity features during waking time.
makeSleepDataMatrix(sleepFN, epochOut = 60, impute = TRUE, outputFN)
sleepFN |
|
epochOut |
|
impute |
|
outputFN |
|
Sleep matrix and messages of sleep data.
duplicatedays |
Duplicate days of sleep data if exists |
sleepproblem |
Invalid sleep data if exists |
sleep matrix (0/1) |
write the sleep matrix to a csv file specified by outputFN |
This R script will generate all necessary R/Rmd/shell files for data processing after running GGIR for accelerometer data.
mMARCH.AC.maincall( mode, useIDs.FN = NULL, currentdir, studyname, bindir = NULL, outputdir, epochIn = 5, epochOut = 60, log.multiplier = 9250, use.cluster = TRUE, QCdays.alpha = 7, QChours.alpha = 16, QCnights.feature.alpha = c(0, 0, 0, 0), DoubleHour = c("average", "earlier", "later")[1], QC.sleepdur.avg = c(3, 12), QC.nblocks.sleep.avg = c(6, 29), Rversion = "R", filename2id = NULL, PA.threshold = c(40, 100, 400), PA.threshold2 = c(50, 100, 400), desiredtz = "US/Eastern", RemoveDaySleeper = FALSE, part5FN = "WW_L50M100V400_T5A5", NfileEachBundle = 20, holidayFN = NULL, trace = FALSE )
mode |
|
useIDs.FN |
|
currentdir |
|
studyname |
|
bindir |
|
outputdir |
|
epochIn |
|
epochOut |
|
log.multiplier |
|
use.cluster |
|
QCdays.alpha |
|
QChours.alpha |
|
QCnights.feature.alpha |
|
DoubleHour |
|
QC.sleepdur.avg |
|
QC.nblocks.sleep.avg |
|
Rversion |
|
filename2id |
|
PA.threshold |
|
PA.threshold2 |
|
desiredtz |
|
RemoveDaySleeper |
|
part5FN |
|
NfileEachBundle |
|
holidayFN |
|
trace |
|
See mMARCH.AC manual for details.
This function is a whole dataset wrapper for Time
PAfun(count.data, weartime, PA.threshold = c(50, 100, 400))
count.data |
|
weartime |
|
PA.threshold |
thresholds used to calculate the time in minutes of sedentary, light, moderate and vigorous activity. |
A dataframe with some of the following columns
ID |
identifier of the person |
Day |
indicator of which day of activity it is, can be a numeric vector of sequence 1,2,... or a string of date |
time |
time of certain state |
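How three thresholds yield four intensity categories can be shown with a short sketch. The boundary convention (a value equal to a threshold falls into the higher category) and the wear coding (1 = wear) are assumptions, and `pa_time_sketch` is a hypothetical helper rather than the package's PAfun.

```r
# Hypothetical helper (not part of mMARCH.AC).
# x: minute-level activity for one day, w: wear flag (assumed 1 = wear)
pa_time_sketch <- function(x, w, PA.threshold = c(50, 100, 400)) {
  level <- cut(x[w == 1],
               breaks = c(-Inf, PA.threshold, Inf),
               right = FALSE,   # intervals are [lower, upper)
               labels = c("sedentary", "light", "moderate", "vigorous"))
  table(level)                  # epochs (minutes) spent in each category
}

x_pa <- c(10, 60, 150, 500, 50, 400)
pa_minutes <- pa_time_sketch(x_pa, w = rep(1, length(x_pa)))
```

With minute-level input, each count is the number of minutes spent in that category during wear time.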
This R script generates a plot for each variable and writes a description to a log file.
pheno.plot( inputFN, outFN = paste("plot_", inputFN, ".pdf", sep = ""), csv = TRUE, sep = " ", start = 3, read = TRUE, logFN = NULL, track = TRUE )
inputFN |
|
outFN |
|
csv |
|
sep |
|
start |
|
read |
|
logFN |
|
track |
|
Files were written to the current directory. One is .pdf file for plots and the other is .log file for variable description.
This function calculates relative amplitude, a nonparametric metric of circadian rhythmicity. This function is a whole dataset wrapper for RA.
RA_long2( count.data, window = 1, method = c("average", "sum"), noon2noon = FALSE )
count.data |
|
window |
Since the calculation of M10 and L5 depends on the dimension of the data, the window size needs to be included as an argument. |
method |
|
noon2noon |
|
A data.frame with the following 3 columns
ID |
ID |
Day |
Day |
RA |
RA |
This function calculates relative amplitude, a nonparametric metric of circadian rhythmicity.
RA2(x, window = 1, method = c("average", "sum"), noon2noon = FALSE)
x |
|
window |
Since the calculation of M10 and L5 depends on the dimension of the data, the window size needs to be included as an argument. |
method |
|
noon2noon |
|
RA
Junrui Di et al. Joint and individual representation of domains of physical activity, sleep, and circadian rhythmicity. Statistics in Biosciences.
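Relative amplitude contrasts the most active 10 hours (M10) with the least active 5 hours (L5). The sketch below uses the usual definition RA = (M10 - L5) / (M10 + L5) with non-wrapping rolling means over a single day; `ra_sketch` is a hypothetical helper, and the noon2noon option and binning method of RA2 are not reproduced.

```r
# Hypothetical helper (not part of mMARCH.AC).
# x: one day of activity; window: epoch length in minutes
ra_sketch <- function(x, window = 1) {
  roll_mean <- function(v, k) {
    cs <- c(0, cumsum(v))
    (cs[(k + 1):length(cs)] - cs[1:(length(cs) - k)]) / k
  }
  m10 <- max(roll_mean(x, 10 * 60 / window))   # most active 10 hours
  l5  <- min(roll_mean(x, 5 * 60 / window))    # least active 5 hours
  (m10 - l5) / (m10 + l5)
}

day_contrast <- rep(0, 1440)
day_contrast[481:1080] <- 10                   # active from 08:00 to 18:00
ra_contrast <- ra_sketch(day_contrast)

ra_flat <- ra_sketch(rep(5, 1440))             # no day-night difference
```

A day with complete rest at night and sustained activity during the day reaches the maximum of 1, and a flat profile gives 0.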
A modified version of SVDmiss that sets ncomp = min(ncol(X), nrow(X), ncomp) so that matrices with nrow(X) < ncol(X) are handled.
SVDmiss2(X, niter = 200, ncomp = dim(X)[2], conv.reldiff = 0.001)
X |
|
niter |
|
ncomp |
|
conv.reldiff |
|
See SVDmiss(package:SpatioTemporal) for details.
See SpatioTemporal::SVDmiss for details.
This function is a whole dataset wrapper for Time
Time_long2(count.data, weartime, thresh, smallerthan = TRUE, bout.length = 1)
count.data |
|
weartime |
|
thresh |
threshold to binarize the data. |
smallerthan |
Find a state that is smaller than a threshold, or greater than or equal to. |
bout.length |
minimum duration of defining an active bout; defaults to 1. |
A dataframe with some of the following columns
ID |
identifier of the person |
Day |
indicator of which day of activity it is, can be a numeric vector of sequence 1,2,... or a string of date |
time |
time of certain state |
Calculate the total time of being in certain state, e.g. sedentary, active, MVPA, etc.
Time2(x, w, thresh, smallerthan = TRUE, bout.length = 1)
x |
|
w |
|
thresh |
threshold to binarize the data. |
smallerthan |
Find a state that is smaller than a threshold, or greater than or equal to. |
bout.length |
minimum duration of defining an active bout; defaults to 1. |
Time
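The interaction of thresh, smallerthan, and bout.length can be sketched with run-length encoding. This is an illustrative reading of the argument descriptions: `time_sketch` is a hypothetical helper, the wear flag is assumed to use 1 for wear, and only bouts of at least bout.length epochs are counted.

```r
# Hypothetical helper (not part of mMARCH.AC).
time_sketch <- function(x, w, thresh, smallerthan = TRUE, bout.length = 1) {
  in_state <- if (smallerthan) x < thresh else x >= thresh
  in_state[w == 0] <- FALSE                    # ignore non-wear epochs
  runs <- rle(in_state)                        # consecutive bouts
  sum(runs$lengths[runs$values & runs$lengths >= bout.length])
}

x_t <- c(0, 0, 0, 5, 5, 0, 5, 5, 5, 5, 0, 0)
w_t <- rep(1, length(x_t))
sed_all    <- time_sketch(x_t, w_t, thresh = 1)                    # every sedentary epoch
sed_bout2  <- time_sketch(x_t, w_t, thresh = 1, bout.length = 2)   # bouts of 2+ epochs
act_bout3  <- time_sketch(x_t, w_t, thresh = 1, smallerthan = FALSE, bout.length = 3)
```

Raising bout.length drops short bouts from the total, which is why the second sedentary total is smaller than the first.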
Calculate total volume of activity level, which includes TLAC (total log transformed activity counts) and TAC (total activity counts).
Tvol2(count.data, weartime, logtransform = FALSE, log.multiplier = 9250)
count.data |
|
weartime |
|
logtransform |
if |
log.multiplier |
|
Log transformation is defined as log(x+1).
A dataframe with some of the following columns
ID |
identifier of the person |
Day |
indicator of which day of activity it is, can be a numeric vector of sequence 1,2,... or a string of date |
TAC |
total activity count |
TLAC |
total log activity count |
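The two volume measures can be sketched for a single day as follows. The sketch applies the stated definition log(x+1) and sums over wear-time epochs only; `tvol_sketch` is a hypothetical helper, the wear flag is assumed to use 1 for wear, and the role of log.multiplier is not modelled because the source does not describe it.

```r
# Hypothetical helper (not part of mMARCH.AC).
# x: activity values for one day, w: wear flag (assumed 1 = wear)
tvol_sketch <- function(x, w, logtransform = FALSE) {
  v <- if (logtransform) log(x + 1) else x
  sum(v[w == 1])
}

x_v <- c(1, 2, 3)
w_v <- c(1, 1, 0)                 # third epoch is non-wear
tac  <- tvol_sketch(x_v, w_v)                         # total activity count
tlac <- tvol_sketch(x_v, w_v, logtransform = TRUE)    # total log activity count
```

The log transform dampens the influence of a few very high-intensity epochs on the daily total.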
Determine the time period during which the subject should wear the device. It is preferable that users provide their own wear/nonwear flag, which should have the same dimension as the activity data. This function provides a wear/nonwear flag based on the time of day.
wear_flag(count.data, start = "05:00", end = "23:00")
count.data |
|
start |
start time, a string in the format of 24hr, e.g. "05:00"; defaults to "05:00". |
end |
end time, a string in the format of 24hr, e.g. "23:00"; defaults to "23:00" |
Fragmentation metrics are usually defined when the subject is awake. The weartime provides the time periods over which those features should be extracted. This can also be used as an indication of wake/sleep.
A data.frame with the same dimension and column names as the count.data, with 0/1 as the elements representing wear, nonwear respectively.
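A time-of-day flag of this kind can be sketched for minute-level data as follows. The coding used here (1 for minutes inside the start-to-end window, 0 outside, with the end minute excluded) is an assumption of the sketch and should be checked against the package's actual output; `wear_flag_sketch` is a hypothetical helper that returns a plain matrix rather than a data.frame with ID columns.

```r
# Hypothetical helper (not part of mMARCH.AC).
wear_flag_sketch <- function(n_days, start = "05:00", end = "23:00") {
  to_minute <- function(hhmm) {
    parts <- as.integer(strsplit(hhmm, ":")[[1]])
    parts[1] * 60 + parts[2]                   # minutes since midnight
  }
  flag <- matrix(0L, nrow = n_days, ncol = 1440)
  # Column j covers the minute starting (j - 1) minutes after midnight.
  flag[, (to_minute(start) + 1):to_minute(end)] <- 1L
  flag
}

flag_ex <- wear_flag_sketch(n_days = 2)
minutes_flagged <- rowSums(flag_ex)
```

With the default window of 05:00 to 23:00, each day has 18 hours flagged.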