Title: | Monitoring a Developing Pandemic with Available Data |
---|---|
Description: | Full dynamic system to describe and forecast the spread and the severity of a developing pandemic, based on available data. These data are number of infections, hospitalizations, deaths and recoveries notified each day. The system consists of three transitions, infection-infection, infection-hospital and hospital-death/recovery. The intensities of these transitions are dynamic and estimated using non-parametric local linear estimators. The package can be used to provide forecasts and survival indicators such as the median time spent in hospital and the probability that a patient who has been in hospital for a number of days can leave it alive. Methods are described in Gámiz, Mammen, Martínez-Miranda, and Nielsen (2024) <doi:10.48550/arXiv.2308.09918> and <doi:10.48550/arXiv.2308.09919>. |
Authors: | María Luz Gámiz [aut, cph], Enno Mammen [aut, cph], María Dolores Martínez-Miranda [aut, cre, cph], Jens Perch Nielsen [aut, cph] |
Maintainer: | María Dolores Martínez-Miranda <[email protected]> |
License: | GPL-2 |
Version: | 0.1.0 |
Built: | 2024-12-06 06:48:29 UTC |
Source: | CRAN |
Full dynamic system to describe and forecast the spread and the severity of a developing pandemic based on typically available data. The system involves three different transitions: infection-infection, infection-hospitalization and hospitalization-death/recovery. The intensities of these transitions are dynamic and estimated using nonparametric local linear estimators.
Estimation is performed from aggregated data consisting of number of infections (positive tests), hospitalizations, deaths and recoveries notified each day.
The package can be used to provide forecasts of new infections, hospitalizations and deaths, as well as typical survival indicators such as the median time from admission in hospital to recovery or death, or the probability that a patient who has been in hospital for a number of days can leave it alive.
Package: | pandemics |
Type: | Package |
Version: | 0.1.0 |
Date: | 2024-30-08 |
License: | GPL-2 |
M.L. Gamiz, E. Mammen, M.D. Martinez-Miranda and J.P. Nielsen
Maintainer: Maria Dolores Martinez-Miranda <[email protected]>
Gámiz, M.L., Mammen, E., Martínez-Miranda, M.D. and Nielsen, J.P. (2022). Missing link survival analysis with applications to available pandemic data. Computational Statistics & Data Analysis, 169, 107405.
Gámiz, M.L., Mammen, E., Martínez-Miranda, M.D. and Nielsen, J.P. (2024a). Low quality exposure and point processes with a view to the first phase of a pandemic. arXiv:2308.09918.
Gámiz, M.L., Mammen, E., Martínez-Miranda, M.D. and Nielsen, J.P. (2024b). Monitoring a developing pandemic with available data. arXiv:2308.09919.
## Forecasting new infections in October 2020 from data up to 30-Sep data('covid') ## We remove the first 56 rows (no data on testing until 13th May) covid2<-covid[-c(1:56),] M2<-nrow(covid2) Hi<-covid2$Hospi Hi<-Hi[-1] Ri<-covid2$Recov Ri<-diff(Ri) Di<-covid2$Death Di<-diff(Di) Pi<-covid2$Posit M2<-length(Di) # New hospitalizations are Hi-Ri-Di newHi<-Hi[-1]-(Hi[-M2]-Ri[-M2]-Di[-M2]) newHi<-as.integer(c(Hi[1],newHi)) newHi[newHi<0]<-0 ddates<-covid2$Date ## 1. First estimate the infection rate Ms<-141 # up to 30-Sep Ei.new<-Pi[1:Ms] delay<-1;Msd<-Ms-delay Oi.z<-Ei.new[-(1:delay)] Ei.z1<-Ei.new[1:Msd]; t.grid<-z.grid<-1:Msd bs<-t(c(5,10)) RInf<-rate2Dmiss(t.grid,z.grid,Oi.z,Ei.z1,bs.grid=bs,cv=FALSE) RoInf<-RInf$hi.zt ## 2. Forecasting now with a given infection indicator Cval<-1.5 period<-32 # forecasts up to 1st November, 32 days fore<-forecasting(Cval,period,RoInf,Pz=Pi[1:Ms], newHz=newHi[1:Ms],Hz=Hi[1:Ms],Dz=Di[1:Ms],Rz=Ri[1:Ms]) Pz.pred<-fore$Pz.pred ## 3. Plot forecasts and compare with observed values ## (future values are shown for predictions validation) Pz.obs<-Pi yy<-range(Pz.pred,Pz.obs,na.rm=TRUE) plot(1:(Ms-1),Pz.obs[3:(Ms+1)],ylab='',xlab='Date of notification', main='Forecasts of new positives in October 2020',pch=19, ylim=yy,xaxt='n',xlim=c(1,Ms+1+period)) oat<-c(1,17,32,47,62,78,93,109,124,139,154,170) olab<-ddates[oat] axis(1,at=oat,labels=olab) points(Ms:(Ms+period-2),Pz.obs[(Ms+2):(Ms+period)],col=1,pch=1) lines(Ms:(Ms+period-2),Pz.pred[-1],col=2,lty=2,lwd=3) legend('topleft',c('Data: daily number of new positives until 30-Sep', paste('Forecasts with Cval=',round(Cval,2)), 'True numbers of new positives in October'), pch=c(19,NA,1),col=c(1,2,1),lty=c(NA,2,NA),lwd=c(NA,3,NA),bty='n')
## Forecasting new infections in October 2020 from data up to 30-Sep data('covid') ## We remove the first 56 rows (no data on testing until 13th May) covid2<-covid[-c(1:56),] M2<-nrow(covid2) Hi<-covid2$Hospi Hi<-Hi[-1] Ri<-covid2$Recov Ri<-diff(Ri) Di<-covid2$Death Di<-diff(Di) Pi<-covid2$Posit M2<-length(Di) # New hospitalizations are Hi-Ri-Di newHi<-Hi[-1]-(Hi[-M2]-Ri[-M2]-Di[-M2]) newHi<-as.integer(c(Hi[1],newHi)) newHi[newHi<0]<-0 ddates<-covid2$Date ## 1. First estimate the infection rate Ms<-141 # up to 30-Sep Ei.new<-Pi[1:Ms] delay<-1;Msd<-Ms-delay Oi.z<-Ei.new[-(1:delay)] Ei.z1<-Ei.new[1:Msd]; t.grid<-z.grid<-1:Msd bs<-t(c(5,10)) RInf<-rate2Dmiss(t.grid,z.grid,Oi.z,Ei.z1,bs.grid=bs,cv=FALSE) RoInf<-RInf$hi.zt ## 2. Forecasting now with a given infection indicator Cval<-1.5 period<-32 # forecasts up to 1st November, 32 days fore<-forecasting(Cval,period,RoInf,Pz=Pi[1:Ms], newHz=newHi[1:Ms],Hz=Hi[1:Ms],Dz=Di[1:Ms],Rz=Ri[1:Ms]) Pz.pred<-fore$Pz.pred ## 3. Plot forecasts and compare with observed values ## (future values are shown for predictions validation) Pz.obs<-Pi yy<-range(Pz.pred,Pz.obs,na.rm=TRUE) plot(1:(Ms-1),Pz.obs[3:(Ms+1)],ylab='',xlab='Date of notification', main='Forecasts of new positives in October 2020',pch=19, ylim=yy,xaxt='n',xlim=c(1,Ms+1+period)) oat<-c(1,17,32,47,62,78,93,109,124,139,154,170) olab<-ddates[oat] axis(1,at=oat,labels=olab) points(Ms:(Ms+period-2),Pz.obs[(Ms+2):(Ms+period)],col=1,pch=1) lines(Ms:(Ms+period-2),Pz.pred[-1],col=2,lty=2,lwd=3) legend('topleft',c('Data: daily number of new positives until 30-Sep', paste('Forecasts with Cval=',round(Cval,2)), 'True numbers of new positives in October'), pch=c(19,NA,1),col=c(1,2,1),lty=c(NA,2,NA),lwd=c(NA,3,NA),bty='n')
Daily data of COVID-19 cases (positive-tested), hospitalizations, deaths and recovered published in the open platform for French public data during the period 2020-03-18 to 2020-11-01.
data("covid")
data("covid")
Data frame with 229 observations and 5 variables:
[,1] Date
Notification date
[,2] Hospi
Daily total number of persons in hospital
[,3] Death
Daily total number of deaths (in hospital)
[,4] Recov
Daily total number of persons discharged from hospital
[,5] Posit
Daily total number of persons tested positive
Data were downloaded from https://www.data.gouv.fr/es/datasets/ on 5th of January 2022.
Gámiz, M.L., Mammen, E., Martínez-Miranda, M.D. and Nielsen, J.P. (2024a). Low quality exposure and point processes with a view to the first phase of a pandemic. arXiv:2308.09918.
Gámiz, M.L., Mammen, E., Martínez-Miranda, M.D. and Nielsen, J.P. (2024b). Monitoring a developing pandemic with available data. arXiv:2308.09919.
data("covid") Hi<-covid$Hospi Hi<-Hi[-1] Ri<-covid$Recov Ri<-diff(Ri) Di<-covid$Death Di<-diff(Di) M<-length(Di) ## New hospitalizations are Hi-Ri-Di newHi<-Hi[-1]-(Hi[-M]-Ri[-M]-Di[-M]) newHi<-c(Hi[1],newHi) newHi[newHi<0]<-0 # (inconsistencies in the data) plot(covid$Date[-1],newHi,xlab='Notification date',ylab='Number of persons', main='New hospitalizations',type='l')
data("covid") Hi<-covid$Hospi Hi<-Hi[-1] Ri<-covid$Recov Ri<-diff(Ri) Di<-covid$Death Di<-diff(Di) M<-length(Di) ## New hospitalizations are Hi-Ri-Di newHi<-Hi[-1]-(Hi[-M]-Ri[-M]-Di[-M]) newHi<-c(Hi[1],newHi) newHi[newHi<0]<-0 # (inconsistencies in the data) plot(covid$Date[-1],newHi,xlab='Notification date',ylab='Number of persons', main='New hospitalizations',type='l')
Forecasting infections, hospitalizations and deaths in hospital. From estimated two-dimensional rates of infections and hospilizations and estimated hazards of deaths and recoveries, new infections, hospitalizations and deaths are predicted in a forecasting period.
forecasting(Cval=1, period, RoInf, RoHosp, RoDeath, RoRec, Pz, newHz, Hz, Dz, Rz)
forecasting(Cval=1, period, RoInf, RoHosp, RoDeath, RoRec, Pz, newHz, Hz, Dz, Rz)
Cval |
a numeric value with the infection indicator. Default value is |
period |
an integer value with the number of days to forecast. |
RoInf |
a matrix with the estimated rate of infection ( |
RoHosp |
(optional)
a matrix with the estimated rate of hospitalization ( |
RoDeath |
(optional)
a matrix with the estimated hazard of deaths ( |
RoRec |
(optional)
a matrix with the estimated hazard of recoveries ( |
Pz |
a vector with the observed (past) number of people tested positive each day (length |
newHz |
a vector with the observed number of new hospitalizations each day (length |
Hz |
a vector with the observed total number people in hospitals each day (length |
Dz |
a vector with the observed number of deaths each day (length |
Rz |
a vector with the observed number of recoveries each day (length |
To create estimated rates and hazards for the arguments evaluate the functions hazard2Dmiss
and rate2Dmiss
. If the argument RoHosp
is missing then only infections are predicted. If any of the argument RoDeath
or RoRec
is missing then deaths (recoveries) are not predicted.
The infection indicator Cval
is typically provided by experts. At the more recent observation (t), it indicates whether the future (t+h, h > 0) will be different or equal to the immediate past. If Cval=1
then we can forecast the immediate future based on the immediate past. There might be periods where little is happening (and the indicator might have a tendency
to increase slowly) and there might be few but very important change-points where measures (e.g. lock down) are introduced to minimize future infections (and the indicator might drop dramatically in a matter of days). See Gámiz et al. (2024a) for more details and the relationship to the well-known reproduction number.
Pz.fitted |
a vector with the fitted values for the number of infections in the past. |
Pz.pred |
a vector with the predicted values for the number of infections in the forecasting period. |
newHz.fitted |
a vector with the fitted values for the number of hospitalizations in the past. |
newHz.pred |
a vector with the predicted values for the number of hospitalizations in the forecasting period. |
Dz.fitted |
a vector with the fitted values for the number of deaths in the past. |
Dz.pred |
a vector with the predicted values for the number of deaths in the forecasting period. |
Rz.fitted |
a vector with the fitted values for the number of recoveries in the past. |
Rz.pred |
a vector with the predicted values for the number of recoveries in the forecasting period. |
M.L. Gámiz, E. Mammen, M.D. Martínez-Miranda and J.P. Nielsen.
Gámiz, M.L., Mammen, E., Martínez-Miranda, M.D. and Nielsen, J.P. (2024a). Low quality exposure and point processes with a view to the first phase of a pandemic. arXiv:2308.09918.
Gámiz, M.L., Mammen, E., Martínez-Miranda, M.D. and Nielsen, J.P. (2024b). Monitoring a developing pandemic with available data. arXiv:2308.09919.
## Forecasting October 2020 from past data data('covid') ## We remove the first 56 rows(no data on testing until 13th May) covid2<-covid[-c(1:56),] M2<-nrow(covid2) Hi<-covid2$Hospi Hi<-Hi[-1] Ri<-covid2$Recov Ri<-diff(Ri) Di<-covid2$Death Di<-diff(Di) Pi<-covid2$Posit M2<-length(Di) # New hospitalizations are Hi-Ri-Di newHi<-Hi[-1]-(Hi[-M2]-Ri[-M2]-Di[-M2]) newHi<-as.integer(c(Hi[1],newHi)) newHi[newHi<0]<-0 ddates<-covid2$Date ## 1. First we estimate the death and recovery hazards, ## and the infection and hospitalization rates. ## Estimation using data up to 30-Sep-2020 Ms<-141 # up to 30-Sep ## 1.1. Infection rate Ei.new<-Pi[1:Ms] delay<-1;Msd<-Ms-delay Oi.z<-Ei.new[-(1:delay)] Ei.z1<-Ei.new[1:Msd]; t.grid<-z.grid<-1:Msd bs<-t(c(5,10)) RInf<-rate2Dmiss(t.grid,z.grid,Oi.z,Ei.z1, bs.grid=bs,cv=FALSE) RoInf<-RInf$hi.zt ## 1.2. Hospitalization rate Oi.z<-newHi[1:Ms] # The exposure are the positives data Ei.z1<-Pi[1:Ms] t.grid<-z.grid<-1:Ms bs<-t(c(150,150)) RHosp<-rate2Dmiss(t.grid,z.grid,Oi.z,Ei.z1, bs.grid=bs,cv=FALSE) RoHosp<-matrix(as.numeric(RHosp$hi.zt),Ms,Ms) ## 1.3. Hazards of deaths and recoveries Oi1.z<-Di[1:Ms] # deaths Oi2.z<-Ri[1:Ms] # recoveries Ei.z<-Hi[1:Ms] # exposure is left as cumulative t.grid<-z.grid<-1:Ms bs<-t(c(150,40)) res.h<-hazard2Dmiss(t.grid,z.grid,Oi1.z,Oi2.z,Ei.z, bs.grid=bs,cv=FALSE) RoDeath<-as.matrix(res.h$hi1.zt,Ms,Ms) ## 2D-hazard of deaths RoRec<-as.matrix(res.h$hi2.zt,Ms,Ms) ## 2D-hazard of recoveries ## 2. Forecasting with a given infection indicator Cval<-1.5 period<-32 # forecasts up to 1st November, 32 days fore<-forecasting(Cval,period,RoInf,RoHosp,RoDeath,RoRec, Pi[1:Ms],newHi[1:Ms],Hi[1:Ms],Di[1:Ms],Ri[1:Ms]) Hz.pred<-fore$newHz.pred Pz.pred<-fore$Pz.pred Dz.pred<-fore$Dz.pred ## 3. Plot forecasts and compare with observed values ## (future values are shown for predictions validation) Pz.obs<-Pi Hz.obs<-newHi Dz.obs<-Di ## Graph with the new infections up to 30-Sep and forecasts yy<-range(Pz.pred,Pz.obs,na.rm=TRUE) plot(1:(Ms-1),Pz.obs[3:(Ms+1)],ylab='',xlab='Date of notification', main='Forecasts of new positives in October 2020',pch=19, ylim=yy,xaxt='n',xlim=c(1,Ms+1+period)) oat<-c(1,17,32,47,62,78,93,109,124,139,154,170) olab<-ddates[oat] axis(1,at=oat,labels=olab) ## forecasts start on 30-sep but we only plot from 1-October points(Ms:(Ms+period-2),Pz.obs[(Ms+2):(Ms+period)],col=1,pch=1) lines(Ms:(Ms+period-2),Pz.pred[-1],col=2,lty=2,lwd=3) legend('topleft',c('Data: daily number of new positives until 30-Sep', paste('Forecasts with Cval=',round(Cval,2)), 'True numbers of new positives in October'), pch=c(19,NA,1),col=c(1,2,1),lty=c(NA,2,NA),lwd=c(NA,3,NA),bty='n') ## Graph with the new hospitalizations up to 30-Sep and forecasts ylim1<-range(Hz.pred,Hz.obs[-(1:2)],na.rm=TRUE) plot(1:(Ms-1),Hz.obs[3:(Ms+1)],ylab='',xlab='Date of admission', main='Forecasts of new hospitalizations in October 2020', pch=19, ylim=ylim1,xaxt='n',xlim=c(1,Ms+1+period)) oat<-c(1,17,32,47,62,78,93,109,124,139,154,170) olab<-ddates[oat] axis(1,at=oat,labels=olab) points(Ms:(Ms+period-2),Hz.obs[(Ms+2):(Ms+period)]) lines(Ms:(Ms+period-2),Hz.pred[-1],col=2,lty=2,lwd=2) legend('topleft',c('Data: daily number of new hospitalizations until 30-Sep', paste('Forecasts with Cval=',round(Cval,2)), 'True numbers of new hospitalizations in October'), pch=c(19,NA,1),col=c(1,2,1),lty=c(NA,2,NA),lwd=c(NA,3,NA),bty='n') ## Graph with deaths up to 30-Sep and forecasts ylim1<-range(Dz.pred,Dz.obs[-(1:2)],na.rm=TRUE) plot(1:(Ms-1),Dz.obs[3:(Ms+1)],ylab='',xlab='Date of admission', main='Forecasts of deaths in October 2020', pch=19,ylim=ylim1,xaxt='n',xlim=c(1,Ms+1+period)) oat<-c(1,17,32,47,62,78,93,109,124,139,154,170) olab<-ddates[oat] axis(1,at=oat,labels=olab) points(Ms:(Ms+period-2),Dz.obs[(Ms+2):(Ms+period)]) lines(Ms:(Ms+period-2),Dz.pred[-1],col=2,lty=2,lwd=2) legend('topleft',c('Data: daily number of deaths until 30-Sep', paste('Forecasts with Cval=',round(Cval,2)), 'True numbers of deaths in October'), pch=c(19,NA,1),col=c(1,2,1),lty=c(NA,2,NA), lwd=c(NA,3,NA),bty='n')
## Forecasting October 2020 from past data data('covid') ## We remove the first 56 rows(no data on testing until 13th May) covid2<-covid[-c(1:56),] M2<-nrow(covid2) Hi<-covid2$Hospi Hi<-Hi[-1] Ri<-covid2$Recov Ri<-diff(Ri) Di<-covid2$Death Di<-diff(Di) Pi<-covid2$Posit M2<-length(Di) # New hospitalizations are Hi-Ri-Di newHi<-Hi[-1]-(Hi[-M2]-Ri[-M2]-Di[-M2]) newHi<-as.integer(c(Hi[1],newHi)) newHi[newHi<0]<-0 ddates<-covid2$Date ## 1. First we estimate the death and recovery hazards, ## and the infection and hospitalization rates. ## Estimation using data up to 30-Sep-2020 Ms<-141 # up to 30-Sep ## 1.1. Infection rate Ei.new<-Pi[1:Ms] delay<-1;Msd<-Ms-delay Oi.z<-Ei.new[-(1:delay)] Ei.z1<-Ei.new[1:Msd]; t.grid<-z.grid<-1:Msd bs<-t(c(5,10)) RInf<-rate2Dmiss(t.grid,z.grid,Oi.z,Ei.z1, bs.grid=bs,cv=FALSE) RoInf<-RInf$hi.zt ## 1.2. Hospitalization rate Oi.z<-newHi[1:Ms] # The exposure are the positives data Ei.z1<-Pi[1:Ms] t.grid<-z.grid<-1:Ms bs<-t(c(150,150)) RHosp<-rate2Dmiss(t.grid,z.grid,Oi.z,Ei.z1, bs.grid=bs,cv=FALSE) RoHosp<-matrix(as.numeric(RHosp$hi.zt),Ms,Ms) ## 1.3. Hazards of deaths and recoveries Oi1.z<-Di[1:Ms] # deaths Oi2.z<-Ri[1:Ms] # recoveries Ei.z<-Hi[1:Ms] # exposure is left as cumulative t.grid<-z.grid<-1:Ms bs<-t(c(150,40)) res.h<-hazard2Dmiss(t.grid,z.grid,Oi1.z,Oi2.z,Ei.z, bs.grid=bs,cv=FALSE) RoDeath<-as.matrix(res.h$hi1.zt,Ms,Ms) ## 2D-hazard of deaths RoRec<-as.matrix(res.h$hi2.zt,Ms,Ms) ## 2D-hazard of recoveries ## 2. Forecasting with a given infection indicator Cval<-1.5 period<-32 # forecasts up to 1st November, 32 days fore<-forecasting(Cval,period,RoInf,RoHosp,RoDeath,RoRec, Pi[1:Ms],newHi[1:Ms],Hi[1:Ms],Di[1:Ms],Ri[1:Ms]) Hz.pred<-fore$newHz.pred Pz.pred<-fore$Pz.pred Dz.pred<-fore$Dz.pred ## 3. Plot forecasts and compare with observed values ## (future values are shown for predictions validation) Pz.obs<-Pi Hz.obs<-newHi Dz.obs<-Di ## Graph with the new infections up to 30-Sep and forecasts yy<-range(Pz.pred,Pz.obs,na.rm=TRUE) plot(1:(Ms-1),Pz.obs[3:(Ms+1)],ylab='',xlab='Date of notification', main='Forecasts of new positives in October 2020',pch=19, ylim=yy,xaxt='n',xlim=c(1,Ms+1+period)) oat<-c(1,17,32,47,62,78,93,109,124,139,154,170) olab<-ddates[oat] axis(1,at=oat,labels=olab) ## forecasts start on 30-sep but we only plot from 1-October points(Ms:(Ms+period-2),Pz.obs[(Ms+2):(Ms+period)],col=1,pch=1) lines(Ms:(Ms+period-2),Pz.pred[-1],col=2,lty=2,lwd=3) legend('topleft',c('Data: daily number of new positives until 30-Sep', paste('Forecasts with Cval=',round(Cval,2)), 'True numbers of new positives in October'), pch=c(19,NA,1),col=c(1,2,1),lty=c(NA,2,NA),lwd=c(NA,3,NA),bty='n') ## Graph with the new hospitalizations up to 30-Sep and forecasts ylim1<-range(Hz.pred,Hz.obs[-(1:2)],na.rm=TRUE) plot(1:(Ms-1),Hz.obs[3:(Ms+1)],ylab='',xlab='Date of admission', main='Forecasts of new hospitalizations in October 2020', pch=19, ylim=ylim1,xaxt='n',xlim=c(1,Ms+1+period)) oat<-c(1,17,32,47,62,78,93,109,124,139,154,170) olab<-ddates[oat] axis(1,at=oat,labels=olab) points(Ms:(Ms+period-2),Hz.obs[(Ms+2):(Ms+period)]) lines(Ms:(Ms+period-2),Hz.pred[-1],col=2,lty=2,lwd=2) legend('topleft',c('Data: daily number of new hospitalizations until 30-Sep', paste('Forecasts with Cval=',round(Cval,2)), 'True numbers of new hospitalizations in October'), pch=c(19,NA,1),col=c(1,2,1),lty=c(NA,2,NA),lwd=c(NA,3,NA),bty='n') ## Graph with deaths up to 30-Sep and forecasts ylim1<-range(Dz.pred,Dz.obs[-(1:2)],na.rm=TRUE) plot(1:(Ms-1),Dz.obs[3:(Ms+1)],ylab='',xlab='Date of admission', main='Forecasts of deaths in October 2020', pch=19,ylim=ylim1,xaxt='n',xlim=c(1,Ms+1+period)) oat<-c(1,17,32,47,62,78,93,109,124,139,154,170) olab<-ddates[oat] axis(1,at=oat,labels=olab) points(Ms:(Ms+period-2),Dz.obs[(Ms+2):(Ms+period)]) lines(Ms:(Ms+period-2),Dz.pred[-1],col=2,lty=2,lwd=2) legend('topleft',c('Data: daily number of deaths until 30-Sep', paste('Forecasts with Cval=',round(Cval,2)), 'True numbers of deaths in October'), pch=c(19,NA,1),col=c(1,2,1),lty=c(NA,2,NA), lwd=c(NA,3,NA),bty='n')
Local linear estimator of the marker-dependent hazard introduced by Nielsen (1998). The implementation considers two dimensions only: a one-dimensional marker and time (duration). It assumes aggregated data in the form of occurrences and exposure. The bandwidth can be provided or estimated using cross-validation (see details below).
hazard2D(t.grid, z.grid, o.zt, e.zt, bs.grid, cv=FALSE)
hazard2D(t.grid, z.grid, o.zt, e.zt, bs.grid, cv=FALSE)
t.grid |
a vector of |
z.grid |
a vector of |
o.zt |
a matrix of occurrences ( |
e.zt |
a matrix of exposures ( |
bs.grid |
a matrix with a grid of 2-dimensional bandwidths (by rows). |
cv |
logical, if |
The marker-dependent hazard local linear estimator was introduced by Nielsen (1998) and bandwidth selection provided by Gámiz et al. (2013). It assumes the Aalen multiplicative intensity model. The function implements such estimator in the case of a one-dimensional marker only (first dimension). Data are assumed to be aggregated in the form of occurrences and exposure (see Gámiz et al. 2013) in a two-dimensional grid of values (z,t).
The estimator involves on a two-dimensional kernel function and a two-dimensional bandwidth. The implemented kernel is multiplicative with K(u)=(3003/2048)*(1-(u)^2)^6)*(abs(u)<1)
. The bandwidth can be provided in the argument bs.grid
in the form of a matrix (2
times 1
). Data-driven badwidth selection is also supported. If cv=TRUE
(default) then the bandwith is estimated using cross-validation from a grid of nb
two-dimensional bandwidths provided in bs.grid
(nb
times 2
).
This marker-dependent hazard local linear estimator assumes that full information is available in the form of occurrences and exposures. In a emergent or developing pandemic, this estimator will most likely be infeasible from typically available data. Thus, the estimator will be computed after the necessary information is estimated through the function hazard2Dmiss()
. See Gámiz et al. (2022,2024a,b) for more details.
hi.zt |
A ( |
bcv |
A two-dimensional vector with the bandwidth used to compute the estimator (estimated by cross-validation if |
Infeasible estimator to be evaluated through the function hazard2Dmiss()
.
M.L. Gámiz, E. Mammen, M.D. Martínez-Miranda and J.P. Nielsen.
Gámiz, M.L., Janys, L., Martínez-Miranda, M.D. and Nielsen, J.P. (2013). Bandwidth selection in marker dependent kernel hazard estimation. Computational Statistics & Data Analysis, 68, 155–169.
Gámiz, M.L., Mammen, E., Martínez-Miranda, M.D. and Nielsen, J.P. (2022). Missing link survival analysis with applications to available pandemic data. Computational Statistics & Data Analysis, 169, 107405.
Gámiz, M.L., Mammen, E., Martínez-Miranda, M.D. and Nielsen, J.P. (2024a). Low quality exposure and point processes with a view to the first phase of a pandemic. arXiv:2308.09918.
Gámiz, M.L., Mammen, E., Martínez-Miranda, M.D. and Nielsen, J.P. (2024b). Monitoring a developing pandemic with available data. arXiv:2308.09919.
Nielsen, J.P. (1998). Marker dependent kernel estimation from local linear estimation. Scandinavian Actuarial Journal, 2, 113-124.
## 1. Define a true 2D-hazard evaluated at M*M grid points M<-100 alpha<-dbeta((1:M)/(M+1),2,2)/(M+1) alpha.zt<-matrix(NA,M,M) for(z in 1:M) for (t in 1:(M-z+1)) alpha.zt[z,t]<-alpha[z]*alpha[t] ## 2. Simulate data from the true hazard (Aalen multiplicative model) N<-10000 # sample size set.seed(1) ## simulate new arrivals Ei.new<-hist(sort(runif(n=N,max=M)),breaks=0:M,plot=FALSE)$counts ## simulate matrices of exposure (e.zt) and occurrences (o.zt) o.zt<-e.zt<-matrix(0,M,M) e.zt[,1]<-as.integer(Ei.new) for(z in 1:M) for (t in 1:(M-z+1)){ if(e.zt[z,t]>0) o.zt[z,t]<-rbinom(1,as.integer(e.zt[z,t]),alpha.zt[z,t]) else o.zt[z,t]<-0 if(t<(M-z+1)) e.zt[z,(t+1)]<-e.zt[z,t]-o.zt[z,t] } ## 3. Estimate the 2d-hazard with fixed bandwidth bs.grid<-t(c(M/2,M/2)) alpha.estim<-hazard2D(1:M,1:M,o.zt,e.zt,bs.grid,cv=FALSE) ## 4. Compare true and estimated hazards persp(1:M,1:M,alpha.zt,main='True hazard',theta=60,xlab='z',ylab='t',zlab='') persp(1:M,1:M,alpha.estim$hi.zt,main='Estimated hazard',theta=60, xlab='z',ylab='t',zlab='')
## 1. Define a true 2D-hazard evaluated at M*M grid points M<-100 alpha<-dbeta((1:M)/(M+1),2,2)/(M+1) alpha.zt<-matrix(NA,M,M) for(z in 1:M) for (t in 1:(M-z+1)) alpha.zt[z,t]<-alpha[z]*alpha[t] ## 2. Simulate data from the true hazard (Aalen multiplicative model) N<-10000 # sample size set.seed(1) ## simulate new arrivals Ei.new<-hist(sort(runif(n=N,max=M)),breaks=0:M,plot=FALSE)$counts ## simulate matrices of exposure (e.zt) and occurrences (o.zt) o.zt<-e.zt<-matrix(0,M,M) e.zt[,1]<-as.integer(Ei.new) for(z in 1:M) for (t in 1:(M-z+1)){ if(e.zt[z,t]>0) o.zt[z,t]<-rbinom(1,as.integer(e.zt[z,t]),alpha.zt[z,t]) else o.zt[z,t]<-0 if(t<(M-z+1)) e.zt[z,(t+1)]<-e.zt[z,t]-o.zt[z,t] } ## 3. Estimate the 2d-hazard with fixed bandwidth bs.grid<-t(c(M/2,M/2)) alpha.estim<-hazard2D(1:M,1:M,o.zt,e.zt,bs.grid,cv=FALSE) ## 4. Compare true and estimated hazards persp(1:M,1:M,alpha.zt,main='True hazard',theta=60,xlab='z',ylab='t',zlab='') persp(1:M,1:M,alpha.estim$hi.zt,main='Estimated hazard',theta=60, xlab='z',ylab='t',zlab='')
Local linear estimator of the marker-dependent hazard in the case of missing-survival-link data. Hazard is assumed to have two dimensions: a one-dimensional marker (typically the notification date) and time (duration). It is assumed the situation of observing aggregated data in the form of occurrences and exposures. The missing-survival link problem means that the duration is not directly observed. The estimator follows from an iterative algorithm where, at each step, full information including duration is estimated first and then, the local linear estimator is computed evaluating the function hazard2D
.
hazard2Dmiss(t.grid, z.grid, Oi1.z, Oi2.z, Ei.z, bs.grid, cv=TRUE, epsilon=1e-4, max.ite=50)
hazard2Dmiss(t.grid, z.grid, Oi1.z, Oi2.z, Ei.z, bs.grid, cv=TRUE, epsilon=1e-4, max.ite=50)
t.grid |
a vector of |
z.grid |
a vector of |
Oi1.z |
a vector of length |
Oi2.z |
a vector of length |
Ei.z |
a vector of length |
bs.grid |
a matrix with a grid of 2-dimensional bandwidths (by rows). |
cv |
logical, if |
epsilon |
a numeric value with the tolerance in the iterative algorithm. Default value is |
max.ite |
a integer value with the maximum number of iterations in the iterative algorithm. Default value is |
Hazard is assumed having two dimensions: a one-dimensional marker and time. It is assumed the situation of observing aggregated data in the form of occurrences and exposure, as the function hazard2D
does. The difference is that the time dimension (duration) is not observed. The estimator follows from an iterative algorithm where, at each step, full information, estimating the duration, is constructed first and then the local linear estimator is computed by evaluating the function hazard2D
.
Missing-link-survival data means that duration is not observed directly. Each day (z), we get information on number of people remaining in hospital (exposure, Ei.z), the number of deaths (Oi1.z) and the number of recoveries (Oi2.z) on that day.
hi.zt |
a ( |
hi1.zt |
a ( |
hi2.zt |
a ( |
bcv |
a two-dimensional vector with the bandwidth used to compute the estimator (estimated by cross-validation if |
tol |
a numeric value with the achieved tolerance value in the algorithm. |
it |
an integer with the number of iterations performed in the algorithm. |
M.L. Gámiz, E. Mammen, M.D. Martínez-Miranda and J.P. Nielsen.
Gámiz, M.L., Mammen, E., Martínez-Miranda, M.D. and Nielsen, J.P. (2022). Missing link survival analysis with applications to available pandemic data. Computational Statistics & Data Analysis, 169, 107405.
Gámiz, M.L., Mammen, E., Martínez-Miranda, M.D. and Nielsen, J.P. (2024a). Low quality exposure and point processes with a view to the first phase of a pandemic. arXiv:2308.09918.
Gámiz, M.L., Mammen, E., Martínez-Miranda, M.D. and Nielsen, J.P. (2024b). Monitoring a developing pandemic with available data. arXiv:2308.09919.
## Survival analysis of duration in covid-19 data data('covid') Ei.z<-covid$Hospi # exposure for survival analysis Oi1.z<-covid$Death # deaths Oi2.z<-covid$Recov # recoveries ## compute incremental values Oi1.z<-diff(Oi1.z) Oi2.z<-diff(Oi2.z) Ei.z<-Ei.z[-1] # exposure is cumulative M<-length(Ei.z) t.grid<-z.grid<-1:M ## notification date (marker) ddates<-covid$Date ## Hazard estimate with a fixed bandwidth bs<-t(c(150,150)) res.h<-hazard2Dmiss(t.grid,z.grid,Oi1.z,Oi2.z,Ei.z,bs.grid=bs,cv=FALSE) hi1.zt<-res.h$hi1.zt ## 2D-hazard of recoveries hi2.zt<-res.h$hi2.zt ## 2D-hazard of deaths ## Plot of hazard of deaths on several dates z1<-c(13,44,105,197) zdates<-ddates[z1] ; nz<-length(z1) t.min<-35 # consider a maximum duration of 35 days for the plots ti<-1:t.min ; n0<-length(ti) yy<-range(hi1.zt[z1,1:n0],na.rm=TRUE) yy[1]<-yy[1]-.0003;yy[2]<-yy[2]+.03 plot(ti,hi1.zt[nz,1:n0], main='Deaths',type='l', xlab='Time (days) from admission',ylab='Hazard',ylim=yy) for(i in 2:nz) lines(ti,hi1.zt[z1[i-1],1:n0],lwd=3,col=i,lty=i) legend('topright',c('Date of admision', as.character(zdates)), lty=c(NA,2:nz,1),lwd=c(NA,rep(3,nz-1),2),col=c(NA,2:nz,1),bty='n') ## Same for recoveries yy<-range(hi2.zt[z1,1:n0],na.rm=TRUE) yy[2]<-yy[2]+.05 plot(ti,hi2.zt[nz,1:n0], main='Recoveries',type='l', xlab='Time (days) from admission',ylab='Hazard',ylim=yy) for(i in 2:nz) lines(ti,hi2.zt[z1[i-1],1:n0],lwd=3,col=i,lty=i) legend('topright',c('Date of admision',as.character(zdates)), lty=c(NA,2:nz,1),lwd=c(NA,rep(3,nz-1),2),col=c(NA,2:nz,1),bty='n')
## Survival analysis of duration in covid-19 data data('covid') Ei.z<-covid$Hospi # exposure for survival analysis Oi1.z<-covid$Death # deaths Oi2.z<-covid$Recov # recoveries ## compute incremental values Oi1.z<-diff(Oi1.z) Oi2.z<-diff(Oi2.z) Ei.z<-Ei.z[-1] # exposure is cumulative M<-length(Ei.z) t.grid<-z.grid<-1:M ## notification date (marker) ddates<-covid$Date ## Hazard estimate with a fixed bandwidth bs<-t(c(150,150)) res.h<-hazard2Dmiss(t.grid,z.grid,Oi1.z,Oi2.z,Ei.z,bs.grid=bs,cv=FALSE) hi1.zt<-res.h$hi1.zt ## 2D-hazard of recoveries hi2.zt<-res.h$hi2.zt ## 2D-hazard of deaths ## Plot of hazard of deaths on several dates z1<-c(13,44,105,197) zdates<-ddates[z1] ; nz<-length(z1) t.min<-35 # consider a maximum duration of 35 days for the plots ti<-1:t.min ; n0<-length(ti) yy<-range(hi1.zt[z1,1:n0],na.rm=TRUE) yy[1]<-yy[1]-.0003;yy[2]<-yy[2]+.03 plot(ti,hi1.zt[nz,1:n0], main='Deaths',type='l', xlab='Time (days) from admission',ylab='Hazard',ylim=yy) for(i in 2:nz) lines(ti,hi1.zt[z1[i-1],1:n0],lwd=3,col=i,lty=i) legend('topright',c('Date of admision', as.character(zdates)), lty=c(NA,2:nz,1),lwd=c(NA,rep(3,nz-1),2),col=c(NA,2:nz,1),bty='n') ## Same for recoveries yy<-range(hi2.zt[z1,1:n0],na.rm=TRUE) yy[2]<-yy[2]+.05 plot(ti,hi2.zt[nz,1:n0], main='Recoveries',type='l', xlab='Time (days) from admission',ylab='Hazard',ylim=yy) for(i in 2:nz) lines(ti,hi2.zt[z1[i-1],1:n0],lwd=3,col=i,lty=i) legend('topright',c('Date of admision',as.character(zdates)), lty=c(NA,2:nz,1),lwd=c(NA,rep(3,nz-1),2),col=c(NA,2:nz,1),bty='n')
From the two-dimensional estimated hazard of death/recovery, the median of the time spent in hospital is computed depending on the date of admission.
medtime(hi.zt,z1)
medtime(hi.zt,z1)
hi.zt |
a matrix with the estimated hazard of death+recovery ( |
z1 |
(optional) a vector of indexes between 1 and |
A vector with the computed median times for each day in z1
.
Evaluate the function hazard2Dmiss
to create the estimated hazard for the argument hi.zt
.
M.L. Gámiz, E. Mammen, M.D. Martínez-Miranda and J.P. Nielsen.
Gámiz, M.L., Mammen, E., Martínez-Miranda, M.D. and Nielsen, J.P. (2024b). Monitoring a developing pandemic with available data. arXiv:2308.09919.
data('covid') Ei.z<-covid$Hospi # exposure for survival analysis Oi1.z<-covid$Death # deaths Oi2.z<-covid$Recov # recoveries # compute incremental values Oi1.z<-diff(Oi1.z) Oi2.z<-diff(Oi2.z) Ei.z<-Ei.z[-1] # exposure is left as cumulative M<-length(Ei.z) t.grid<-z.grid<-1:M # notification date (marker) ddates<-covid$Date ## First compute the estimated hazard bs<-t(c(150,150)) res.h<-hazard2Dmiss(t.grid,z.grid,Oi1.z,Oi2.z,Ei.z,bs.grid=bs,cv=FALSE) hi.zt<-res.h$hi.zt # =hi2.zt+hi1.zt (two possible outcomes) ## Now the median time at few values of the marker (admission dates) z1<-c(seq(1,M-1,by=30),M-1) nz<-length(z1) res<-medtime(hi.zt,z1) plot(z1,res,ylab='days',xaxt = "n",type='p',pch=16, xlim=range(z1), xlab='Date of admission', main='Median time from admission to exit (death+recovery)') axis(1,at=z1,labels=ddates[z1],cex=1.2)
data('covid') Ei.z<-covid$Hospi # exposure for survival analysis Oi1.z<-covid$Death # deaths Oi2.z<-covid$Recov # recoveries # compute incremental values Oi1.z<-diff(Oi1.z) Oi2.z<-diff(Oi2.z) Ei.z<-Ei.z[-1] # exposure is left as cumulative M<-length(Ei.z) t.grid<-z.grid<-1:M # notification date (marker) ddates<-covid$Date ## First compute the estimated hazard bs<-t(c(150,150)) res.h<-hazard2Dmiss(t.grid,z.grid,Oi1.z,Oi2.z,Ei.z,bs.grid=bs,cv=FALSE) hi.zt<-res.h$hi.zt # =hi2.zt+hi1.zt (two possible outcomes) ## Now the median time at few values of the marker (admission dates) z1<-c(seq(1,M-1,by=30),M-1) nz<-length(z1) res<-medtime(hi.zt,z1) plot(z1,res,ylab='days',xaxt = "n",type='p',pch=16, xlim=range(z1), xlab='Date of admission', main='Median time from admission to exit (death+recovery)') axis(1,at=z1,labels=ddates[z1],cex=1.2)
From the two-dimensional estimated hazards of deaths and recoveries, the probability that a person, who has been in hospital for a number of days, leaves the hospital alive or death, depending on the date of admission.
poutcome(hi1.zt,hi2.zt,z1)
poutcome(hi1.zt,hi2.zt,z1)
hi1.zt |
a matrix with the estimated hazard of deaths ( |
hi2.zt |
a matrix with the estimated hazard of recoveries ( |
z1 |
(optional) a vector of indexes between 1 and |
alive.zt |
a matrix ( |
death.zt |
a matrix ( |
Evaluate the function hazard2Dmiss
to create the estimated hazards for the arguments hi1.zt
and hi2.zt
.
M.L. Gámiz, E. Mammen, M.D. Martínez-Miranda and J.P. Nielsen.
Gámiz, M.L., Mammen, E., Martínez-Miranda, M.D. and Nielsen, J.P. (2024b). Monitoring a developing pandemic with available data. arXiv:2308.09919.
data('covid') Ei.z<-covid$Hospi # exposure for survival analysis Oi1.z<-covid$Death # deaths Oi2.z<-covid$Recov # recoveries # compute incremental values Oi1.z<-diff(Oi1.z) Oi2.z<-diff(Oi2.z) Ei.z<-Ei.z[-1] # exposure is left as cumulative M<-length(Ei.z) t.grid<-z.grid<-1:M # notification date (marker) ddates<-covid$Date ## First compute the estimated hazard bs<-t(c(150,150)) res.h<-hazard2Dmiss(t.grid,z.grid,Oi1.z,Oi2.z,Ei.z,bs.grid=bs,cv=FALSE) hi1.zt<-res.h$hi1.zt # 2D-hazard of deaths hi2.zt<-res.h$hi2.zt # 2D-hazard of recoveries ## Now the probabilities at few values of the marker (admission dates) z1<-c(13,44,105,197) zdates<-ddates[z1] nz<-length(z1) t.min<-35 # maximum duration is 35 ti<-1:t.min;n0<-length(ti) res<-poutcome(hi1.zt,hi2.zt,z1) alive.zt<-res$alive.zt death.zt<-res$death.zt # Cause= recovery plot(ti,alive.zt[1:n0,1],ylim=c(0,1),lwd=2,type='l', main='Probability to get out alive', ylab='',xlab='Time from admission (days)') for(i in 2:nz) lines(ti,alive.zt[1:n0,i-1],lwd=3,col=i,lty=i) legend('bottom',legend=zdates,lty=c(2:nz,1), lwd=c(rep(3,nz-1),2),col=c(2:nz,1),bty='n') # Cause= death plot(ti,death.zt[1:n0,1],ylim=c(0,1),lwd=2,type='l', main='Probability of dying in the hospital', ylab='',xlab='Time from admission (days)') for(i in 2:nz) lines(ti,death.zt[1:n0,i-1],lwd=3,col=i,lty=i) legend('top',legend=zdates,lty=c(2:nz,1),lwd=c(rep(3,nz-1),2), col=c(2:nz,1),bty='n')
data('covid') Ei.z<-covid$Hospi # exposure for survival analysis Oi1.z<-covid$Death # deaths Oi2.z<-covid$Recov # recoveries # compute incremental values Oi1.z<-diff(Oi1.z) Oi2.z<-diff(Oi2.z) Ei.z<-Ei.z[-1] # exposure is left as cumulative M<-length(Ei.z) t.grid<-z.grid<-1:M # notification date (marker) ddates<-covid$Date ## First compute the estimated hazard bs<-t(c(150,150)) res.h<-hazard2Dmiss(t.grid,z.grid,Oi1.z,Oi2.z,Ei.z,bs.grid=bs,cv=FALSE) hi1.zt<-res.h$hi1.zt # 2D-hazard of deaths hi2.zt<-res.h$hi2.zt # 2D-hazard of recoveries ## Now the probabilities at few values of the marker (admission dates) z1<-c(13,44,105,197) zdates<-ddates[z1] nz<-length(z1) t.min<-35 # maximum duration is 35 ti<-1:t.min;n0<-length(ti) res<-poutcome(hi1.zt,hi2.zt,z1) alive.zt<-res$alive.zt death.zt<-res$death.zt # Cause= recovery plot(ti,alive.zt[1:n0,1],ylim=c(0,1),lwd=2,type='l', main='Probability to get out alive', ylab='',xlab='Time from admission (days)') for(i in 2:nz) lines(ti,alive.zt[1:n0,i-1],lwd=3,col=i,lty=i) legend('bottom',legend=zdates,lty=c(2:nz,1), lwd=c(rep(3,nz-1),2),col=c(2:nz,1),bty='n') # Cause= death plot(ti,death.zt[1:n0,1],ylim=c(0,1),lwd=2,type='l', main='Probability of dying in the hospital', ylab='',xlab='Time from admission (days)') for(i in 2:nz) lines(ti,death.zt[1:n0,i-1],lwd=3,col=i,lty=i) legend('top',legend=zdates,lty=c(2:nz,1),lwd=c(rep(3,nz-1),2), col=c(2:nz,1),bty='n')
Local linear estimator of the infection or the hospitalization rate in the case of missing-survival-link data (Gámiz et al. 2024a). The rate is assumed to have two dimensions: a one-dimensional marker (typically the notification date) and time (duration). It is assumed the situation of observing aggregated data in the form of occurrences and exposures. The missing-survival link problem means that the duration is not directly observed. The estimator follows from an iterative algorithm where, at each step, full information including duration is estimated first and then, the local linear estimator is computed evaluating the function hazard2D
.
rate2Dmiss(t.grid, z.grid, Oi.z, Ei.z1, bs.grid, cv=TRUE, epsilon=1e-4, max.ite=50)
rate2Dmiss(t.grid, z.grid, Oi.z, Ei.z1, bs.grid, cv=TRUE, epsilon=1e-4, max.ite=50)
t.grid |
a vector of |
z.grid |
a vector of |
Oi.z |
a vector of length |
Ei.z1 |
a vector of length |
bs.grid |
a matrix with a grid of 2-dimensional bandwidths (by rows). |
cv |
logical, if |
epsilon |
a numeric value with the tolerance in the iterative algorithm. Default value is |
max.ite |
an integer value with the maximum number of iterations in the iterative algorithm. Default value is |
Hazard is assumed having two dimensions: a one-dimensional marker (typically the notification date) and time (duration). It is assumed the situation of observing aggregated data in the form of occurrences and exposure, as the function hazard2D
does. The difference is that the time dimension (duration) is not directly observed. The estimator follows from an iterative algorithm where, at each step, full information, estimating the duration, is constructed first and then the local linear estimator is computed evaluating the function hazard2D
. See more details on the local linear hazard estimator in the documentation of hazard2D
.
To estimate the infection rate we assume that each day (z), we have information on the number of people tested positive. The vector of occurrences (Oi.z
) is the observed number of people tested positive each day, removing the first d
days. Here d
is typically 1 (an infected person might infect one day after being tested positive), see the examples below. The vector of exposure (Ei.z
) is the observed number of people tested positive each day, removing the last d
days.
To estimate the hospitalization rate we assume that each day (z), we have information on the number of new hospitalizations (Oi.z
), and the number of people tested positive (Ei.z
) each day.
hi.zt |
a ( |
bcv |
a two-dimensional vector with the bandwidth used to compute the estimator (estimated by cross-validation if |
tol |
a numeric value with the achieved tolerance value in the algorithm. |
it |
an integer with the number of iterations performed in the algorithm. |
M.L. Gámiz, E. Mammen, M.D. Martínez-Miranda and J.P. Nielsen.
Gámiz, M.L., Mammen, E., Martínez-Miranda, M.D. and Nielsen, J.P. (2024a). Low quality exposure and point processes with a view to the first phase of a pandemic. arXiv:2308.09918.
Gámiz, M.L., Mammen, E., Martínez-Miranda, M.D. and Nielsen, J.P. (2024b). Monitoring a developing pandemic with available data. arXiv:2308.09919.
## Analysis of the infection and hospitalization processes ## data are (cumulative) number of hospitalizations and positive tested data('covid') ## We remove the first 56 rows (no data on testing until 13th May) covid2<-covid[-c(1:56),] M2<-nrow(covid2) ## 1. Rate of infection Ei.new<-covid2$Posit delay<-1;M2<-M2-delay Oi.z<-Ei.new[-(1:delay)]; Ei.z1<-Ei.new[1:M2] t.grid<-z.grid<-1:M2 bs<-t(c(5,10)) RInf<-rate2Dmiss(t.grid,z.grid,Oi.z,Ei.z1,bs.grid=bs,cv=FALSE, epsilon=1e-4,max.ite=50) hi.zt<-RInf$hi.zt # the estimated infection rate ## Plot the estimated infection rate at few notification days ddates<-covid2$Date z1<-c(19,49,80,141) nz<-length(z1) zdates<-ddates[z1] t.min<-min(M2-z1+1) ti<-1:t.min;n0<-length(ti) ## displaced curves alphas.I<-matrix(NA,M2,M2) # upper-triangular matrix for(j in 1:M2) alphas.I[j,j:M2]<-hi.zt[j,1:(M2-j+1)] alphas<-alphas.I[z1,-(1:18)] M2a<-ncol(alphas) yy<-c(0,max(alphas,na.rm=TRUE)+0.1) plot(1:M2a,alphas[nz,],type='l',ylab='',xlab='Date of notification', main= 'Dynamic rate of infection',ylim=yy,lwd=2,lty=1,xaxt='n') axis(1,at=z1-18,labels=c(zdates)) for (i in 2:nz) lines(1:M2a,alphas[i-1,],lwd=3,col=i,lty=i) legend('topleft',c('Starting on date:',as.character(zdates)), lty=c(NA,2:nz,1),lwd=c(NA,rep(3,nz-1),2),col=c(NA,2:nz,1),bty='n') ## 2. Rate of hospitalization Hi<-covid2$Hospi ; Hi<-Hi[-1] Ri<-covid2$Recov ; Ri<-diff(Ri) Di<-covid2$Death ; Di<-diff(Di) M2<-length(Di) ## New hospitalizations are Hi-Ri-Di newHi<-Hi[-1]-(Hi[-M2]-Ri[-M2]-Di[-M2]) newHi<-c(Hi[1],newHi) newHi[newHi<0]<-0; # possible inconsistency in the data Oi.z<-as.integer(newHi) Ei.z1<-covid2$Posit t.grid<-z.grid<-1:M2 bs<-t(c(20,10)) RHosp<-rate2Dmiss(t.grid,z.grid,Oi.z,Ei.z1,bs.grid=bs,cv=FALSE, epsilon=1e-4,max.ite=50) hi.zt<-RHosp$hi.zt # the estimated rate ## Plot the estimated rate at few notification days z1<-c(19,49,80,141) nz<-length(z1) zdates<-ddates[z1] t.min<-min(M2-z1+1) ti<-1:t.min;n0<-length(ti) ## displaced curves alphas.I<-matrix(NA,M2,M2) # upper-triangular matrix for(j in 1:M2) alphas.I[j,j:M2]<-hi.zt[j,1:(M2-j+1)] alphas<-alphas.I[z1,-(1:18)] M2a<-ncol(alphas) yy<-c(0,max(alphas,na.rm=TRUE)+0.01) plot(1:M2a,alphas[nz,],type='l',ylab='',xlab='Date of notification', main= 'Dynamic rate of hospitalization',ylim=yy,lwd=2,xaxt='n') axis(1,at=z1-18,labels=c(zdates)) for (i in 2:nz) lines(1:M2a,alphas[i-1,],lwd=3,col=i,lty=i) legend('topright',c('Starting on date:',as.character(zdates)), lty=c(NA,2:nz,1),lwd=c(NA,rep(3,nz-1),2),col=c(NA,2:nz,1),bty='n')
## Analysis of the infection and hospitalization processes ## data are (cumulative) number of hospitalizations and positive tested data('covid') ## We remove the first 56 rows (no data on testing until 13th May) covid2<-covid[-c(1:56),] M2<-nrow(covid2) ## 1. Rate of infection Ei.new<-covid2$Posit delay<-1;M2<-M2-delay Oi.z<-Ei.new[-(1:delay)]; Ei.z1<-Ei.new[1:M2] t.grid<-z.grid<-1:M2 bs<-t(c(5,10)) RInf<-rate2Dmiss(t.grid,z.grid,Oi.z,Ei.z1,bs.grid=bs,cv=FALSE, epsilon=1e-4,max.ite=50) hi.zt<-RInf$hi.zt # the estimated infection rate ## Plot the estimated infection rate at few notification days ddates<-covid2$Date z1<-c(19,49,80,141) nz<-length(z1) zdates<-ddates[z1] t.min<-min(M2-z1+1) ti<-1:t.min;n0<-length(ti) ## displaced curves alphas.I<-matrix(NA,M2,M2) # upper-triangular matrix for(j in 1:M2) alphas.I[j,j:M2]<-hi.zt[j,1:(M2-j+1)] alphas<-alphas.I[z1,-(1:18)] M2a<-ncol(alphas) yy<-c(0,max(alphas,na.rm=TRUE)+0.1) plot(1:M2a,alphas[nz,],type='l',ylab='',xlab='Date of notification', main= 'Dynamic rate of infection',ylim=yy,lwd=2,lty=1,xaxt='n') axis(1,at=z1-18,labels=c(zdates)) for (i in 2:nz) lines(1:M2a,alphas[i-1,],lwd=3,col=i,lty=i) legend('topleft',c('Starting on date:',as.character(zdates)), lty=c(NA,2:nz,1),lwd=c(NA,rep(3,nz-1),2),col=c(NA,2:nz,1),bty='n') ## 2. Rate of hospitalization Hi<-covid2$Hospi ; Hi<-Hi[-1] Ri<-covid2$Recov ; Ri<-diff(Ri) Di<-covid2$Death ; Di<-diff(Di) M2<-length(Di) ## New hospitalizations are Hi-Ri-Di newHi<-Hi[-1]-(Hi[-M2]-Ri[-M2]-Di[-M2]) newHi<-c(Hi[1],newHi) newHi[newHi<0]<-0; # possible inconsistency in the data Oi.z<-as.integer(newHi) Ei.z1<-covid2$Posit t.grid<-z.grid<-1:M2 bs<-t(c(20,10)) RHosp<-rate2Dmiss(t.grid,z.grid,Oi.z,Ei.z1,bs.grid=bs,cv=FALSE, epsilon=1e-4,max.ite=50) hi.zt<-RHosp$hi.zt # the estimated rate ## Plot the estimated rate at few notification days z1<-c(19,49,80,141) nz<-length(z1) zdates<-ddates[z1] t.min<-min(M2-z1+1) ti<-1:t.min;n0<-length(ti) ## displaced curves alphas.I<-matrix(NA,M2,M2) # upper-triangular matrix for(j in 1:M2) alphas.I[j,j:M2]<-hi.zt[j,1:(M2-j+1)] alphas<-alphas.I[z1,-(1:18)] M2a<-ncol(alphas) yy<-c(0,max(alphas,na.rm=TRUE)+0.01) plot(1:M2a,alphas[nz,],type='l',ylab='',xlab='Date of notification', main= 'Dynamic rate of hospitalization',ylim=yy,lwd=2,xaxt='n') axis(1,at=z1-18,labels=c(zdates)) for (i in 2:nz) lines(1:M2a,alphas[i-1,],lwd=3,col=i,lty=i) legend('topright',c('Starting on date:',as.character(zdates)), lty=c(NA,2:nz,1),lwd=c(NA,rep(3,nz-1),2),col=c(NA,2:nz,1),bty='n')