--- title: "Poisson Two-Sided Confidence Intervals" author: "Michael Fay" date: "June 21, 2021" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Poisson Two-Sided Confidence Intervals} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo=FALSE, collapse = TRUE, comment = "#>" ) ``` ```{r} library(exactci) ``` # 1. Calculation of Two-sided Poisson Confidence Intervals Klaschka and Reiczigel (2020) gave better algorithms for calculating the Blaker and minlike (i.e., Sterne) two-sided confidence intervals. They focus on the discontinuity points. Borrowing from those some of the main ideas in that paper, but using different notation, we create an algorithm to calculate the $100(1-\alpha)$% two-sided Poisson confidence intervals. Checks of this algorithm show that it gets nearly identical results as Klaschka and Reiczigel (2020) in all situations that were checked. Let $X \sim Poisson(\theta)$. Let a general two-sided p-value for testing $H_0: \theta=\theta_0$ versus $H_1: \theta \neq \theta_0$ at $X=x$ be $p_T(x,\theta_0)$. Let the associated two-sided $100(1-\alpha)$% confidence interval be $$\left( L_T(x, 1-\alpha), U_T(x, 1-\alpha) \right),$$ where $$L_T(x, 1-\alpha) = \sup \left\{ \theta: \mbox{ for all } \theta_0 \leq \theta \mbox{ then } p_T(x, \theta_0) \leq \alpha \right\}$$ and $$U_T(x, 1-\alpha) = \inf \left\{ \theta: \mbox{ for all } \theta_0 \geq \theta, \mbox{ then } p_T(x, \theta_0) \leq \alpha \right\}$$. # 2. Sterne (i.e., minlike) Method ## 2.1 Overview of Algorithm Let $f(i; \theta)=Pr[ X=i | \theta]$ be the Poisson probability mass function at $X=i$. Then the 'minlike' p-value for testing $H_0: \theta=\theta_0$ is $$p_m(x, \theta_0) = \sum_{i: f(i;\theta_0) \leq f(x;\theta_0)} f(i;\theta_0).$$ We can write $p_m$ in a format that is conducive for calculating the confidence interval. First, define $$\tilde{\theta}_{a,b}$$ as the geometric mean of $a+1, a+2,\ldots, b$, where $af(x+j+1;\theta_0)$ and no other value is added to the $\bar{F}(x+j;\theta_0)$ part of the p-value until $\theta_0=\tilde{x,x+j}$. Using derivatives, we can show that during each piecewise continuous part of the p-value when $\theta_0>x$, the p-value function is increasing in $\theta_0$. Consider the piecewise continuous part with $p_m= F(a;\theta_0) + \bar{F}(b;\theta_0)$ with $af(x;\theta_0)$ so that the derivative is always positive, and the p-value is increasing. So the upper confidence limit of the $100(1-\alpha)$% minlike CI is the largest $j$ such that $p_m(x,\tilde{\theta}_{x,x+j+1})> \alpha$. For the lower confidence limit we conjecture that the piecewise continuous p-value function is monotonic within each continuous piece. ## 2.2 Compare Output to Function of Reiczigel We compare our results to those of from the function of Reiczigel as referenced in Kaschka and Reiczigel (2020). The function was accessed in May 2021 at the URL referenced in Kaschka and Reiczigel (2020); however, the website was under re-construction in June 2021. I suggest going to https://www2.univet.hu/ and following links to J. Reiczigel's homepage to find the function (or look at the Rmd file that created this vignette). ```{r} # We compare our results to those from the function by Reiczigel # referenced in Kaschka and Reiczigel (2020) # here is that function copied without changes.... ################################################################### # beginning of function by Reiczigel #################################################################### poisson.Sterne.CI = function(x,conflevel=.95,epsilon=1e-8){ # Sterne's CI for Poisson data # epsilon is used in interval halving and computing left-right limits # reiczigel.jeno@univet.hu (07.07.2019) alpha=1-conflevel lam = function(k,m) exp((lfactorial(m)-lfactorial(k))/(m-k)) # first going downwards (except if x==0) if(x==0) LCL=0 else { i=x-1 repeat{ if(i<0) L=0 else L=lam(i,x) #pvr=sterne.poi.test.orig(x,L+epsilon) ############################################ lambda0=L+epsilon threshold=dpois(x,lambda0); sum.gt=0 ii=x-1 repeat{ probab=dpois(ii,lambda0) if(probab > threshold) { sum.gt=sum.gt+probab ii=ii-1 } else break } ii=x+1 repeat{ probab=dpois(ii,lambda0) if(probab > threshold) { sum.gt=sum.gt+probab ii=ii+1 } else break } pvr=1-sum.gt ############################################ #cat(i,L,pvr,"\n") if(pvr>alpha){ i=i-1 L.previous=L pvl.previous=pvr-dpois(x,L+epsilon) } else break } #cat(" ",L.previous,pvl.previous,"\n") if(pvl.previousalpha){ right.end=midpoint } else { left.end=midpoint } if(right.end-left.end < epsilon) break } LCL=left.end } } #cat("------\n") # now going upwards i=x+1 repeat{ L=lam(x,i) #pvl=sterne.poi.test.orig(x,L-epsilon) ############################################ lambda0=L-epsilon threshold=dpois(x,lambda0); sum.gt=0 ii=x-1 repeat{ probab=dpois(ii,lambda0) if(probab > threshold) { sum.gt=sum.gt+probab ii=ii-1 } else break } ii=x+1 repeat{ probab=dpois(ii,lambda0) if(probab > threshold) { sum.gt=sum.gt+probab ii=ii+1 } else break } pvl=1-sum.gt ############################################ #cat(i,L,pvl,"\n") if(pvl>alpha){ i=i+1 L.previous=L pvr.previous=pvl-dpois(x,L-epsilon) } else break } #cat(" ",L.previous,pvr.previous,"\n") # my conjecture is that this part can be replaced simply by # UCL = lam(x,i-1) if(pvr.previous threshold) { sum.gt=sum.gt+probab ii=ii-1 } else break } ii=x+1 repeat{ probab=dpois(ii,lambda0) if(probab > threshold) { sum.gt=sum.gt+probab ii=ii+1 } else break } sterne.poi.orig=1-sum.gt ############################################ #if(sterne.poi.test.orig(x,midpoint)>alpha){ if(sterne.poi.orig>alpha){ left.end=midpoint } else { right.end=midpoint } if(right.end-left.end < epsilon) break } UCL=right.end } return(c(LCL,UCL)) } ################################################################### # end of function by Reiczigel #################################################################### X<-c(1,0,8,8,8) CL<-c(.70,.95,.99,.9,.01) checkData<- matrix(NA,2*length(X),4,dimnames=list(rep(c("Reiczigel","Fay"),length(X)), c("x","conf level","lower CL","upper CL"))) checkData[,"x"]<- rep(X,each=2) checkData[,"conf level"]<-rep(CL,each=2) for (i in 1:length(X)){ cir<-poisson.Sterne.CI(X[i], conflevel=CL[i]) checkData[2*i-1,3:4]<- cir cif<-exactpoissonCI(X[i],tsmethod="minlike",conf.level=CL[i]) checkData[2*i,3:4]<-cif } library(knitr) kable(checkData,caption="Compare Results") ``` # 3. Blaker's Method ## 3.1 Motivation for Blaker's Method Blaker's p-values are (see e.g., Klaschka and Reiczigel, 2020, Section 3.1) $$p_B(x;\theta_0) = \left\{ \begin{array}{ll} F(x; \theta_0) + \bar{F}(x_R; \theta_0) & \mbox{ if $F(x;\theta_0) < \bar{F}(x;\theta_0)$} \\ 1 & \mbox{ if $F(x;\theta_0) = \bar{F}(x;\theta_0)$} \\ F(x_L; \theta_0) + \bar{F}(x; \theta_0) & \mbox{ if $F(x;\theta_0) > \bar{F}(x;\theta_0)$} \\ \end{array} \right. ,$$ where $x_R = min \left\{ i: \bar{F}(i;\theta_0) \leq F(x; \theta_0) \right\}$ and $x_L = max \left\{ i: {F}(x;\theta_0) \leq \bar{F}(i; \theta_0) \right\}.$ This p-values are also piecewise continuous, but the jumps are at different places. Here is a comparison of the three ways of calculating the 95% two-sided p-values for $x=8$: the central method, the minlike method, and Blaker's method. ```{r, fig.width=8, fig.height=6} library(exactci) exactpoissonPlot(8,tsmethod="central", dolog=FALSE, dolines = FALSE, dopoints=TRUE,pch=16,cex=1, col="blue") exactpoissonPlot(8,tsmethod="blaker", dolines = FALSE, dopoints=TRUE, newplot=FALSE, col=gray(0.2), pch=16,cex=1.5) exactpoissonPlot(8,tsmethod="minlike", dolines = FALSE, dopoints=TRUE, newplot=FALSE, col=gray(0.8), pch=16, cex=.75) legend("topright",legend=c("central","minlike","blaker"),pch=c(16,16,16),col=c("blue",gray(.8),gray(.2))) ``` An important computational point is that Blaker's p-values are not monotonic within each piecewise constant interval. To see this, we focus in on a two pieces of the previous graph. ```{r, fig.width=8, fig.height=6} par(mfrow=c(1,2)) exactpoissonPlot(8,tsmethod="central", dolines = FALSE, dopoints=TRUE,pch=16,cex=1, col="blue",xlim=c(3.5,5.5),ylim=c(.05,.11)) exactpoissonPlot(8,tsmethod="blaker", dolines = FALSE, dopoints=TRUE, newplot=FALSE, col=gray(0.2), pch=16,cex=1.5) exactpoissonPlot(8,tsmethod="minlike", dolines = FALSE, dopoints=TRUE, newplot=FALSE, col=gray(0.8), pch=16, cex=.75) legend("topright",legend=c("central","minlike","blaker"),pch=c(16,16,16),col=c("blue",gray(.8),gray(.2))) exactpoissonPlot(8,tsmethod="central", dolines = FALSE, dopoints=TRUE,pch=16,cex=1, conf.level=1-0.0558, col="blue",xlim=c(15.4,15.6),ylim=c(.055,.058)) exactpoissonPlot(8,tsmethod="blaker", dolines = FALSE, dopoints=TRUE, newplot=FALSE, col=gray(0.2), pch=16,cex=1.5) exactpoissonPlot(8,tsmethod="minlike", dolines = FALSE, dopoints=TRUE, newplot=FALSE, col=gray(0.8), pch=16, cex=.75) legend("topleft",legend=c("central","minlike","blaker"),pch=c(16,16,16),col=c("blue",gray(.8),gray(.2))) par(mfrow=c(1,1)) ``` We conjecture that on the lower side (p-values with $\theta_0< x$) the piecewise continuous function is monotonicly increasing within each interval, while on the upper side (p-values with $\theta_0>x$) the piecewise continuous function decreases to a inflection point, then increases (see the graph). We can solve for the inflection point by taking the derivative of the p-value function. For $a