Confidence Interval for a Difference of Binomials

Several confidence intervals for the difference between proportions are available, but they can produce markedly different results. Traditional approaches, such as the Wald interval do not perform well unless the sample size is large. Better intervals are available. These include the Agresti/Caffo method (2000), Newcombe Score method (1998) and more computing intensive ones as by Miettinen and Nurminen (1985) or Mee (1984). The latter ones are favoured by Newcombe (when forced to choose between a rock and a hard place).

BinomDiffCI(x1, n1, x2, n2, conf.level = 0.95, sides = c("two.sided","left","right"),
            method = c("ac", "wald", "waldcc", "score", "scorecc", "mn",
                       "mee", "blj", "ha", "hal", "jp"))

Arguments

x1: number of successes for the first group.
n1: number of trials for the first group.
x2: number of successes for the second group.
n2: number of trials for the second group.
conf.level: confidence level, defaults to 0.95.
sides: a character string specifying the side of the confidence interval, must be one of "two.sided" (default), "left" or "right". You can specify just the initial letter. "left" would be analogue to a hypothesis of "greater" in a t.test.
method: one of "wald", "waldcc", "ac", "score", "scorecc", "mn", "mee", "blj", "ha", "hal", "jp".

Details

All arguments are being recycled.

We estimate the difference between proportions using the sample proportions: $$\hat{\delta} =\hat{p}_1 - \hat{p}_2 = \frac{x_1}{n_1} - \frac{x_2}{n_2}$$

The traditional Wald confidence interval for the difference of two proportions $\delta$ is based on the asymptotic normal distribution of $\hat{\delta}$.

The Corrected Wald interval uses a continuity correction included in the test statistic. The continuity correction is subtracted from the numerator of the test statistic if the numerator is greater than zero; otherwise, the continuity correction is added to the numerator. The value of the continuity correction is (1/n1 + 1/n2)/2.

The Agresti-Caffo (code "ac") is equal to the Wald interval with the adjustment according to Agresti, Caffo (2000) for difference in proportions and independent samples. It adds 1 to x1 and x2 and adds 2 to n1 and n2 and performs surpringly well.

Newcombe (code "scorecc") proposed a confidence interval for the difference based on the Wilson score confidence interval for a single proportion. A variant uses a continuity correction for the Wilson interval (code "scorecc").

Miettinen and Nurminen showed that the restricted maximum likelihood estimates for p1 and p2 can be obtained by solving a cubic equation and gave unique closed-form expressions for them. The Miettinen-Nurminen confidence interval is returned with code "mn".

The Mee (code "mee") interval proposed by Mee (1984) and Farrington-Manning (1990) is using the same maximum likelihood estimators as Miettinen-Nurminen but with another correcting factor.

The Brown, Li's Jeffreys (code "blj") interval was proposed by Brown, Li's Jeffreys (2005).

The Hauck-Anderson (code "ha") interval was proposed by Hauck-Anderson (1986).

The Haldane (code "hal") interval is described in Newcombe (1998) and so is the Jeffreys-Perks (code "jp").

Some approaches for the confidence intervals can potentially yield negative results or values beyond [-1, 1]. These would be reset such as not to exceed the range of [-1, 1].

Which of the methods to use is currently still the subject of lively discussion and has not yet been conclusively clarified. See e.g. Fagerland (2011).

The general consensus is that the most widely taught method method="wald" is inappropriate in many situations and should not be used. Recommendations seem to converge around the Miettinen-Nurminen based methods (method="mn").

Value

A matrix with 3 columns containing the estimate, the lower and the upper confidence intervall.

References

Agresti, A, Caffo, B (2000) Simple and effective confidence intervals for proportions and difference of proportions result from adding two successes and two failures. The American Statistician 54 (4), 280-288.

Beal, S L (1987) Asymptotic Confidence Intervals for the Difference Between Two Binomial Parameters for Use with Small Samples; Biometrics, 43, 941-950.

Brown L, Li X (2005) Confidence intervals for two sample binomial distribution, Journal of Statistical Planning and Inference, 130(1), 359-375.

Hauck WW, Anderson S. (1986) A comparison of large-sample confidence interval methods for the difference of two binomial probabilities The American Statistician 40(4): 318-322.

Farrington, C. P. and Manning, G. (1990) Test Statistics and Sample Size Formulae for Comparative Binomial Trials with Null Hypothesis of Non-zero Risk Difference or Non-unity Relative Risk Statistics in Medicine, 9, 1447-1454.

Mee RW (1984) Confidence bounds for the difference between two probabilities, Biometrics 40:1175-1176 .

Miettinen OS, Nurminen M. (1985) Comparative analysis of two rates. Statistics in Medicine 4, 213-226.

Newcombe, R G (1998). Interval Estimation for the Difference Between Independent Proportions: Comparison of Eleven Methods. Statistics in Medicine, 17, 873–890.

Fagerland M W, Lydersen S and Laake P (2011) Recommended confidence intervals for two independent binomial proportions, Statistical Methods in Medical Research 0(0) 1-31

Author

Andri Signorell <andri@signorell.net>

Examples

x1 <- 56; n1 <- 70; x2 <- 48; n2 <- 80
xci <- BinomDiffCI(x1, n1, x2, n2, method=c("wald", "waldcc", "ac", "score",
            "scorecc", "mn", "mee", "blj", "ha"))
Format(xci[,-1], digits=4)
#>         lwr.ci upr.ci
#> wald    0.0575 0.3425
#> waldcc  0.0441 0.3559
#> ac      0.0525 0.3358
#> score   0.0524 0.3339
#> scorecc 0.0428 0.3422
#> mn      0.0528 0.3382
#> mee     0.0534 0.3377
#> blj     0.0540 0.3400
#> ha      0.0494 0.3506

x1 <- 9; n1 <- 10; x2 <- 3; n2 <- 10
yci <- BinomDiffCI(x1, n1, x2, n2, method=c("wald", "waldcc", "ac", "score",
            "scorecc", "mn", "mee", "blj", "ha"))
Format(yci[, -1], digits=4)
#>         lwr.ci upr.ci
#> wald    0.2605 0.9395
#> waldcc  0.1605 1.0000
#> ac      0.1600 0.8400
#> score   0.1705 0.8090
#> scorecc 0.1013 0.8387
#> mn      0.1700 0.8406
#> mee     0.1821 0.8370
#> blj     0.1869 0.9040
#> ha      0.1922 1.0000

# https://www.lexjansen.com/wuss/2016/127_Final_Paper_PDF.pdf, page 9
SetNames(round(
  BinomDiffCI(56, 70, 48, 80, 
              method=c("wald", "waldcc", "hal", 
                       "jp", "mee",
                       "mn", "score", "scorecc", 
                       "ha", "ac", "blj"))[,-1], 4),
  rownames=c("1. Wald, no CC", "2. Wald, CC", "3. Haldane", "4. Jeffreys-Perks",
             "5. Mee", "6. Miettinen-Nurminen", "10. Score, no CC", "11. Score, CC",
             "12. Hauck-Andersen", "13. Agresti-Caffo", "16. Brown-Li"))
#>                       lwr.ci upr.ci
#> 1. Wald, no CC        0.0575 0.3425
#> 2. Wald, CC           0.0441 0.3559
#> 3. Haldane            0.0535 0.3351
#> 4. Jeffreys-Perks     0.0531 0.3355
#> 5. Mee                0.0534 0.3377
#> 6. Miettinen-Nurminen 0.0528 0.3382
#> 10. Score, no CC      0.0524 0.3339
#> 11. Score, CC         0.0428 0.3422
#> 12. Hauck-Andersen    0.0494 0.3506
#> 13. Agresti-Caffo     0.0525 0.3358
#> 16. Brown-Li          0.0540 0.3400