`BinomDiffCI.Rd`

Several confidence intervals for the difference between two proportions are available, but they can produce markedly different results. Traditional approaches, such as the Wald interval, do not perform well unless the sample size is large. Better intervals are available, including the Agresti-Caffo method (2000), the Newcombe score method (1998) and more computationally intensive ones such as those by Miettinen and Nurminen (1985) or Mee (1984). The latter are favoured by Newcombe (when forced to choose between a rock and a hard place).

- x1
number of successes for the first group.

- n1
number of trials for the first group.

- x2
number of successes for the second group.

- n2
number of trials for the second group.

- conf.level
confidence level, defaults to 0.95.

- sides
a character string specifying the side of the confidence interval, must be one of `"two.sided"` (default), `"left"` or `"right"`. You can specify just the initial letter. `"left"` would be analogous to a hypothesis of `"greater"` in a `t.test`.

- method
one of `"wald"`, `"waldcc"`, `"ac"`, `"score"`, `"scorecc"`, `"mn"`, `"mee"`, `"blj"`, `"ha"`, `"hal"`, `"jp"`.

All arguments are recycled.

We estimate the difference between proportions using the sample proportions: $$\hat{\delta} =\hat{p}_1 - \hat{p}_2 = \frac{x_1}{n_1} - \frac{x_2}{n_2}$$

The traditional **Wald** confidence interval for the difference of two proportions \(\delta\) is based on the asymptotic normal distribution of \(\hat{\delta}\).
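Written out, the Wald interval takes the familiar textbook form below. This is stated from the standard definition, not taken from the package source, with \(z_{1-\alpha/2}\) denoting the standard normal quantile for confidence level \(1-\alpha\):

$$\hat{\delta} \pm z_{1-\alpha/2} \sqrt{\frac{\hat{p}_1 (1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1-\hat{p}_2)}{n_2}}$$

For 56/70 versus 48/80 at the 95% level this gives \(0.2 \pm 1.96 \cdot 0.0727\), that is roughly (0.0575, 0.3425), which agrees with the `wald` row in the example output on this page.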

The **Corrected Wald** interval uses a continuity correction included in the test statistic. The continuity correction is subtracted from the numerator of the test statistic if the numerator is greater than zero; otherwise, the continuity correction is added to the numerator. The value of the continuity correction is (1/n1 + 1/n2)/2.
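The two Wald variants can be reproduced by hand in base R. The following is only a sketch: it assumes that the continuity correction simply widens the interval by (1/n1 + 1/n2)/2 on each side, which matches the `wald` and `waldcc` rows of the example output on this page but is not taken from the package source.

```r
# Hand computation of the Wald and continuity-corrected Wald intervals
# (base R only; data from the first example on this page).
x1 <- 56; n1 <- 70; x2 <- 48; n2 <- 80
conf.level <- 0.95

p1  <- x1 / n1
p2  <- x2 / n2
est <- p1 - p2                               # estimated difference
z   <- qnorm(1 - (1 - conf.level) / 2)       # two-sided normal quantile
se  <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

wald <- c(lwr.ci = est - z * se, upr.ci = est + z * se)

# Continuity correction as given in the text: (1/n1 + 1/n2) / 2.
# Assumption: it is added to the half-width on both sides.
cc     <- (1 / n1 + 1 / n2) / 2
waldcc <- c(lwr.ci = est - (z * se + cc), upr.ci = est + (z * se + cc))

round(rbind(wald, waldcc), 4)
```

Rounded to four digits, the limits are 0.0575 and 0.3425 for the plain interval and 0.0441 and 0.3559 for the corrected one.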

The **Agresti-Caffo** interval (code `"ac"`) is the Wald interval with the adjustment of Agresti and Caffo (2000) for the difference of proportions from independent samples. It adds 1 to x1 and x2 and 2 to n1 and n2, and performs surprisingly well.
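The adjustment is easy to apply by hand. The base R sketch below assumes that both the point estimate and the standard error are computed from the adjusted counts. That reading reproduces the `ac` row of the example output on this page, but it is an illustration, not the package's code.

```r
# Agresti-Caffo by hand: add one success and one failure to each group,
# then apply the ordinary Wald formula to the adjusted counts.
x1 <- 56; n1 <- 70; x2 <- 48; n2 <- 80
z  <- qnorm(0.975)                           # 95% two-sided quantile

p1a <- (x1 + 1) / (n1 + 2)                   # adjusted proportion, group 1
p2a <- (x2 + 1) / (n2 + 2)                   # adjusted proportion, group 2
est <- p1a - p2a
se  <- sqrt(p1a * (1 - p1a) / (n1 + 2) + p2a * (1 - p2a) / (n2 + 2))

ac <- c(lwr.ci = est - z * se, upr.ci = est + z * se)
round(ac, 4)
```

Rounded to four digits, the limits are 0.0525 and 0.3358.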

**Newcombe** (code `"score"`) proposed a confidence interval for the difference based on the Wilson score confidence interval for a single proportion. A variant uses a continuity correction for the Wilson interval (code `"scorecc"`).
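To illustrate how the single-proportion Wilson limits feed into the interval for the difference, here is a base R sketch of the uncorrected variant. It uses the usual square-and-add combination attributed to Newcombe (1998). The helper `wilson()` is a hypothetical name introduced for this example, and whether the package uses exactly these steps is an assumption, although the result matches the `score` row of the example output on this page.

```r
# Wilson score limits for one proportion (no continuity correction).
# wilson() is a helper defined for this sketch only.
wilson <- function(x, n, z) {
  p      <- x / n
  centre <- (p + z^2 / (2 * n)) / (1 + z^2 / n)
  half   <- z * sqrt(p * (1 - p) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
  c(lwr = centre - half, upr = centre + half)
}

x1 <- 56; n1 <- 70; x2 <- 48; n2 <- 80
z  <- qnorm(0.975)                           # 95% two-sided quantile
p1 <- x1 / n1
p2 <- x2 / n2
w1 <- wilson(x1, n1, z)
w2 <- wilson(x2, n2, z)

# Combine the distances from each estimate to its Wilson limits.
est   <- p1 - p2
score <- c(
  lwr.ci = est - sqrt((p1 - w1[["lwr"]])^2 + (w2[["upr"]] - p2)^2),
  upr.ci = est + sqrt((w1[["upr"]] - p1)^2 + (p2 - w2[["lwr"]])^2)
)
round(score, 4)
```

Rounded to four digits, the limits are 0.0524 and 0.3339. Unlike the Wald interval, this interval is not symmetric about the point estimate.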

**Miettinen and Nurminen** showed that the restricted maximum likelihood estimates for p1 and p2 can be obtained by solving a cubic equation and gave unique closed-form expressions for them. The Miettinen-Nurminen confidence interval is returned with code `"mn"`.

The **Mee** interval (code `"mee"`), proposed by Mee (1984) and Farrington and Manning (1990), uses the same maximum likelihood estimators as Miettinen-Nurminen but with a different correction factor.

The **Brown-Li Jeffreys** interval (code `"blj"`) was proposed by Brown and Li (2005).

The **Hauck-Anderson** interval (code `"ha"`) was proposed by Hauck and Anderson (1986).

The **Haldane** interval (code `"hal"`) is described in Newcombe (1998), as is the **Jeffreys-Perks** interval (code `"jp"`).

Some methods can yield confidence limits outside the admissible range [-1, 1]. Such limits are reset so that they do not exceed [-1, 1].
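The second example on this page (9/10 versus 3/10) shows the effect: the continuity-corrected Wald upper limit is reported as 1.0000. The base R sketch below assumes that the reset is a plain clamp to [-1, 1] and reuses the hand-computed `waldcc` formula. It illustrates the idea and is not the package's implementation.

```r
# Raw continuity-corrected Wald limits for 9/10 vs 3/10, then clamped.
x1 <- 9; n1 <- 10; x2 <- 3; n2 <- 10
z  <- qnorm(0.975)                           # 95% two-sided quantile

p1  <- x1 / n1
p2  <- x2 / n2
est <- p1 - p2
se  <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
cc  <- (1 / n1 + 1 / n2) / 2

raw <- c(lwr.ci = est - (z * se + cc), upr.ci = est + (z * se + cc))

# Assumed reset rule: limits below -1 become -1, limits above 1 become 1.
clipped <- pmin(pmax(raw, -1), 1)

round(rbind(raw, clipped), 4)
```

The raw upper limit is about 1.04, which is not a possible value for a difference of proportions. After clamping it becomes 1, while the lower limit (about 0.1605) is unchanged.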

Which method to use is still the subject of lively discussion and has not yet been conclusively settled. See e.g. Fagerland et al. (2011).

The general consensus is that the most widely taught method, `method="wald"`, is inappropriate in many situations and should not be used. Recommendations seem to converge on the Miettinen-Nurminen based methods (`method="mn"`).

A matrix with 3 columns containing the estimate, the lower and the upper confidence limit.

Agresti, A., Caffo, B. (2000) Simple and effective confidence intervals for proportions and difference of proportions result from adding two successes and two failures. *The American Statistician*, 54(4), 280-288.

Beal, S. L. (1987) Asymptotic Confidence Intervals for the Difference Between Two Binomial Parameters for Use with Small Samples. *Biometrics*, 43, 941-950.

Brown, L., Li, X. (2005) Confidence intervals for two sample binomial distribution. *Journal of Statistical Planning and Inference*, 130(1), 359-375.

Hauck, W. W., Anderson, S. (1986) A comparison of large-sample confidence interval methods for the difference of two binomial probabilities. *The American Statistician*, 40(4), 318-322.

Farrington, C. P., Manning, G. (1990) Test Statistics and Sample Size Formulae for Comparative Binomial Trials with Null Hypothesis of Non-zero Risk Difference or Non-unity Relative Risk. *Statistics in Medicine*, 9, 1447-1454.

Mee, R. W. (1984) Confidence bounds for the difference between two probabilities. *Biometrics*, 40, 1175-1176.

Miettinen, O. S., Nurminen, M. (1985) Comparative analysis of two rates. *Statistics in Medicine*, 4, 213-226.

Newcombe, R. G. (1998) Interval Estimation for the Difference Between Independent Proportions: Comparison of Eleven Methods. *Statistics in Medicine*, 17, 873-890.

Fagerland, M. W., Lydersen, S., Laake, P. (2011) Recommended confidence intervals for two independent binomial proportions. *Statistical Methods in Medical Research*, 0(0), 1-31.

```
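library(DescTools)   # provides BinomDiffCI(), Format() and SetNames()

# Example 1: moderate sample sizes, 56/70 vs 48/80
x1 <- 56; n1 <- 70; x2 <- 48; n2 <- 80
xci <- BinomDiffCI(x1, n1, x2, n2,
                   method=c("wald", "waldcc", "ac", "score",
                            "scorecc", "mn", "mee", "blj", "ha"))
# drop the estimate column and show only the confidence limits
Format(xci[,-1], digits=4)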
#> lwr.ci upr.ci
#> wald 0.0575 0.3425
#> waldcc 0.0441 0.3559
#> ac 0.0525 0.3358
#> score 0.0524 0.3339
#> scorecc 0.0428 0.3422
#> mn 0.0528 0.3382
#> mee 0.0534 0.3377
#> blj 0.0540 0.3400
#> ha 0.0494 0.3506

# Example 2: small samples, 9/10 vs 3/10 (some limits are reset to 1)
x1 <- 9; n1 <- 10; x2 <- 3; n2 <- 10
yci <- BinomDiffCI(x1, n1, x2, n2,
                   method=c("wald", "waldcc", "ac", "score",
                            "scorecc", "mn", "mee", "blj", "ha"))
Format(yci[, -1], digits=4)
#> lwr.ci upr.ci
#> wald 0.2605 0.9395
#> waldcc 0.1605 1.0000
#> ac 0.1600 0.8400
#> score 0.1705 0.8090
#> scorecc 0.1013 0.8387
#> mn 0.1700 0.8406
#> mee 0.1821 0.8370
#> blj 0.1869 0.9040
#> ha 0.1922 1.0000

# Example 3: comparison with the results on page 9 of
# https://www.lexjansen.com/wuss/2016/127_Final_Paper_PDF.pdf
SetNames(round(
  BinomDiffCI(56, 70, 48, 80,
              method=c("wald", "waldcc", "hal",
                       "jp", "mee",
                       "mn", "score", "scorecc",
                       "ha", "ac", "blj"))[,-1], 4),
  rownames=c("1. Wald, no CC", "2. Wald, CC", "3. Haldane", "4. Jeffreys-Perks",
             "5. Mee", "6. Miettinen-Nurminen", "10. Score, no CC", "11. Score, CC",
             "12. Hauck-Andersen", "13. Agresti-Caffo", "16. Brown-Li"))
#> lwr.ci upr.ci
#> 1. Wald, no CC 0.0575 0.3425
#> 2. Wald, CC 0.0441 0.3559
#> 3. Haldane 0.0535 0.3351
#> 4. Jeffreys-Perks 0.0531 0.3355
#> 5. Mee 0.0534 0.3377
#> 6. Miettinen-Nurminen 0.0528 0.3382
#> 10. Score, no CC 0.0524 0.3339
#> 11. Score, CC 0.0428 0.3422
#> 12. Hauck-Andersen 0.0494 0.3506
#> 13. Agresti-Caffo 0.0525 0.3358
#> 16. Brown-Li 0.0540 0.3400
```